Wanted - Anywhere - 1vcpu - 1G ram - 50-60TB storage - Page 2


Comments

  • AXYZE Member

    @emgh said:

    @AXYZE said:
    @emgh So Backblaze splits chunks of data across servers. The data itself isn't replicated; lost chunks can be rebuilt like in RAID-5.

    Although this architecture can be durable, it doesn't fix the issues that replication solves: disasters like fire, fiber cuts, etc.

    So it is just redundant, not replicated. :)
    Replicated at this price would be absolutely amazing.

    1. Replication doesn’t mean geographically split though, does it? I agree it’s not replicated, I was wrong, but isn’t the correct definition a file split between two systems? For example, if AWS uses the exact same logic but across different availability zones, is that really replicated?
    2. In your personal opinion, even considering it’s technically not replicated, is AWS to be considered replicated if I’m right that it’s split between availability zones?
    3. Do you think these ”tomes” or whatever they were called are integrated in the sense that the whole thing can be corrupted, like RAID?

    I guess even S3 with its ”replication across zones” isn’t really replication, but I can’t really explain why not.

    Still, combining any of the above with a production environment running on a whole other continent, with backups in place, equals replication, I guess.

    Replication = data is being cloned 1:1 to another place.
    In the case of Backblaze there is no cloning; data is just divided across servers with extra bits for repairing (like RAID-5).
    Cloning data across AZs or different regions is just an added benefit.
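
To make the difference concrete, here is a minimal sketch of the two approaches. It assumes a single XOR parity shard in the RAID-5 style mentioned above, which is a simplification and not Backblaze's actual erasure coding; the function names `replicate`, `split_with_parity` and `rebuild` are made up for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def replicate(data: bytes, copies: int = 2) -> List[bytes]:
    """Replication: every server holds a full 1:1 clone of the data."""
    return [data for _ in range(copies)]


def split_with_parity(data: bytes, n_data: int) -> List[bytes]:
    """Parity: split data into n_data chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\0")  # zero-pad the last chunk
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    return chunks + [reduce(xor_bytes, chunks)]


def rebuild(shards: List[Optional[bytes]]) -> List[bytes]:
    """Recompute at most one lost shard (None) by XOR-ing all survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired
```

With replication, losing one server costs nothing but a redirect, because the other copy is already complete. With parity, the lost shard has to be recomputed from every surviving shard before the object is whole again, but the extra storage is one shard instead of a full second copy.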

    If you have replicated data and encounter problems on one server, then 100% of your data is available on another server immediately.
    That is not the case with erasure coding / Backblaze: the data needs to be rebuilt. By using so many servers for one object they eliminated the worst problems of this method. If one server corrupts all of its data, that affects just 5% of an object (1 of 20 servers is 5%), so rebuild times for affected files won't be that crazy. The majority of chunks are still available on other servers, so it is possible that the only thing people will notice is a slowdown during download and that's it. Also, they likely have thousands of servers, so such a failure will affect less than 1% of files.
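
The arithmetic behind those two percentages can be sketched as follows. This is a back-of-the-envelope estimate only: it assumes each object's shards sit on distinct servers picked uniformly from the fleet, and the fleet sizes in the usage note are illustrative guesses, not Backblaze figures. Both function names are hypothetical.

```python
def object_share_on_one_server(shards_per_object: int) -> float:
    """Fraction of a single object that lives on any one of its servers."""
    return 1 / shards_per_object


def files_touched_by_one_failure(shards_per_object: int, total_servers: int) -> float:
    """Estimated fraction of all files with a shard on one failed server.

    Assumes shards are spread uniformly over the fleet (an assumption,
    not something stated in the thread).
    """
    return min(1.0, shards_per_object / total_servers)
```

For example, `object_share_on_one_server(20)` gives 0.05, the 5% quoted above. Under the uniform-placement assumption, `files_touched_by_one_failure(20, 5000)` gives 0.004, so the "less than 1% of files" estimate holds once the fleet is larger than about 2000 servers.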

    It's not like they haven't had failed disks; they just have enough scale that it isn't noticeable when disks do fail... but replication of files would make their service even better (better speeds, better durability). Also, possibly the greatest reason why they haven't lost files yet is that they can predict when disks will fail and put spare disks into the system - they have years of experience.

    So yeah, although Backblaze is durable, it's nowhere near replication. It is durable because they greatly reduced the cons of traditional parity.

  • I use both AWS S3 Intelligent Tiering and Wasabi. AWS S3 Intelligent Tiering is great for its reliability and flexibility: it charges you a little more for the specific files/objects you access more often and automatically moves the less frequently accessed files to archive.

    Wasabi is great if you are constantly transferring large files back and forth because it has no egress fees.

  • tjn Member

    Just thought I'd add a little FYI - B2 does have a replication feature.
    Obviously you pay for the replicated space used.

    https://help.backblaze.com/hc/en-us/articles/5206152893467-Creating-a-Cloud-Replication-Rule
