50-100TB video storage cluster - Page 3

50-100TB video storage cluster


Comments

  • AXYZEAXYZE Member
    edited July 2022

    @Hxxx said:
    If you want HA you would be looking at other solutions or a custom one. For instance you could have some sort of load balancer in front and assuming that you have two identical setups. If one file failed to load from server A, retry with Server B.

    Load balancing is not a problem in this case; Cloudflare will take care of it and point to the correct server. I think I can mount such a 'backup' folder 24/7 on each server, so even if one server fails another one can read the same data, and if the file is not available (for example, if rsync didn't transfer it before the failure) it will tell the frontend that the video is "temporarily unavailable".

    The frontend is on a completely different server (Hetzner AX41); the ones I'm discussing here are only for storage, nothing more. :)

    @letlover said:
    If one or two disks fail in zfs, how do you recover the whole file system?

    I'll let you know when they fail hahahha

  • HxxxHxxx Member

    @AXYZE I get it, but my reply was more about the storage system you are working on. You can have an internal balancer or redirect system that tries to fetch, e.g., a video file; if it fails to get it from server A, it retries from server B. That way you are creating a form of HA for the stored files. Combined with RAID or ZFS on each server, that could be a good combo. It just might be expensive, in that:
    - You would have to develop that internal load balancer or customize an existing one.
    - You would have to invest in more hardware for capacity, since you are effectively mirroring the complete set of files between A and B.
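
The "try server A, then retry from server B" idea can be sketched roughly as below. This is only an illustration, not anyone's actual setup: the server names, `FAKE_STORAGE`, and `fake_fetch` are hypothetical stand-ins for whatever HTTP or filesystem access the real storage nodes would expose.

```python
from typing import Callable, Optional, Sequence


def fetch_with_failover(
    path: str,
    servers: Sequence[str],
    fetch: Callable[[str, str], bytes],
) -> Optional[bytes]:
    """Try each storage server in order and return the first successful read.

    Returns None when no server has the file, so the caller can report the
    video as temporarily unavailable.
    """
    for server in servers:
        try:
            return fetch(server, path)
        except OSError:
            # Server down or file missing there: fall through to the next one.
            continue
    return None


# Hypothetical in-memory stand-in for two mirrored storage servers.
# "new.webm" has not been synced to server-a yet.
FAKE_STORAGE = {
    "server-a": {"old.webm": b"old"},
    "server-b": {"old.webm": b"old", "new.webm": b"new"},
}


def fake_fetch(server: str, path: str) -> bytes:
    """Pretend to read a file from a server; raise like a real read would."""
    try:
        return FAKE_STORAGE[server][path]
    except KeyError:
        raise FileNotFoundError(f"{path} not found on {server}")


if __name__ == "__main__":
    servers = ["server-a", "server-b"]
    print(fetch_with_failover("new.webm", servers, fake_fetch))
    print(fetch_with_failover("missing.webm", servers, fake_fetch))
```

A real version would also need timeouts and some way to skip a server that is known to be down, which is where most of the development cost mentioned above would go.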

    Thanked by 1AXYZE
  • cobrahcobrah Member
    Thanked by 1AXYZE
  • @cobrah said: What about the storj?

    https://www.storj.io/solutions/video-streaming

    Sounds like another company that might decide to deadpool and give you a month's notice.
    Aren't you guys fed up with all those cheap storage startups?

    Thanked by 1AXYZE
  • dfroedfroe Member, Host Rep

    @AXYZE said:
    RAID-Z2 still gives me problem when HDD fails

    With RAIDz2 each single block (default up to 128KB) may be unreadable on any two disks and your data is still safe.
    If one disk completely fails, you can still have block 1 unreadable on disks 1 and 3 and block 2 failing on disks 2 and 4 during rebuild (for example).

    Especially with larger disks this is a huge advantage compared to RAID.
    Furthermore regular scrubs will detect defective sectors.

    This all makes a RAIDz2 pool very resilient and it can survive more failure scenarios compared to RAID-6 (or RAID-10 which equals a striped mirror in ZFS).
    RAID-10 (or striped mirror) will have better write performance (which won't matter in your case) but slightly less resilience.

    Finally you need a backup anyways on some completely different system (in case your datacenter burns down). RAID(z) is only to keep service and uptime high during particular hardware failures and increase performance under certain conditions. And to achieve this a RAIDz2 offers great value.

    Thanked by 2RapToN AXYZE
  • AXYZEAXYZE Member
    edited July 2022

    $4/1TB for storage and $7/1TB for bandwidth
    On top of that, rate limits and unpredictable performance.

    If I wanted to store it in the cloud I would choose the Backblaze + Cloudflare combo; it's more cost-efficient and reliable.

    But here I'm trying to do it for less than $2/mo storage + almost 1PB already included in the price (currently 3x 1Gbps "unlimited" servers on Hetzner).
    I need to rethink a couple of things though: 3Gbps will not be enough, so maybe I'll cancel these servers and order smaller ones. 10Gbps unmetered will cost like $1000 alone without calculating traffic; I can easily get more than ten 1Gbps servers for that price.
    I need to recalculate and benchmark how well Cloudflare Load Balancer scales with many servers.

    Does anyone have experience with distributed file systems like MinIO or SeaweedFS? MinIO looks like the ideal choice to me, but I'm not sure how it scales with such big files; SeaweedFS is not recommended for big files, from what I've heard.

    Edit: Or maybe stay with these big servers but add "cloud instances" to the mix to cache the most recent / most viewed files. It's just 2.99 euro for 10Gbps and 20TB that way, and it can easily offload a lot when many people watch the same video at the same time.
    When a video's popularity drops (after 8 hours, for example) the instance would be deleted (so I would pay just 4 cents to serve a popular video at up to 10Gbps for 8 hours!!).
    Instances would be created and scaled dynamically when needed, so overall it would cost, let's say, 50 cents per day. 15 euro per month and the bandwidth problem is solved?
    Seems awesome to me, but obviously I'm missing something; it can't be that cheap to solve the problem xD

    Now that I think about it: if I had just one cloud instance and deleted it every day, that would be 20TB of bandwidth daily, so 600TB monthly... for 2.99 euro? wtf?
    There needs to be some gotcha.
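
A quick way to sanity-check the numbers in this post is to compare the 20TB allowance with what a 10Gbps link can actually move. The sketch below only does the line-rate arithmetic, assuming decimal units (1TB = 10^12 bytes) and a link that is saturated the whole time; it says nothing about how any provider bills or resets traffic, which is exactly the open question here.

```python
def hours_to_transfer(data_tb: float, link_gbps: float) -> float:
    """Hours needed to move data_tb terabytes over a fully used link."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


def tb_transferred(link_gbps: float, hours: float) -> float:
    """Terabytes moved by a fully used link in the given number of hours."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second * hours * 3600 / 1e12


def monthly_tb(daily_tb: float, days: int = 30) -> float:
    """Total terabytes if the same daily amount is used every day."""
    return daily_tb * days


if __name__ == "__main__":
    print(f"20TB at 10Gbps takes {hours_to_transfer(20, 10):.2f} hours")
    print(f"10Gbps for 8 hours moves {tb_transferred(10, 8):.1f} TB")
    print(f"20TB per day for 30 days is {monthly_tb(20):.0f} TB")
```

Under those assumptions a saturated 10Gbps link would use up 20TB in well under 8 hours, so the "up to 10Gbps for 8 hours" case only fits inside the allowance when the average rate is lower than the peak.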

  • AXYZEAXYZE Member
    edited July 2022

    @dfroe said:

    @AXYZE said:
    RAID-Z2 still gives me problem when HDD fails

    With RAIDz2 each single block (default up to 128KB) may be unreadable on any two disks and your data is still safe.
    If one disk completely fails, you can still have block 1 unreadable on disks 1 and 3 and block 2 failing on disks 2 and 4 during rebuild (for example).

    Especially with larger disks this is a huge advantage compared to RAID.
    Furthermore regular scrubs will detect defective sectors.

    This all makes a RAIDz2 pool very resilient and it can survive more failure scenarios compared to RAID-6 (or RAID-10 which equals a striped mirror in ZFS).
    RAID-10 (or striped mirror) will have better write performance (which won't matter in your case) but slightly less resilience.

    Finally you need a backup anyways on some completely different system (in case your datacenter burns down). RAID(z) is only to keep service and uptime high during particular hardware failures and increase performance under certain conditions. And to achieve this a RAIDz2 offers great value.

    Thanks for info!

    Finally you need a backup anyways on some completely different system (in case your datacenter burns down).

    That's why I'm thinking about skipping RAID and just cloning data from one server to another. If there's a sudden power surge, a cabling problem or whatever, and a whole server fails, there's a second server with the same data. Even in case of fire it MAY help, because the servers can be in different buildings (and I specifically bought servers that are all in different buildings); even in the extreme OVH case the whole datacenter location didn't burn, only SBG2.
    Of course there can be a major incident, a big nuclear explosion or whatever, but I think distributing data across different servers will be way better than RAID. Happy to hear your suggestions!

    Let's say I have 4 nodes (each 10x10TB).
    Cloning without RAID between two servers gives me 50% usable space; one server of each pair can fail, so overall 1-2 servers can fail out of 4.
    MinIO with erasure coding gives me 75% usable space; 1 server can fail, and 2 out of 8 drives can fail (stripe size 8, parity 2).
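
The two usable-space figures above follow from simple ratios. A minimal sketch of that arithmetic, using the 4 nodes of 10x10TB from this post; the function names are made up for illustration and this does not model how MinIO actually lays out stripes across nodes:

```python
def raw_capacity_tb(nodes: int, drives_per_node: int, drive_tb: float) -> float:
    """Total raw capacity across all nodes."""
    return nodes * drives_per_node * drive_tb


def mirrored_usable_tb(raw_tb: float) -> float:
    """Every file is stored on two servers, so half the raw space is usable."""
    return raw_tb / 2


def erasure_coded_usable_tb(raw_tb: float, stripe_size: int, parity: int) -> float:
    """Each stripe of stripe_size shards holds (stripe_size - parity) data shards."""
    return raw_tb * (stripe_size - parity) / stripe_size


if __name__ == "__main__":
    raw = raw_capacity_tb(nodes=4, drives_per_node=10, drive_tb=10)
    print(f"raw: {raw:.0f} TB")
    print(f"cloned pairs: {mirrored_usable_tb(raw):.0f} TB usable")
    print(f"erasure coding 8/2: {erasure_coded_usable_tb(raw, 8, 2):.0f} TB usable")
```

With these inputs the gap between the two approaches is 100TB of usable space out of 400TB raw, which is the trade being weighed against the different failure tolerance.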

  • 0xbkt0xbkt Member

    How about Storage Share + Cloud instances? NX31 gets you 1 TB at $2.55.

    Thanked by 1AXYZE
  • eva2000eva2000 Veteran
    edited July 2022

    @AXYZE said: I need to rethink a couple of things though: 3Gbps will not be enough, so maybe I'll cancel these servers and order smaller ones. 10Gbps unmetered will cost like $1000 alone without calculating traffic; I can easily get more than ten 1Gbps servers for that price.

    Once you start scaling in terms of bandwidth capacity + accounting for redundancy + high availability, it's probably cheaper to use something like Cloudflare Stream as bandwidth is free and you're only paying for storage and minutes viewed.

    Thanked by 1AXYZE
  • AXYZEAXYZE Member
    edited July 2022

    @eva2000 said:

    @AXYZE said: I need to rethink a couple of things though: 3Gbps will not be enough, so maybe I'll cancel these servers and order smaller ones. 10Gbps unmetered will cost like $1000 alone without calculating traffic; I can easily get more than ten 1Gbps servers for that price.

    Once you start scaling in terms of bandwidth capacity + accounting for redundancy + high availability, it's probably cheaper to use something like Cloudflare Stream as bandwidth is free and you're only paying for storage and minutes viewed.

    Sadly, it's the most expensive option of them all.
    It was my first guess at the start.

    $1 per 1,000 minutes delivered, no matter the quality.

    We will need 2Mbps per stream on average, based on our early calculations and testing (VP9 does wonders at low bitrates).
    1Gbps is enough for up to 500 streams at once.

    500 streams on CF Stream will cost $0.50 per minute, so $30 per hour. Even if it's just an hour daily, the whole month is $900. That's for a 1Gbps equivalent, and I haven't even counted storage yet... and remember, it's only one hour per day.
    If we went there fully it would easily reach something like $5k-$10k.

    This is a good solution for internal things in corporations, where simplicity is important and wasted time means big money wasted. Here we are building an app for fun; it is at least 10x more cost-effective to self-host, and if everything goes fine we will have our own European architecture + very low operational costs.
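
The estimate in this post can be reproduced step by step. A small sketch using only the figures quoted here ($1 per 1,000 minutes delivered, 2Mbps per stream, one busy hour per day); the 30-day month and the function names are assumptions for illustration:

```python
def concurrent_streams(link_mbps: int, stream_mbps: int) -> int:
    """How many streams of stream_mbps fit into a link of link_mbps."""
    return link_mbps // stream_mbps


def delivery_cost_usd(
    streams: int, minutes: float, usd_per_1000_minutes: float = 1.0
) -> float:
    """Cost of delivering `minutes` of video to each of `streams` viewers."""
    return streams * minutes * usd_per_1000_minutes / 1000


if __name__ == "__main__":
    streams = concurrent_streams(link_mbps=1000, stream_mbps=2)
    per_minute = delivery_cost_usd(streams, minutes=1)
    per_hour = delivery_cost_usd(streams, minutes=60)
    per_month = delivery_cost_usd(streams, minutes=60 * 30)  # one hour per day
    print(f"{streams} streams per 1Gbps")
    print(f"${per_minute:.2f} per minute, ${per_hour:.2f} per hour")
    print(f"${per_month:.2f} per 30-day month at one hour per day")
```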

  • eva2000eva2000 Veteran
    edited July 2022

    @AXYZE said: 500 streams on CF Stream will cost $0.50 per minute, so $30 per hour. Even if it's just an hour daily, the whole month is $900. That's for a 1Gbps equivalent, and I haven't even counted storage yet... and remember, it's only one hour per day.
    If we went there fully it would easily reach something like $5k-$10k.

    I see how that would add up. What about Bunny.net's video streaming?

    Thanked by 1AXYZE
  • AXYZEAXYZE Member

    @eva2000 said:

    @AXYZE said: 500 streams on CF Stream will cost $0.50 per minute, so $30 per hour. Even if it's just an hour daily, the whole month is $900. That's for a 1Gbps equivalent, and I haven't even counted storage yet... and remember, it's only one hour per day.
    If we went there fully it would easily reach something like $5k-$10k.

    I see how that would add up. What about Bunny.net's video streaming?

    Wow, now that's a very good price!
    Thanks very much for the link, I will test it out!

    I wouldn't go with Bunny alone because of the storage price ($20/TB), but I already have two ideas for combining it with Hetzner servers to get the best of both worlds (Bunny has free encoding!). :)

    By the way, thanks for Centmin Mod; it was the first tool I used after I decided to ditch panels like cPanel. Tweaking the nginx config, Cloudflare zlib etc. taught me a lot about how to get the most out of cheap VPSes and how it all works :)

  • AXYZEAXYZE Member

    @0xbkt said:
    How about Storage Share + Cloud instances? NX31 gets you 1 TB at $2.55.

    Looks neat, but I'm worried about performance; I've heard bad things about it. I will test it out :)
