What is the best technology choice for building a large-scale local storage solution?

I am working on a non-serious project that requires storing a large volume of video files. The estimated scale is between 500TB and 1PB, and I want to keep maintenance costs as low as possible. Therefore, a single-node solution is acceptable; a cluster solution is not necessary.

I have a few candidate solutions. Which one is better? Can you share your experience with me?

Solution 1: RAIDZ2/Z3 on TrueNAS, with SSDs as a metadata cache (rough sketch below).
Solution 2: MinIO on bare metal.
Solution 3: Ceph or similar.
Solution 4: Traditional software RAID with bcache.

I lack experience managing large local storage, so it is difficult for me to choose. I look forward to some valuable community input.
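
For reference, here is a minimal sketch of what Solution 1 could look like; the pool name, device paths, and vdev widths are placeholders, not a recommendation:

```
# Hypothetical layout: one 12-wide RAIDZ2 data vdev plus a mirrored
# "special" vdev on NVMe that holds the pool metadata.
zpool create tank \
  raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
         /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl \
  special mirror /dev/nvme0n1 /dev/nvme1n1

# Keep only metadata on the special vdev (0 is the default, shown for clarity).
zfs set special_small_blocks=0 tank

# Dataset tuned for large video files.
zfs create -o recordsize=1M -o compression=lz4 tank/videos
```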

Comments

  • emgh Member

    No RAID

    Nothing

    SFTP to some Hetzner dedis

    Cheapest :)

  • What is your budget? Would using a NAS like Synology be cheaper?

  • emgh Member

    @chihcherng said:
    Would using a NAS like Synology be cheaper?

    I guess that simply depends on the longevity.

  • @chihcherng said:
    What is your budget? Would using a NAS like Synology be cheaper?

    No. Synology's hardware is known for being expensive. I have a NAS at home based on Unraid with about 100TB of capacity. However, here I am talking about colocating with or renting from an IDC service provider; maintaining large local storage in a home environment is not cost-effective.

  • Zreind Member

    We don't know what the longevity of the videos should be, but I think renting storage dedis or using AWS Glacier Deep Archive (or an AWS-like S3 service) at about $1/TB, if you don't need to access it continuously

  • Zreind Member

    @Zreind said:
    We don't know what the longevity of the videos should be, but I think renting storage dedis or using AWS Glacier Deep Archive (or an AWS-like S3 service) at about $1/TB, if you don't need to access it continuously

    *will be cheaper

  • @Zreind said:
    We don't know what the longevity of the videos should be, but I think renting storage dedis or using AWS Glacier Deep Archive (or an AWS-like S3 service) at about $1/TB, if you don't need to access it continuously

    Yes, but for redundancy I would still need to build a ZFS array or something like MinIO erasure coding on top of it. That is quite a headache.
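
    To give a rough idea of what that would involve, here is a minimal single-node MinIO sketch (mount paths, drive count, and credentials are placeholders): MinIO applies erasure coding across the drives automatically, so no separate RAID layer is needed.

    ```
    # Hypothetical 12-drive MinIO deployment; at this drive count MinIO's
    # default erasure coding (EC:4) tolerates up to four failed drives.
    export MINIO_ROOT_USER=admin
    export MINIO_ROOT_PASSWORD=change-me
    minio server /mnt/disk{1...12}/minio --console-address ":9001"
    ```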

  • FatGrizzly Member, Host Rep

    you should ask @atmwebhost!

    He had ideas of setting up servers on the moon, and of cyber-attacking all datacentres with a high-spec machine and AI robots in a DC

  • default Veteran

    @FatGrizzly said:
    you should ask @atmwebhost!

    He had ideas of setting up servers on the moon, and of cyber-attacking all datacentres with a high-spec machine and AI robots in a DC

    Or storage in the city of Atlantis from @yoursunny

  • somik Member

    @danblaze A 3.7PB array managed by a single head server, all connected over a local network

  • @somik said:
    @danblaze A 3.7PB array managed by a single head server, all connected over a local network

    An ultimate solution. I have watched the video and concluded that I may not be able to achieve it >:)

  • somik Member

    @danblaze said:

    @somik said:
    @danblaze A 3.7PB array managed by a single head server, all connected over a local network

    An ultimate solution. I have watched the video and concluded that I may not be able to achieve it >:)

    Lolz, it was meant to give you some idea of what others are doing to set up custom, local-network-based NAS-type storage. Since you feel this will be too much (which I fully agree with), I think it is best not to keep this type of storage locally, but to engage a third party to store your data. AWS, Google Cloud, and Oracle Cloud all offer large-scale storage.

  • edited May 2023

    Nice video

  • eva2000 Veteran
    edited May 2023

    Also check out JuiceFS https://juicefs.com/docs/community/introduction/ with Cloudflare R2 storage. I did a write-up and benchmarks at https://github.com/centminmod/centminmod-juicefs which can give you an idea.

    JuiceFS implements an architecture that separates "data" and "metadata" storage. When using JuiceFS to store data, the data itself is persisted in object storage (e.g., Amazon S3, OpenStack Swift, Ceph, Azure Blob or MinIO), and the corresponding metadata can be persisted in various databases (Metadata Engines) such as Redis, Amazon MemoryDB, MariaDB, MySQL, TiKV, etcd, SQLite, KeyDB, PostgreSQL, BadgerDB, or FoundationDB.

    df -hT /home/juicefs_mount
    Filesystem        Type          Size  Used Avail Use% Mounted on
    JuiceFS:myjuicefs fuse.juicefs  1.0P  4.0K  1.0P   1% /home/juicefs_mount
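
    As a rough sketch of how such a mount is created (the bucket URL, keys, metadata URL, and cache settings here are placeholders, not the exact setup from the write-up):

    ```
    # Create the filesystem: metadata goes to a metadata engine (SQLite here),
    # file data goes to an S3-compatible bucket (Cloudflare R2 in this sketch).
    juicefs format \
      --storage s3 \
      --bucket https://myjuicefs.<accountid>.r2.cloudflarestorage.com \
      --access-key "$R2_ACCESS_KEY" \
      --secret-key "$R2_SECRET_KEY" \
      sqlite3://myjuicefs.db myjuicefs

    # Mount in the background with a local NVMe read cache (~100GB).
    juicefs mount -d \
      --cache-dir /home/juicefs_cache \
      --cache-size 102400 \
      sqlite3://myjuicefs.db /home/juicefs_mount
    ```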
    
  • @danblaze said:

    @chihcherng said:
    What is your budget? Would using a NAS like Synology be cheaper?

    No. Synology's hardware is known for being expensive. I have a NAS at home based on Unraid with about 100TB of capacity. However, here I am talking about colocating with or renting from an IDC service provider; maintaining large local storage in a home environment is not cost-effective.

    Really? QNAP is generally cheaper than Synology but far inferior. Making a hot-swap equivalent of a Synology for less is very difficult.

    Plus, you'd save the labour of rolling your own, which can save time and money.

  • eva2000 Veteran
    edited May 2023

    FYI, I set up JuiceFS with Cloudflare R2 S3 object storage on my other server, which has 2x 960GB NVMe in RAID 1.

    JuiceFS allows you to shard the R2 buckets for better performance, which seems to have helped a bit for big-file reads and for 1MB big-file writes at least :) PUT/GET object latencies are still relatively slow from my Dallas server due to the R2 locations available, but it is still adequate for my needs so far :D
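
    As a rough sketch, sharding is set at format time (the bucket pattern, keys, and metadata URL below are placeholders): with --shards, JuiceFS hashes each data block across the numbered buckets.

    ```
    # Hypothetical 10-way shard across buckets juicefs-shard-0 .. juicefs-shard-9.
    juicefs format \
      --storage s3 \
      --shards 10 \
      --bucket "https://juicefs-shard-%d.<accountid>.r2.cloudflarestorage.com" \
      --access-key "$R2_ACCESS_KEY" \
      --secret-key "$R2_SECRET_KEY" \
      sqlite3://myjuicefs.db myjuicefs
    ```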

    The tables below compare a 10x Cloudflare R2 sharded JuiceFS mount vs a 5x Cloudflare R2 sharded JuiceFS mount vs a 1x Cloudflare R2 JuiceFS mount (default). All R2 storage locations use the location hint North American East.

    For 1024MB big file size

    | ITEM | VALUE (10x R2 Sharded) | COST (10x R2 Sharded) | VALUE (5x R2 Sharded) | COST (5x R2 Sharded) | VALUE (1x R2 Default) | COST (1x R2 Default) |
    | --- | --- | --- | --- | --- | --- | --- |
    | Write big file | 906.04 MiB/s | 4.52 s/file | 960.47 MiB/s | 4.26 s/file | 1374.08 MiB/s | 2.98 s/file |
    | Read big file | 223.19 MiB/s | 18.35 s/file | 174.17 MiB/s | 23.52 s/file | 152.23 MiB/s | 26.91 s/file |
    | Write small file | 701.2 files/s | 5.70 ms/file | 777.4 files/s | 5.15 ms/file | 780.3 files/s | 5.13 ms/file |
    | Read small file | 6378.3 files/s | 0.63 ms/file | 7940.0 files/s | 0.50 ms/file | 8000.9 files/s | 0.50 ms/file |
    | Stat file | 21123.7 files/s | 0.19 ms/file | 29344.7 files/s | 0.14 ms/file | 27902.2 files/s | 0.14 ms/file |
    | FUSE operation | 71555 operations | 2.16 ms/op | 71597 operations | 2.67 ms/op | 71649 operations | 3.06 ms/op |
    | Update meta | 6271 operations | 9.01 ms/op | 6041 operations | 4.09 ms/op | 6057 operations | 2.50 ms/op |
    | Put object | 1152 operations | 403.23 ms/op | 1136 operations | 428.27 ms/op | 1106 operations | 547.32 ms/op |
    | Get object | 1034 operations | 278.61 ms/op | 1049 operations | 299.50 ms/op | 1030 operations | 301.80 ms/op |
    | Delete object | 316 operations | 124.32 ms/op | 60 operations | 120.73 ms/op | 29 operations | 234.02 ms/op |
    | Write into cache | 1424 operations | 24.92 ms/op | 1424 operations | 83.12 ms/op | 1424 operations | 12.91 ms/op |
    | Read from cache | 400 operations | 0.05 ms/op | 400 operations | 0.05 ms/op | 400 operations | 0.04 ms/op |

    For 1MB big file size

    | ITEM | VALUE (10x R2 Sharded) | COST (10x R2 Sharded) | VALUE (5x R2 Sharded) | COST (5x R2 Sharded) | VALUE (1x R2 Default) | COST (1x R2 Default) |
    | --- | --- | --- | --- | --- | --- | --- |
    | Write big file | 452.66 MiB/s | 0.01 s/file | 448.20 MiB/s | 0.01 s/file | 230.82 MiB/s | 0.02 s/file |
    | Read big file | 1545.95 MiB/s | 0.00 s/file | 1376.38 MiB/s | 0.00 s/file | 1276.38 MiB/s | 0.00 s/file |
    | Write small file | 682.8 files/s | 5.86 ms/file | 792.5 files/s | 5.05 ms/file | 675.7 files/s | 5.92 ms/file |
    | Read small file | 6299.4 files/s | 0.63 ms/file | 7827.1 files/s | 0.51 ms/file | 7833.1 files/s | 0.51 ms/file |
    | Stat file | 21365.2 files/s | 0.19 ms/file | 24308.1 files/s | 0.16 ms/file | 28226.1 files/s | 0.14 ms/file |
    | FUSE operation | 5757 operations | 0.42 ms/op | 5750 operations | 0.38 ms/op | 5756 operations | 0.41 ms/op |
    | Update meta | 5814 operations | 0.72 ms/op | 5740 operations | 0.74 ms/op | 5770 operations | 0.70 ms/op |
    | Put object | 107 operations | 282.68 ms/op | 94 operations | 286.35 ms/op | 118 operations | 242.35 ms/op |
    | Get object | 0 operations | 0.00 ms/op | 0 operations | 0.00 ms/op | 0 operations | 0.00 ms/op |
    | Delete object | 133 operations | 116.84 ms/op | 59 operations | 117.93 ms/op | 95 operations | 83.94 ms/op |
    | Write into cache | 404 operations | 0.12 ms/op | 404 operations | 0.12 ms/op | 404 operations | 0.14 ms/op |
    | Read from cache | 408 operations | 0.06 ms/op | 408 operations | 0.05 ms/op | 408 operations | 0.06 ms/op |
  • alt_ Member

    minio?

  • PureVoltage Member, Patron Provider

    Personally, multiple servers will keep the costs down, and accessing them from a few high-powered head servers is more ideal.
    Especially if you plan on scaling this larger over time.

    Scaling over multiple servers also lets you go with smaller drive options; while each server gives less disk space, it can reduce the costs in the long run.

  • Shazan Member, Host Rep

    GlusterFS?

  • @eva2000 said:
    FYI, I set up JuiceFS with Cloudflare R2 S3 object storage on my other server, which has 2x 960GB NVMe in RAID 1.

    JuiceFS allows you to shard the R2 buckets for better performance, which seems to have helped a bit for big-file reads and for 1MB big-file writes at least :) PUT/GET object latencies are still relatively slow from my Dallas server due to the R2 locations available, but it is still adequate for my needs so far :D

    Wow, a very useful test, thank you.

  • eva2000 Veteran
    edited May 2023

    @danblaze said: Wow, a very useful test, thank you.

    You're welcome. Been using JuiceFS with Cloudflare R2 for over 1 year now and loving it :)

    I also added JuiceFS benchmarks for a 10x R2 sharded mount + Redis metadata caching.

    Default 1MB big file size.

    | ITEM | VALUE (10x R2 Sharded + Redis) | COST (10x R2 Sharded + Redis) | VALUE (10x R2 Sharded) | COST (10x R2 Sharded) | VALUE (5x R2 Sharded) | COST (5x R2 Sharded) | VALUE (1x R2 Default) | COST (1x R2 Default) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Write big file | 530.10 MiB/s | 0.01 s/file | 452.66 MiB/s | 0.01 s/file | 448.20 MiB/s | 0.01 s/file | 230.82 MiB/s | 0.02 s/file |
    | Read big file | 1914.40 MiB/s | 0.00 s/file | 1545.95 MiB/s | 0.00 s/file | 1376.38 MiB/s | 0.00 s/file | 1276.38 MiB/s | 0.00 s/file |
    | Write small file | 2715.4 files/s | 1.47 ms/file | 682.8 files/s | 5.86 ms/file | 792.5 files/s | 5.05 ms/file | 675.7 files/s | 5.92 ms/file |
    | Read small file | 10069.0 files/s | 0.40 ms/file | 6299.4 files/s | 0.63 ms/file | 7827.1 files/s | 0.51 ms/file | 7833.1 files/s | 0.51 ms/file |
    | Stat file | 16545.3 files/s | 0.24 ms/file | 21365.2 files/s | 0.19 ms/file | 24308.1 files/s | 0.16 ms/file | 28226.1 files/s | 0.14 ms/file |
    | FUSE operation | 5767 operations | 0.09 ms/op | 5757 operations | 0.42 ms/op | 5750 operations | 0.38 ms/op | 5756 operations | 0.41 ms/op |
    | Update meta | 1617 operations | 0.19 ms/op | 5814 operations | 0.72 ms/op | 5740 operations | 0.74 ms/op | 5770 operations | 0.70 ms/op |
    | Put object | 37 operations | 290.94 ms/op | 107 operations | 282.68 ms/op | 94 operations | 286.35 ms/op | 118 operations | 242.35 ms/op |
    | Get object | 0 operations | 0.00 ms/op | 0 operations | 0.00 ms/op | 0 operations | 0.00 ms/op | 0 operations | 0.00 ms/op |
    | Delete object | 48 operations | 103.83 ms/op | 133 operations | 116.84 ms/op | 59 operations | 117.93 ms/op | 95 operations | 83.94 ms/op |
    | Write into cache | 404 operations | 0.11 ms/op | 404 operations | 0.12 ms/op | 404 operations | 0.12 ms/op | 404 operations | 0.14 ms/op |
    | Read from cache | 408 operations | 0.06 ms/op | 408 operations | 0.06 ms/op | 408 operations | 0.05 ms/op | 408 operations | 0.06 ms/op |
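
    For reference, numbers like these come from JuiceFS's built-in benchmark tool; a rough sketch of the two runs (mount path and thread count are placeholders):

    ```
    # 1024MB big-file run, 4 threads, against a hypothetical mount path
    juicefs bench -p 4 /home/juicefs_mount --big-file-size 1024

    # 1MB big-file run, 4 threads
    juicefs bench -p 4 /home/juicefs_mount --big-file-size 1
    ```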