Distributed file systems

I am planning to create a RAID-like redundant distributed fs to connect a few storage servers to one single "virtual" fs. I know there are a few alternatives to create such a fs, e.g. glusterfs, xtreemfs, ceph, Tahoe-LAFS etc., while all of them have their own feature sets.
I would like to hear from people here who are already using a distributed fs:

  • Which software are you using?
  • In which mode is it operating? Mirroring / striping / distributing?
  • Are the storage nodes in one location (geographically) or scattered to different locations?
  • How is it performing? Latencies, transfer rates etc.
  • Do you have a favorite feature that the alternatives are missing?

This could help to get an overview of the possible use cases.
Thanks!

Comments

  • My suggestion is that you do your own testing.

    In all cases you should run them with mirroring. It is VERY idiotic to run them with striping alone. Even GlusterFS says you should never, never, never use striping except in a few very specific cases.

    Have I highlighted enough not to use striping? ..... NOPE, DON'T USE STRIPING!

    You can never expect 100% uptime from any node, so you should keep a few replicas, period. Actually, the more the better.

  • Certainly I will do my own tests. I am not really looking for hints for my particular use case; I wanted to hear what experiences people have had in different situations.
    It would also be interesting to hear how resource needs on the server / node side scale in different cases.

  • I like GlusterFS a lot since it's master-free, and the management tools and performance are not too bad. It's definitely sensitive to latency, so keep the nodes close together, unless you're doing a distributed setup (which works a bit differently than the other modes, with librsync under the hood, IIRC). I have a simple 2-node cluster (mirror mode) on AWS with about a dozen clients attached. It works pretty well so long as you don't bombard it too hard.

  • GlusterFS is OK but loves to fall over under load and whenever there are latency issues. You must also make sure not to expose GlusterFS on your public interface; otherwise there are plenty of simple DoS attacks that can shut the daemons down.

    XtreemFS is also OK; I have a test bed here at the moment running some loads.

    Ultimately I think Ceph is probably the only high-performance/capacity production-ready one you've mentioned.

  • I use XtreemFS with striping and replication. Each file is replicated and striped, so it is basically a distributed RAID 10 (or is that a RAID 01?). It performs well and is currently bottlenecked by my Tinc VPN (which maxes out at 1.6 MB/s for some reason). I'm in the process of switching it over to SSL encryption for a WAN deployment.

    I like XtreemFS for the automatic failover and operations for WAN deployments - great for LowEndBoxes. XtreemFS also has a working Windows port - automatic 10 points for me. :P

    As I remember, Tahoe-LAFS is a totally different beast. With Tahoe-LAFS you have to sacrifice speed and performance for data security and safety. With the other systems you mentioned there is a basic assumption that you can trust each host. With Tahoe-LAFS you don't need that assumption.

    Mun said: Have I highlighted enough not to use striping? ..... NOPE, DON'T USE STRIPING!

    Why? I would say that RAID 10 is preferred over RAID 1 for speed. Of course, I might be missing something.

  • I think he meant straight striping, i.e. RAID 0.
    There's nothing wrong, and everything right, with a mirror+striping setup (RAID 10).

  • tehdartherer (Member), edited September 2014

    The XtreemFS Windows client seems to have corruption issues with files >4 GB. Nonetheless, having clients for multiple platforms is really nice.
    GlusterFS 3.5 introduced at-rest encryption. Unfortunately it hits performance quite hard: I saw some dd numbers drop from ~50 MB/s to ~4 MB/s.
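
The striping versus mirroring argument in this thread can be put into rough numbers. The following is only a minimal Python sketch: it assumes every node is up independently with the same probability `p` (a simplification, since real outages are often correlated), and the function names are made up for this example. They are not part of GlusterFS, XtreemFS, or Ceph.

```python
def stripe_availability(p: float, nodes: int) -> float:
    """Pure striping (RAID 0 style): data is readable only if ALL nodes are up."""
    return p ** nodes


def mirror_availability(p: float, replicas: int) -> float:
    """Pure mirroring (RAID 1 style): data is readable if AT LEAST ONE replica is up."""
    return 1.0 - (1.0 - p) ** replicas


def striped_mirror_availability(p: float, stripes: int, replicas: int) -> float:
    """Striping across mirrored groups (RAID 10 style):
    every stripe group must have at least one live replica."""
    return mirror_availability(p, replicas) ** stripes


if __name__ == "__main__":
    p = 0.99  # assumed per-node uptime, chosen only for illustration
    print("4-node stripe:        ", stripe_availability(p, 4))
    print("2-way mirror:         ", mirror_availability(p, 2))
    print("2 stripes x 2 mirrors:", striped_mirror_availability(p, 2, 2))
```

Under these assumptions, a pure stripe is less available than any single node and gets worse with every node added, while a mirror, or a stripe over mirrored groups, is more available than a single node. That matches the advice above: avoid straight striping, but mirror+striping is fine.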

