NFS v4 over higher latency network (Internet)?

Vita Member
edited August 2019 in Help

Hi,

I've been wondering if anyone can share experiences with the performance of NFS v4 over higher-latency networks (10-30 ms)?

I have a setup with local NFS servers (1-2 ms) that I have to migrate while keeping the service running. To do this, I would use a VMware L2 stretch to extend the local network, but that adds 10-30 ms of latency overhead.

Since NFS is chatty, every request would pay that additional latency.

In my experience this would not be good for NFS, but I'm wondering whether some tuning of the NFS server/client would still allow solid performance.
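For a rough sense of scale, a back-of-envelope sketch (the operation count and RTTs are illustrative, and it pessimistically assumes one round trip per NFS operation with no client-side caching):

    # Back-of-envelope: cost of N sequential NFS operations, assuming
    # (pessimistically) one network round trip per operation and no
    # client-side attribute caching. All numbers are illustrative.

    def total_seconds(ops: int, rtt_ms: float) -> float:
        return ops * rtt_ms / 1000.0

    for rtt in (1.5, 10.0, 30.0):   # local vs. stretched-L2 latency
        # e.g. 300 stat/open/read calls to render one web page
        print(f"RTT {rtt:5.1f} ms: 300 ops -> {total_seconds(300, rtt):.2f} s")

    # RTT   1.5 ms: 300 ops -> 0.45 s
    # RTT  10.0 ms: 300 ops -> 3.00 s
    # RTT  30.0 ms: 300 ops -> 9.00 s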

Specs:
Network: 1 Gbps.
Ping: 10-30 ms.
Peak network utilization on the NFS server: 100 Mbps.
Disks (R/W): 350/350 MB/s.
NFS mount fstab example:

storage1-fip.local:/home/content/    /home/content/     nfs     rw,vers=4.0,rsize=65536,wsize=65536,hard,sync,proto=tcp,timeo=600,noatime,nodiratime,retrans=2,sec=sys            0       0
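One consequence of the rsize above that is easy to estimate: with a single outstanding 64 KiB READ at a time, sequential read throughput is capped at roughly rsize/RTT. Real clients pipeline several RPCs and do readahead, so treat this sketch as a worst case:

    # Worst case for sequential reads: one 64 KiB READ RPC in flight
    # at a time, so every request waits out a full round trip.
    RSIZE = 65536                    # bytes per READ, from the mount options

    for rtt_ms in (1.5, 10.0, 30.0):
        mb_per_s = RSIZE / (rtt_ms / 1000.0) / 1e6
        print(f"RTT {rtt_ms:5.1f} ms -> ~{mb_per_s:5.1f} MB/s per stream")

    # RTT   1.5 ms -> ~ 43.7 MB/s per stream
    # RTT  10.0 ms -> ~  6.6 MB/s per stream
    # RTT  30.0 ms -> ~  2.2 MB/s per stream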

There is an IP that floats between the DRBD 1 and DRBD 2 nodes; it is used to mount whichever DRBD server is currently active.

   DC 1                                           DC 2
  +--------------------------------------+       +------------------------------------+
  |                                      |       |                                    |
  |    +---------------------------+     |       |   +---------------------------+    |
  |    |      NFS Floating IP      |     |       |   |      NFS Floating IP      |    |
  |    |                           |     |       |   |                           |    |
  |    +---------------------------+     |       |   +---------------------------+    |
  |    ||   DRBD 1   |            ||     |       |   ||            |   DRBD 2   ||    |
  |    ||            |            ||     |       |   ||            |            ||    |
  |    ||            |            ||     | DRBD sync ||            |            ||    |
  |    ||            |     +--------------------------------------------->      ||    |
  |    ||            |            ||     |           ||            |            ||    |
  |    ||            |            ||     |       |   ||            |            ||    |
  |    |---------------------------|     |       |   |---------------------------|    |
  |    +---------------------------+     |       |   +---------------------------+    |
  |                 ^                    |       |                 ^                  |
  |                 |                    |       |                 |                  |
  |                 |                    |       |                 |                  |
  |    +---------------------------+     |       |                 |                  |
  |    |                           |     |       |                 |                  |
  |    |        WEB CLUSTER        +-------------------------------+                  |
  |    |                           |     |       |                                    |
  |    +---------------------------+     |       |                                    |
  |                                      |       |                                    |
  +----------------^---------------------+       +------------------------------------+
                   |                                                 |
                   |                 L2 Stretch                      |
                   +-------------------------------------------------+

Additionally, I should mention that the NFS servers are used by webhosting and mail servers to read/write data.

If anyone has any experience with this, please let me know.

Thanks in advance!

Comments

  • hzr Member

    This sounds kind of bad, especially if you might be seeking through files a lot instead of downloading the whole thing (think webserver byte-range Partial Content requests).
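    For illustration, a sketch of that access pattern (the file path is hypothetical; over NFS each of these seek+read pairs typically becomes its own READ RPC and pays a full round trip):

        import os
        import time

        path = "/home/content/video.mp4"     # hypothetical file on the mount

        t0 = time.monotonic()
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            offsets = range(0, size, 10 * 1024 * 1024)  # jump every 10 MiB
            for offset in offsets:
                f.seek(offset)
                f.read(65536)   # each non-contiguous read ~ one READ RPC + RTT
        print(f"{len(offsets)} ranged reads took {time.monotonic() - t0:.2f} s")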

  • Vita said: I have a setup with local NFS servers (1-2 ms) that I have to migrate while keeping the service running.

    I'm not sure I understand what you mean here - is the key problem migrating a live service without any downtime?

    Is there an option to, for example, rsync first, take a small downtime to migrate the changed files, then start the service? This may help reduce the downtime window, especially if the set of files that actually changes is relatively small and can be rsync'd within a small window (even in parts/pieces).

  • Vita Member

    I will update the original message as well. The setup is actually two DRBD nodes. Both of them export an NFS filesystem over the network via a floating IP.

    The idea is to migrate the non-active DRBD node to the new location and use DRBD's own sync between the primary and secondary nodes. After the data has been synced, move the floating IP to point to the server in the new location. This way all data communication to the NFS server gains 10-30 ms of latency. Ideally no downtime should be induced at all. I wanted to avoid rsync because DRBD already has its own sync, and a later resynchronization may be triggered when the DRBD connection is set up again. (The diagram in the original post shows the resulting layout.)


    The web cluster will then pull/push data via the floating IP from the DRBD 2 node, which will become primary after the migration. But the DRBD sync will also have latency overhead between the now-separate DRBD 1 and DRBD 2 nodes in different DCs.
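    To put a rough number on that sync overhead: in DRBD's usual synchronous mode (protocol C), a write completes only after the remote peer acknowledges it. A back-of-envelope sketch with an illustrative local disk latency:

        # Synchronous replication (DRBD protocol C): each write costs
        # roughly local_disk_ms + inter-DC RTT, since the write is not
        # complete until the peer in the other DC acknowledges it.
        DISK_MS = 0.5                 # illustrative local write latency

        for rtt_ms in (1.5, 10.0, 30.0):
            per_write_ms = DISK_MS + rtt_ms
            print(f"RTT {rtt_ms:5.1f} ms -> {per_write_ms:5.1f} ms/write, "
                  f"~{1000.0 / per_write_ms:4.0f} sequential writes/s")

        # RTT   1.5 ms ->   2.0 ms/write, ~ 500 sequential writes/s
        # RTT  10.0 ms ->  10.5 ms/write, ~  95 sequential writes/s
        # RTT  30.0 ms ->  30.5 ms/write, ~  33 sequential writes/s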

  • drserver Member, Host Rep

    It will work like shit. Do a small test first before doing anything spectacular: just try an ls on a folder. On some occasions ls can take up to a second (at 50 ms between locations). Also, you will need to move this through some kind of VPN, so you will have encryption and decryption overhead as well.
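    That test is easy to script if you want numbers instead of a feel; a minimal sketch (point mountpoint at the real NFS mount):

        import os
        import time

        mountpoint = "/home/content"          # the NFS mount to test

        # Roughly what "ls -l" does: list the directory, then stat every
        # entry. A decent proxy for NFS metadata latency.
        t0 = time.monotonic()
        entries = os.listdir(mountpoint)
        for name in entries:
            os.stat(os.path.join(mountpoint, name))
        elapsed_ms = (time.monotonic() - t0) * 1000.0
        print(f"listed + stat'd {len(entries)} entries in {elapsed_ms:.0f} ms")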

  • FHR Member, Host Rep

    Forget it, it will work like crap. Same story with DRBD.

    Any kind of master/master synchronous replication (which you do need) works like shit over high latency connections.

    Best case scenario, you end up with downtime. Worst case scenario, data loss.

  • jlay Member

    NFS v4 is the stateful version, and that extra chatter is going to be rough over a high-latency network.
