Speeding up transfer to a storage server

msatt Member

I have two servers: one in London and a storage server in L.A.
Both servers are from the same popular host advertised regularly on here.
I am transferring 95GB of data from London to L.A. with rsync, and in 24 hours only 41GB has transferred.
The transfer runs over sshfs between the servers, i.e. the remote filesystem is mounted locally.

London
Processor : Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
CPU cores : 1 @ 2999.998 MHz
AES-NI : ✔ Enabled
VM-x/AMD-V : ❌ Disabled
RAM : 481.4 MiB
Swap : 1.5 GiB
Disk : 245.7 GiB

L.A. storage server
Processor : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
CPU cores : 1 @ 2699.998 MHz
AES-NI : ✔ Enabled
VM-x/AMD-V : ❌ Disabled
RAM : 1.9 GiB
Swap : 512.0 MiB
Disk : 984.3 GiB

iperf3 London > L.A.
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 102 MBytes 85.2 Mbits/sec 130 sender
[ 4] 0.00-10.00 sec 99.2 MBytes 83.2 Mbits/sec receiver

iperf3 L.A. > London
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 99.2 MBytes 83.2 Mbits/sec 7 sender
[ 4] 0.00-10.00 sec 96.9 MBytes 81.3 Mbits/sec receiver

CPU usage and steal on both servers are very low.
Nothing major is running on either server other than rsync and ssh.

Does anyone have suggestions as to anything I am doing wrong, or is 41GB transferred in 24 hours reasonable?

A ticket has also been opened with the host asking for suggestions, but so far there has been no response.

I have not (as far as I am aware) been rate limited.

Help appreciated.

Comments

  • My experience with sshfs is that the latency for confirming writes is poor over that distance. Did you try it without sshfs?

    Thanked by msatt, cybertech
  • @tetech said:
    My experience with sshfs is that the latency for confirming writes is poor over that distance. Did you try it without sshfs?

    So do you mean something like VPN (zerotier) and NFS?

  • @msatt said:

    @tetech said:
    My experience with sshfs is that the latency for confirming writes is poor over that distance. Did you try it without sshfs?

    So do you mean something like VPN (zerotier) and NFS?

    Something like that. Either NFS over wireguard/zerotier, or just put the ssh command in rsync, like:

    /usr/bin/rsync -v --archive --compress --sparse --rsh="ssh -p${PORT}" /src user@host:/dest

    Thanked by msatt
  • Will give it a go just using rsync over ssh and get back to you, thanks. No point adding a VPN.
    So you think the problem is sshfs?

  • If you have small files and a lot of storage, why not create one big tar file and then transfer it?

  • msatt Member
    edited August 2021

    rsync only updates/sends files that have changed, so once all the data is transferred, subsequent bandwidth use is much lower than with big tar files.

    Edit
    @chocolateshirt Sorry, I probably misunderstood your comment. You mean tar, send, and rsync later?
    Fair point, but the London server is quite short on space, so a tar would really be pushing the disk.

  • @msatt said:
    Will give it a go just using rsync over ssh and get back to you, thanks. No point adding a VPN.
    So you think the problem is sshfs?

    It mightn't be possible to give more than an educated guess, but that's the first thing I would suspect. Anyway, it is a simple experiment to prove/disprove it by just changing the rsync command, so it is a worthwhile exercise.

  • @msatt said:
    Edit
    @chocolateshirt Sorry, I probably misunderstood your comment. You mean tar, send, and rsync later?
    Fair point, but the London server is quite short on space, so a tar would really be pushing the disk.

    You could tar to stdout and pipe it over ssh to a file on the remote server, e.g. tar -cf - /src | ssh ... 'cat > archive.tar'. Fair point that this could be good for the initial transfer if you have zillions of small files.
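
    For reference, a rough sketch of that kind of pipe (the user, host and paths here are placeholders, not from this thread); the first form stores an archive on the remote side, the second unpacks on the fly so no space is needed for the archive itself:

    # pack locally, stream over ssh, store the archive on the remote server
    tar -cf - /src | ssh user@remotehost 'cat > /dest/archive.tar'

    # or unpack directly on the remote side, with no intermediate archive file
    tar -cf - /src | ssh user@remotehost 'tar -xf - -C /dest'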

    Thanked by chocolateshirt
  • @tetech said:
    You could tar to stdout and pipe it over ssh to a file on the remote server, e.g. tar -cf - /src | ssh ... 'cat > archive.tar'. Fair point that this could be good for the initial transfer if you have zillions of small files.

    Great minds think alike - just doing it now - certainly going faster...
    So the plan will likely be (once the tar is received and unpacked in L.A.) to use lsync from London to L.A., as the number of files changing will be minimal. Not too worried about possible sshfs delays once the bulk of the data is on the storage server.

    Thanks guys I think I have a solution.

    Thanked by chocolateshirt
  • Try enabling BBR, depending on your OS this is a super simple process. Doing so on my servers increased the network speeds using rsync.

    Thanked by msatt, 1gservers, rm_
  • @Just295 said:
    Try enabling BBR, depending on your OS this is a super simple process. Doing so on my servers increased the network speeds using rsync.

    Good tip - updated both servers, but can't test yet as I'm still sending the 95GB to L.A.
    Will report back.

  • Also, besides BBR, increasing your buffers should help over such a high-latency link.

    net.core.rmem_default = 16777216
    net.core.wmem_default = 16777216
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 87380 16777216
    net.core.default_qdisc = fq
    net.ipv4.tcp_congestion_control = bbr
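
    As a minimal sketch of applying those settings (assuming a Linux kernel with BBR available, 4.9 or newer; the file name below is just an example):

    # save the settings above to e.g. /etc/sysctl.d/90-tcp-tuning.conf, then reload:
    sysctl --system

    # verify that BBR and fq are now the defaults (affects new TCP connections only):
    sysctl net.ipv4.tcp_congestion_control net.core.default_qdisc
    cat /proc/sys/net/ipv4/tcp_available_congestion_control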

    Thanked by msatt, vimalware
  • Look into using an instant deploy, hourly billed service to do server hopping LA -> NY -> UK

  • TimboJones Member
    edited August 2021

    @corbpie said:
    Look into using an instant deploy, hourly billed service to do server hopping LA -> NY -> UK

    That'll take longer...

    Edit: maybe not, but if the file needs to finish before sending out, it would take longer.

  • The provider had a look at the systems and advised:

    There was DDoS protection enabled on the subnet, and iperf was hitting some of the thresholds

    I totally appreciate the provider taking the time to try and diagnose.
    But I still had the problem.
    I spent a whole day trying to transfer files between different servers etc., but everything was still so slow.
    I decided to go back to basics, create some space, and make a tar.gz (as originally suggested in this thread) of the folders that I want to rsync.
    Running tar I can see (in top) an I/O wait of around 50%.
    This results in writes to disk of around 550MB per MINUTE!!!

    So now I know why rsync over sshfs was so slow.
    Once the tar completes I will scp it over and expand it; rsync should then only be doing small nightly updates, which should not cause an issue.

    Hope this thread helps others.
    Mike

    Thanked by sebkehl
  • rm_ IPv6 Advocate, Veteran
    edited August 2021

    It is unclear why you use sshfs, when rsync can connect via ssh by itself and will work MUCH more efficiently if there are a lot of small files or a very deep directory tree:

    rsync /local/source/ user@remotehost:/remote/destination/
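
    For a one-way bulk copy, a fuller invocation might look like the sketch below (the flags are only suggestions, and user/host/paths are placeholders):

    # -a recurses and preserves permissions/times, -z compresses in transit,
    # --partial --progress allow resuming an interrupted run and show status
    rsync -az --partial --progress -e "ssh -p 22" /local/source/ user@remotehost:/remote/destination/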

    Also as mentioned before, BBR is a must. After being selected it only applies to each new TCP connection, so as you noted it won't affect the already running rsync. Moreover, if you want to stick with sshfs, you'd have to unmount and remount it for BBR to take effect on that.

  • Thanks @rm_
    sshfs does make admin easier, i.e. seeing files etc. from a central point. But your point is well noted and I will switch to just using rsync over ssh.

  • eva2000 Veteran
    edited August 2021

    Make sure you're also using newer tar and rsync versions which support zstd compression - it can make a huge difference: https://blog.centminmod.com/2021/01/30/2214/fast-tar-and-rsync-transfer-speed-for-linux-backups-using-zstd-compression/ :)

    The tested network transfer speed between the servers, from the US East Coast to the mid-US, was ~40-50MB/s over a 1Gbps network connection due to network and geographical distance.

    Over SSH-encrypted and zstd-compressed netcat connections, I managed to transfer:

    144GB of file data (uncompressed size) in ~21.8 minutes with Tar + zstd and
    65GB of MariaDB 10 MySQL data (uncompressed size) in ~8.8 minutes with MariaBackup!
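
    As a rough sketch of the kind of commands involved (assuming GNU tar 1.31+ and rsync 3.2+ on both ends; hosts and paths are placeholders, not taken from the blog post):

    # stream a zstd-compressed tar over ssh and unpack it remotely
    tar -I 'zstd -T0' -cf - /src | ssh user@remotehost 'tar -I zstd -xf - -C /dest'

    # rsync using zstd for in-transit compression (requires rsync >= 3.2.0)
    rsync -a --compress --compress-choice=zstd /src/ user@remotehost:/dest/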

    Thanked by sebkehl, vimalware
  • There are special tools for bulk internet transfers:
    * https://github.com/facebook/wdt (Need to compile C++ code)
    * https://www.slac.stanford.edu/~abh/bbcp/ (Binaries available)
    * http://hpnssh.org/ (Custom SSH implementation for high speed long distance transfers)

    For some context on tweaking Linux for internet transfers, check out https://fasterdata.es.net/assets/fasterdata/JT-201010.pdf

    BBR and CUBIC are good defaults for congestion control, depending on packet loss and window sizes. Prefer zstd compression over gzip.

    Thanked by msatt
  • Going from one continent to another reduces data transfer speed a lot.

    Using rsync and copying lots of small files will take even longer.

    Sometimes I have backed up the data from LON, UK using Acronis Backup Cloud, which is very fast as it is block-level. Once the data is backed up to Acronis Backup Cloud, you can restore it to the LA, USA server.

    That's a good workaround I found some years ago anyway.

    Thanked by dev_vps
  • I have now got my backup system in place and working.
    Yesterday I created the London tar and used scp to copy it to L.A., where it was untarred.
    I updated my rsync scripts (as suggested) to use ssh rather than rsync over sshfs, and a check with no file updates took just 12 seconds.
    Last night the cron job ran and, with the London updates, took just 4 minutes to complete.

    So to summarise for anyone in a similar position:
    Enable BBR in the kernel.
    Don't try to do the initial rsync over long distances; instead tar, scp, untar, then rsync over ssh (see the sketch below).
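
    A rough sketch of that sequence (hosts and paths below are placeholders; exact flags will vary):

    # 1. pack the data in London (tar.gz here; zstd is an option on newer tar)
    tar -czf /tmp/backup.tar.gz /data

    # 2. copy the archive to the L.A. storage server
    scp /tmp/backup.tar.gz user@la-server:/storage/

    # 3. unpack it on the L.A. side
    ssh user@la-server 'tar -xzf /storage/backup.tar.gz -C /storage/'

    # 4. nightly incremental updates over plain ssh (no sshfs mount)
    rsync -az -e ssh /data/ user@la-server:/storage/data/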

    My key problems were:
    1. Very high disk I/O waits (reaching 85% during tar) at both London and L.A. - both sites with the same provider.
    2. DDoS protection creating issues during iperf3 and likely during file transfer.
    3. Distance/latency between the sites.

    HTH and thank you for all suggestions.
    Mike

  • rm_ IPv6 Advocate, Veteran

    @msatt said: Don't try to initially rsync over long distances, instead tar, scp, untar, then rsync over ssh

    Personally I would reverse that: rsync over long distances should be fine, just not over sshfs.

    So rsync first, and proceed to convoluted solutions only if that appears to be unacceptably slow.

    Thanked by Falzo, vimalware
  • @rm_ said:

    @msatt said: Don't try to initially rsync over long distances, instead tar, scp, untar, then rsync over ssh

    Personally I would reverse that: rsync over long distances should be fine, just not over sshfs.

    So rsync first, and proceed to convoluted solutions only if that appears to be unacceptably slow.

    Fully agreed. As others said, have rsync use zstd, or depending on the data maybe even turn off compression in ssh completely to trade some bandwidth for less CPU usage (depending on the overall size of the files and the available resources, of course).
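
    Two hedged examples of what that could look like (host and paths are placeholders; --compress-choice needs rsync 3.2+ on both ends):

    # let rsync compress with zstd and keep ssh's own compression off
    rsync -a --compress --compress-choice=zstd -e "ssh -o Compression=no" /src/ user@remotehost:/dest/

    # or skip compression entirely for data that is already compressed (media, archives)
    rsync -a -e "ssh -o Compression=no" /src/ user@remotehost:/dest/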
