Speeding up transfer to a storage server
I have two servers, one in London and the other storage server in L.A.
Both servers are from the same popular host advertised regularly on here.
I am transferring 95GB of data from London to L.A. using rsync, and in 24 hours only 41GB has transferred.
The transfer runs over sshfs between the servers, i.e. a mounted remote filesystem.
London
Processor : Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
CPU cores : 1 @ 2999.998 MHz
AES-NI : ✔ Enabled
VM-x/AMD-V : ❌ Disabled
RAM : 481.4 MiB
Swap : 1.5 GiB
Disk : 245.7 GiB
L.A. storage server
Processor : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
CPU cores : 1 @ 2699.998 MHz
AES-NI : ✔ Enabled
VM-x/AMD-V : ❌ Disabled
RAM : 1.9 GiB
Swap : 512.0 MiB
Disk : 984.3 GiB
iperf3 London > L.A.
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 102 MBytes 85.2 Mbits/sec 130 sender
[ 4] 0.00-10.00 sec 99.2 MBytes 83.2 Mbits/sec receiver
iperf3 L.A > London
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 99.2 MBytes 83.2 Mbits/sec 7 sender
[ 4] 0.00-10.00 sec 96.9 MBytes 81.3 Mbits/sec receiver
CPU usage and steal on both servers are very low.
Nothing major is running on either server other than rsync and ssh.
Does anyone have suggestions as to anything I am doing wrong, or is 41GB transferred in 24 hours reasonable?
A ticket has also been opened with the host asking for suggestions, but so far no response.
Have not (as far as I am aware) been rate limited.
Help appreciated.
Comments
My experience with sshfs is that the latency for confirming writes is poor over that distance. Did you try it without sshfs?
So do you mean something like VPN (zerotier) and NFS?
Something like that. Either NFS over wireguard/zerotier, or just put the ssh command in rsync like
/usr/bin/rsync -v --archive --compress --sparse --rsh="ssh -p${PORT}" /src user@host:/dest
Will give it a go just using rsync over ssh and get back to you, thanks. No point adding a VPN.
So you think the problem is sshfs?
If you have small files and a lot of storage, why not create one big tar file and then transfer it?
rsync only updates / sends files that have changed, so once all the data is transferred, subsequent runs use much less bandwidth than re-sending big tar files.
Edit
@chocolateshirt Sorry, probably misunderstood your comment. You mean tar, send, and rsync later?
Fair point, but the London server is quite short on space, so tar would really be pushing for disk space.
It mightn't be possible to give more than an educated guess, but that's the first thing I would suspect. Anyway, it is a simple experiment to prove/disprove it by just changing the rsync command, so it is a worthwhile exercise.
You could tar to stdout and write it to a file on the remote server over ssh, like
tar -cf - /src | ssh ... "cat > archive.tar"
Fair point that this could be good for the initial transfer if you have zillions of small files.
Great minds think alike - just doing it now - certainly going faster.
So the plan will likely be (once the tar is received and unpacked in L.A.) to use lsync from London to L.A., as the number of files changing will be minimal. Not too worried about possible sshfs delays once the bulk of the data is on the storage server.
Thanks guys I think I have a solution.
Try enabling BBR; depending on your OS this is a super simple process. Doing so on my servers increased network speeds when using rsync.
Good tip - updated both servers, but can't test as I am still sending the 95GB to L.A.
Will feedback.
Also, besides BBR, increasing your buffers should help over such a high-latency link:
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
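If it helps, one way to apply the tuning lines above is to save them as a sysctl config file and reload; a sketch, assuming root and a kernel with the tcp_bbr module available (the file name is made up):

```shell
# Sketch: assumes root and a kernel that ships the tcp_bbr module.
# Save the settings above as /etc/sysctl.d/90-wan-tuning.conf, then:
modprobe tcp_bbr                          # load BBR if it is not built in
sysctl --system                           # re-read all sysctl config files
sysctl net.ipv4.tcp_congestion_control    # verify: should report "bbr"
```

As noted further down the thread, this only affects new TCP connections, so an already-running rsync keeps its old congestion control.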
Look into using an instant deploy, hourly billed service to do server hopping LA -> NY -> UK
That'll take longer...
Edit: maybe not, but if the file needs to finish before sending out, it would take longer.
Provider had a look at systems and advised
I totally appreciate the provider taking the time to try and diagnose.
But I still had the problem.
Spent a whole day trying to transfer files between different servers etc. but still everything was so slow.
Decided to go back to basics and create some space and do a tar.gz (as originally suggested in this thread) for the folders that I want to rsync.
Running tar I can see (in top) an I/O wait of around 50%.
This results in writes to disk of around 550MB per MINUTE!
So now I know why rsync over sshfs was so slow.
Once the tar completes I will scp it over and expand it; after that, rsync should only be doing small nightly updates, which should not cause an issue.
Hope this thread helps others.
Mike
It is unclear why you use sshfs, when rsync can connect via ssh by itself and will work MUCH more efficiently if there are a lot of small files or a very deep directory tree:
rsync /local/source/ user@remotehost:/remote/destination/
Also, as mentioned before, BBR is a must. After being selected it only applies to each new TCP connection, so as you noted it won't affect the already-running rsync. Moreover, if you want to stick with sshfs, you'd have to unmount and remount it for BBR to take effect there.
Thanks @rm_
sshfs does make things easier for admin, i.e. seeing files etc. from a central point. But your point is well noted and I will switch to just using rsync over ssh.
Make sure you're also using newer tar and rsync versions that support zstd compression; it can make a huge difference: https://blog.centminmod.com/2021/01/30/2214/fast-tar-and-rsync-transfer-speed-for-linux-backups-using-zstd-compression/
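For anyone wanting to try this, a minimal local sketch of zstd-compressed tar (assumes GNU tar 1.31+ and the zstd binary are installed; all paths here are throwaway examples):

```shell
# Sketch: build the same archive with gzip and zstd and compare sizes.
# Assumes GNU tar 1.31+ with built-in --zstd support and zstd installed.
set -e
mkdir -p /tmp/zstd_demo/src
seq 1 50000 > /tmp/zstd_demo/src/data.txt          # throwaway sample data
tar --zstd -cf /tmp/zstd_demo/archive.tar.zst -C /tmp/zstd_demo/src .
tar --gzip -cf /tmp/zstd_demo/archive.tar.gz  -C /tmp/zstd_demo/src .
ls -l /tmp/zstd_demo/archive.tar.zst /tmp/zstd_demo/archive.tar.gz
# With rsync 3.2+ on both ends, zstd can also be used on the wire:
#   rsync -a --compress --compress-choice=zstd /src/ user@host:/dest/
```

zstd typically compresses faster than gzip at comparable or better ratios, which matters when the CPU (a single core here) is part of the bottleneck.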
There are special tools for bulk internet transfers:
* https://github.com/facebook/wdt (Need to compile C++ code)
* https://www.slac.stanford.edu/~abh/bbcp/ (Binaries available)
* http://hpnssh.org/ (Custom SSH implementation for high speed long distance transfers)
For some context on tweaking Linux for internet transfers, check out https://fasterdata.es.net/assets/fasterdata/JT-201010.pdf
BBR and CUBIC are good defaults for congestion control, depending on packet loss and window sizes. Prefer zstd compression over gzip.
Going from one continent to another reduces data transfer speed a lot.
Using rsync and copying lots of small files will take even longer.
Sometimes I have backed up data from the London, UK server using Acronis Backup Cloud, which is very fast as it is block-level. Once the data is backed up to Acronis Backup Cloud, it can be restored to the L.A., USA server.
That's a good workaround I found some years ago anyway.
I have now got my backup system in place and working.
Yesterday I created the London tar archive and used scp to copy it to L.A., where it was untarred.
I updated my rsync scripts (as suggested) to use ssh, NOT rsync over sshfs, and a check with no file updates took just 12 seconds.
Last night the cron ran and with London updates, took just 4 minutes to complete.
So to summarise for anyone in similar position
Enable BBR in Kernel.
Don't try to do the initial rsync over long distances; instead tar, scp, untar, then rsync over ssh.
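The workflow above can be sketched end-to-end; this runs locally, simulating the scp hop with cp so it is self-contained (all paths and the user@la-host name are made-up examples):

```shell
# Sketch of the tar -> transfer -> untar -> rsync workflow, run locally.
set -e
SRC=/tmp/xfer_demo/src
DST=/tmp/xfer_demo/dst
mkdir -p "$SRC" "$DST"
echo "payload" > "$SRC/file.txt"                   # throwaway sample file
# 1. Pack the tree into one archive (a single sequential read/write,
#    avoiding per-file round trips over the WAN).
tar -cf /tmp/xfer_demo/bulk.tar -C "$SRC" .
# 2. In real use: scp /tmp/xfer_demo/bulk.tar user@la-host:/storage/
#    Simulated here with a local copy.
cp /tmp/xfer_demo/bulk.tar "$DST/bulk.tar"
# 3. Unpack on the destination.
tar -xf "$DST/bulk.tar" -C "$DST"
# 4. Thereafter, nightly incrementals over plain ssh (not sshfs), e.g.:
#    rsync -a --compress -e ssh "$SRC/" user@la-host:/storage/
diff "$SRC/file.txt" "$DST/file.txt"
```

The key point is that per-file latency is paid once during the cheap local tar/untar steps, not once per file across the Atlantic.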
My key problems were
1. Very high disk waits (reaching 85% during tar) at both London and L.A. - both sites with the same provider.
2. DDOS protection creating issues during iperf3 and likely during file transfer.
3. Distance / latency between sites.
HTH and thank you for all suggestions.
Mike
Personally I would reverse that, rsync over long distances should be fine, just not over sshfs.
So rsync first, and proceed to convoluted solutions only if that appears to be unacceptably slow.
Fully agreed. As others said, have rsync use zstd or, depending on the data, maybe even turn off compression in ssh completely to trade some bandwidth for less CPU usage (depending on the overall size of the files and available resources, of course).