Comments
Zstandard strikes a great balance between compression ratio and speed, but if you are only talking about speed, then LZ4 blows everything else away.
Just use tar without gzip or any other compression. It basically just reads the files and concatenates them, and you're done.
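With GNU tar that is just (the directory name is only an example):

  tar -cf bundle.tar myproject/

No compression flag at all, so the result is essentially the file contents plus tar headers.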
Poll updated/fixed.
zstd hands down - benchmarks https://blog.centminmod.com/2021/01/30/2214/fast-tar-and-rsync-transfer-speed-for-linux-backups-using-zstd-compression/
2nd place = pigz
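In practice that looks something like this (paths and the compression level are just examples, and -I needs a reasonably recent GNU tar):

  tar -I 'zstd -3 -T0' -cf backup.tar.zst /home/data
  tar -I zstd -xf backup.tar.zst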
I only use gzip when creating files. Most archive programs on other systems can cope with a .tgz file, but with anything else it ends up being pot luck whether you can use the file elsewhere.
Brotli.
For compression/time ratio, zstd.
But most of the time I use zip and tar+gzip.
It does come down to the requirements. @Kousaka is right. If all you are doing is bundling files together into a single archive, the fastest solution will be tar without compression.
If you want to reduce storage needs, then other considerations come into play. Compression performance (both speed and resulting file size), cross-platform compatibility, the type of data you are compressing, lossy vs. lossless, etc.
Reminder:
Compress first, then encrypt. :-)
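For example, something along these lines (directory name and zstd level are just examples; gpg will prompt for a passphrase):

  tar -cf - mydata/ | zstd -T0 | gpg --symmetric --output backup.tar.zst.gpg

Doing it the other way around barely shrinks anything, because encrypted output looks like random data to the compressor.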
tar with pigz compression: it zooms, and it's available on most systems!
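Something like this (archive and directory names are placeholders; -p sets the thread count):

  tar -cf - /var/www | pigz -p 8 > www-backup.tar.gz

or with GNU tar's -I option:

  tar -I pigz -cf www-backup.tar.gz /var/www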
I just use gzip with the default settings
tar and 7z
This question is so vague that it cannot be answered properly.
You need to ask things like: what is being compressed? Is it on the fly or for archival? Are you doing it at the filesystem level or the block level? Etc.
But if I were to answer, I would say zstd is slowly becoming my favorite for my ZFS arrays and the like.
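On OpenZFS 2.0+ that is just a dataset property (the pool/dataset name is an example; zstd-N and zstd-fast levels are also available):

  zfs set compression=zstd tank/data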
I'd use lzop for fast and poor compression. If needing absolutely the fastest, maybe even lzop -1. Never heard of lz4 before though, should check that out to see how it compares.

edit: Yeah, at least from compressing a large mixed content file (bootable OS image) lz4 is both faster (by 25%) and compresses better (by 5%) than lzop. Nice!
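A rough way to run that kind of comparison yourself (the image filename is just an example):

  time lzop -1 -o image.iso.lzo image.iso
  time lz4 -1 image.iso image.iso.lz4
  ls -l image.iso.lzo image.iso.lz4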
Tar without gzip is good enough if you only want to merge all sorts of files. I don't think the compression ratio varies much between algorithms when dealing with video or pictures, since that data is already compressed. So I guess tar with default gzip is fair enough.
By the way, I'm using it in a daily cron job for a big database backup (70 GB+): mysqldump pipes into a compression utility.
Compression time is my first priority and compression size is my second.
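As a sketch, that kind of cron job ends up looking something like this (the database name, zstd level and output path are assumptions, not the actual setup):

  mysqldump --single-transaction mydb | zstd --fast=3 -T0 > /backups/mydb-$(date +%F).sql.zst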
For zstd, from their own site:
http://facebook.github.io/zstd/
Seems LZ4 still wins both for compression and decompression speed.
Might depend on the system, i.e. memory and disk speed, but for compression (not decompression) tar + zstd is still the fastest for me; see the tar compression benchmarks at https://blog.centminmod.com/2021/01/30/2214/fast-tar-and-rsync-transfer-speed-for-linux-backups-using-zstd-compression/. I only tested zstd down to negative compression level -10, favoring speed over compression ratio/size. There are more negative levels you could go down to in order to speed up compression times further.
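For example, something like (the path is an example; needs a GNU tar whose -I accepts arguments):

  tar -I 'zstd --fast=10 -T0' -cf backup.tar.zst /home/backup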
FYI, the web site at https://facebook.github.io/zstd/ shows results for the outdated zstd 1.4.5, while 1.5.2 is the latest and has improved compression/decompression speeds over earlier versions. Also, the web site only tested zstd 1.4.5 down to negative compression level -5 (--fast=5), while there are up to 131072 negative compression levels IIRC (a quick way to try the --fast levels on your own data is shown after the table below). See the change log and benchmarks listed at https://github.com/facebook/zstd/releases
zstd 1.5.1 had a nice bump in performance in my experience and benchmarks. From the GitHub release change log for 1.5.1:
Compiler   Scenario                               v1.5.0 Speed   v1.5.1 Speed   Delta
gcc-11     Literal compression - 128KB block      748 MB/s       927 MB/s       +23.9%
clang-13   Literal compression - 128KB block      810 MB/s       927 MB/s       +14.4%
gcc-11     Literal compression - 4KB block        223 MB/s       321 MB/s       +44.0%
clang-13   Literal compression - 4KB block        224 MB/s       310 MB/s       +38.2%
gcc-11     Literal decompression - 128KB block    1164 MB/s      1500 MB/s      +28.8%
clang-13   Literal decompression - 128KB block    1006 MB/s      1504 MB/s      +49.5%
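As mentioned above, a quick way to feel out the negative levels on your own data (file name and levels are just examples):

  time zstd --fast=5 -T0 -o out.fast5.zst big.img
  time zstd -3 -T0 -o out.l3.zst big.img
  ls -l out.fast5.zst out.l3.zst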
zip -0
Fast and universal.
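For the record, that's just (names are examples):

  zip -0 -r bundle.zip mydir/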
As far as I know it is not standardized which encoding is used inside a .zip for filenames. So in a non-Latin script country it is common to get a file sent from some stupid Windows system and who-knows-what archiver, which uncompresses into gibberish filenames on GNU/Linux. For that reason I'd suggest to prefer the modern and also universal enough .7z in all cases, including this no-compression one.
@rm_ said:
As far as I know it is not standardized which encoding is used inside a .zip for filenames.
Encoding, filenames, file case and whatnot is another (huge) issue: archivers, filesystems, and combinations thereof...
"Universal" meant "supported out of the box (or with negligible effort) even on grandpa's computer". (7z is gaining some traction indeed, but I remember wasting too much time trying to have some poor guy uncompress 7z email attachments on Mac...)
I never did track down the root problem, but I remember about a decade ago finding that 7-Zip would often create zip files that some (but not all) other compliant zip readers would fail to parse. There wouldn't have been anything special about the filenames, so it was doing something funky with the encoding itself. My guess is that it was exceeding the 16KB sliding window when compressing, but whatever the cause, every couple of days it'd spit out a file that'd be rejected by the recipient as corrupt, even though it'd open fine in 7-Zip itself and Windows Explorer. I've never used 7-Zip since.