A stream of thoughts on compression
It's a stream about compression. Get it? Compression stream? Ha ha.
ZIP would be better if compression were cross-file
Throughout my experiments, ZIP has generally done worse than other methods, mainly because each file gets its own compression stream. That means file names are stored in plain text, and data shared across files is never reused: every file is compressed as if the others didn't exist. The archive simply ends up larger, as demonstrated in the FSR breakdown.
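To make the per-file versus shared-stream difference concrete, here's a minimal sketch using only Python's standard library. The sample data is made up for illustration (50 files that each carry the same 4 KiB blob plus a small unique header); it isn't taken from NEU or FSR, and real archives will behave less dramatically.

```python
import io
import os
import tarfile
import zipfile

# Hypothetical sample data: 50 files, each a unique 64-byte header followed
# by the same 4 KiB of random (incompressible) bytes.
shared = os.urandom(4096)
files = {f"assets/file_{i:02d}.bin": os.urandom(64) + shared for i in range(50)}

def zip_size(members):
    """Size of a ZIP where every member is deflated in its own stream."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return len(buf.getvalue())

def tar_xz_size(members):
    """Size of a .tar.xz where all members share one compressed stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tf:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())

raw = sum(len(data) for data in files.values())
print(f"raw: {raw} B, zip: {zip_size(files)} B, tar.xz: {tar_xz_size(files)} B")
```

The ZIP can't see that the blob repeats, so it stores it 50 times. The xz stream sees the whole tarball at once and only pays for the blob once.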
NEU breakdown
NEU is a large Java Archive, so I've been trying to figure out effective ways to reduce its size for a while now. I recently tried making tarballs organized by file type and compressing each one with xz. Here's what that yielded.
Type | .tar | .tar.xz |
Archives | 12.380 MB | 10.302 MB |
Images | 13.486 MB | 12.476 MB |
Class files | 12.667 MB | 2.135 MB |
Other files | 2.999 MB | 0.346 MB |
Sum | 41.5 MB | 25.3 MB |
For context: the original zip was 29.1 MB, and the original sum of file contents was 37.8 MB.
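Roughly, the by-type split can be sketched like this in Python. The extension-to-group mapping and the names `GROUPS`, `group_of` and `repack_by_type` are my own placeholders for illustration; the exact rules used for the numbers above may have differed.

```python
import io
import tarfile
import zipfile
from pathlib import PurePosixPath

# Assumed mapping from file extension to group (hypothetical, for illustration).
GROUPS = {
    ".jar": "archives", ".zip": "archives",
    ".png": "images", ".jpg": "images", ".gif": "images",
    ".class": "classes",
}

def group_of(name):
    """Pick a group from the file extension; anything unknown is 'other'."""
    return GROUPS.get(PurePosixPath(name).suffix.lower(), "other")

def repack_by_type(jar_file):
    """Read a JAR/ZIP and return {group: bytes of a .tar.xz for that group}."""
    result = {}
    with zipfile.ZipFile(jar_file) as zf:
        # First pass: sort the entries into groups.
        grouped = {}
        for entry in zf.infolist():
            if not entry.is_dir():
                grouped.setdefault(group_of(entry.filename), []).append(entry)
        # Second pass: one xz-compressed tarball per group.
        for group, entries in grouped.items():
            buf = io.BytesIO()
            with tarfile.open(fileobj=buf, mode="w:xz") as tf:
                for entry in entries:
                    data = zf.read(entry)
                    info = tarfile.TarInfo(entry.filename)
                    info.size = len(data)
                    tf.addfile(info, io.BytesIO(data))
            result[group] = buf.getvalue()
    return result

# Tiny in-memory stand-in for a real JAR.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("a/Main.class", b"\xca\xfe\xba\xbe" * 100)
    zf.writestr("assets/icon.png", b"\x89PNG" + bytes(200))
    zf.writestr("README.txt", b"hello")

for group, blob in repack_by_type(demo).items():
    print(group, len(blob), "bytes")
```

Grouping by type puts similar data next to each other in the stream, which is the whole point: class files share a lot of structure with other class files, and very little with PNGs.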
After optimizing images (with zopflipng)
Type | .tar | .tar.xz |
Images | 11.940 MB | 11.187 MB |
Total | 40.0 MB | 24.0 MB |
Yes, this means that zopflipng plus xz alone could shave about 5 MB off the original 29.1 MB zip. That doesn't mean it's practical, but it's interesting to look at.
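For anyone who wants to check the arithmetic, here's a quick sketch that recomputes the totals from the table rows. The numbers are copied from the tables above; the variable names are just mine.

```python
# .tar.xz sizes in MB, copied from the first NEU table.
tar_xz = {"archives": 10.302, "images": 12.476, "classes": 2.135, "other": 0.346}
optimized_images = 11.187  # .tar.xz size of the images after zopflipng
original_zip = 29.1

baseline = sum(tar_xz.values())
optimized = baseline - tar_xz["images"] + optimized_images

print(f"tar.xz by type:        {baseline:.1f} MB")
print(f"with optimized images: {optimized:.1f} MB")
print(f"saved vs original zip: {original_zip - optimized:.1f} MB")
```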
FSR breakdown
FSR is a large texture pack using PNGs and ZIP, and my endeavors have shown that the archive format can make as much of a difference as the file format.
"Compressed": zopflipng (with --lossy_transparent -m) and rewritten JSON
Format | Base size | Compressed size |
Folder | 4.4 MB | 2.1 MB |
.zip | 4.5 MB - 4.9 MB | 3.3 MB |
.tar.gz | 1.9 - 2.0 MB | 885 - 957 kB |
.tar.xz | 1.7 MB | 704 kB |
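Here's a rough Python sketch of what that "Compressed" step could look like. Two assumptions, stated loudly: I'm treating "rewritten JSON" as "re-serialized without whitespace", which may not be everything the rewrite did, and the helper names `minify_json` and `optimize_png` plus the sample JSON are hypothetical. The zopflipng flags are the ones listed above.

```python
import json
import shutil
import subprocess

def minify_json(text):
    """Re-serialize JSON without insignificant whitespace (assumed rewrite)."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)

def optimize_png(src, dst):
    """Run zopflipng with the flags from this post, if it is installed.

    Returns False when zopflipng is not on PATH. dst should not exist yet,
    otherwise zopflipng will ask before overwriting it.
    """
    exe = shutil.which("zopflipng")
    if exe is None:
        return False
    subprocess.run([exe, "--lossy_transparent", "-m", src, dst], check=True)
    return True

# Hypothetical sample file, not taken from FSR.
pretty = """{
    "parent": "item/generated",
    "textures": {
        "layer0": "items/example"
    }
}"""
small = minify_json(pretty)
print(len(pretty), "->", len(small), "bytes")
```

Minifying matters more in the Folder and .zip rows than in the tarball rows, since a shared stream already squeezes repeated indentation down to almost nothing.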