A stream of thoughts on compression

It's a stream about compression. Get it? Compression stream? Ha ha.

ZIP would be better if compression were cross-file

Throughout my experiments, I've found that ZIP generally does worse than other methods, mainly because each file gets its own compression stream. This means that file names are stored in plain text, and data shared across files is compressed again for every file instead of once. That makes the archive larger, as demonstrated in the FSR breakdown.
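To make the per-file stream effect concrete, here is a minimal sketch using Python's standard zipfile and tarfile modules. The data is synthetic and entirely hypothetical (the file names, counts, and sizes are made up for illustration): every file shares one large random block, which a single file cannot compress on its own but which a cross-file stream only needs to store once.

```python
import io
import os
import tarfile
import zipfile


def make_files(count=50, shared_size=20_000, unique_size=200):
    """Build synthetic files (hypothetical data) that share most of their bytes."""
    shared = os.urandom(shared_size)  # random, so incompressible within one file
    return {
        f"file_{i:03}.bin": shared + os.urandom(unique_size)
        for i in range(count)
    }


def zip_size(files):
    """Size of a ZIP archive: each member is deflated independently."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


def tar_xz_size(files):
    """Size of a .tar.xz: members are concatenated, then compressed as one stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


if __name__ == "__main__":
    files = make_files()
    print("zip:   ", zip_size(files), "bytes")
    print("tar.xz:", tar_xz_size(files), "bytes")
```

On data like this the ZIP should come out many times larger, because the shared block is stored once per member. Real archives share far less than this contrived case, so the gap there will be smaller.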

NEU breakdown

NEU is a large Java archive (JAR), so I've been trying to figure out effective ways to reduce its size for a while now. I recently tried making tarballs organized by file type and compressing them with xz. Here's what that yielded.

Type          .tar         .tar.xz
Archives      12.380 MB    10.302 MB
Images        13.486 MB    12.476 MB
Class files   12.667 MB    2.135 MB
Other files   2.999 MB     0.346 MB
Sum           41.5 MB      25.3 MB

For context: the original zip was 29.1 MB, and the sum of the original file contents was 37.8 MB.
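The per-type tarball experiment could be reproduced with something like the sketch below. It is not the exact procedure used for the numbers above: the suffix-to-group mapping in GROUPS and the function names are assumptions made for illustration, and it expects the archive to be unpacked into a directory first.

```python
import tarfile
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file suffix to group name; anything else is "other".
GROUPS = {
    ".jar": "archives",
    ".zip": "archives",
    ".png": "images",
    ".class": "classes",
}


def group_by_type(root):
    """Map group name -> sorted list of files found under root."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[GROUPS.get(path.suffix.lower(), "other")].append(path)
    return groups


def write_tarballs(root, out_dir):
    """Write one <group>.tar.xz per group; return {group: size in bytes}.

    out_dir should live outside root so the output is not archived too.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    sizes = {}
    for group, paths in group_by_type(root).items():
        target = out_dir / f"{group}.tar.xz"
        with tarfile.open(target, "w:xz") as tf:
            for path in paths:
                tf.add(path, arcname=path.relative_to(root).as_posix())
        sizes[group] = target.stat().st_size
    return sizes
```

Grouping similar files puts similar data close together in the stream, which is where xz can find cross-file matches.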

After optimizing images (with zopflipng)

Type      .tar         .tar.xz
Images    11.940 MB    11.187 MB
Total     40.0 MB      24.0 MB

Yes, this means that just running zopflipng and switching to xz could cut about 5 MB from the archive (29.1 MB down to 24.0 MB). That doesn't make it practical, but it's interesting to look at.

FSR breakdown

FSR is a large texture pack using PNGs and ZIP, and my endeavors have shown that the archive format can make as much of a difference as the file format.

"Compressed": zopflipng (with --lossy_transparent -m) and rewritten JSON

Format     Base size       Compressed size
Folder     4.4 MB          2.1 MB
.zip       4.5 - 4.9 MB    3.3 MB
.tar.gz    1.9 - 2.0 MB    885 - 957 kB
.tar.xz    1.7 MB          704 kB
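A comparison like the one in this table can be approximated for any folder with a sketch along these lines. It builds each archive in memory with Python's standard library at default settings, so the numbers will not match those from dedicated tools or other compression levels; the function names are hypothetical.

```python
import io
import tarfile
import zipfile
from pathlib import Path


def list_files(root):
    """Return (archive name, path) pairs for every file under root, sorted."""
    root = Path(root)
    return [
        (p.relative_to(root).as_posix(), p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    ]


def archive_sizes(root):
    """Return {format: size in bytes} for the files under root."""
    files = list_files(root)
    sizes = {"folder": sum(p.stat().st_size for _, p in files)}

    # ZIP: one deflate stream per file.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, path in files:
            zf.write(path, arcname=name)
    sizes[".zip"] = len(buf.getvalue())

    # Tarballs: one compressed stream over all files.
    for label, mode in ((".tar.gz", "w:gz"), (".tar.xz", "w:xz")):
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode=mode) as tf:
            for name, path in files:
                tf.add(path, arcname=name)
        sizes[label] = len(buf.getvalue())
    return sizes
```

Calling archive_sizes on the pack folder before and after running zopflipng would give the two columns, keeping in mind that "folder" here is just the sum of file sizes.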