This is the mail archive of the mailing list for the GCC project.

lrzip: extreme compression (but beware its slow decompression speed)

In case you're evaluating what compression programs to use...

This started off as a comparison of xz and lzip,
but then I added lrzip to the mix.

Sometimes it's useful to have an idea of how far from "ideal"
a compression program is.  I'm not claiming to have the answer,
but merely sharing my surprise at how far off xz and lzip are
when it comes to the size of the compressed result.

I started off by downloading the gcc-4.7.0.tar.bz2 release tarball
and decompressing it, then recompressing using bzip2, lzip, xz and lrzip:
(on a 6/12-core Fedora 17 x86_64 system with plenty of RAM)

  KiB   compression
 size   time m:ss  file name
------  --------   -----------------
514400     NA      gcc-4.7.0.tar
 80588  0:58.12    gcc-4.7.0.tar.bz2 (-9)
 59556  6:16.61    gcc-4.7.0.tar.lz (-9)
 58640  5:55.78    gcc-4.7.0.tar.xz (-9e)
 48876  2:46[*]    gcc-4.7.0.tar.lrz (-z -L8 -w2000)

[*] multi-threaded; I think it had at least 6 or 7 cores busy at one point.
This is using the latest lrzip, v0.47-590-ga9ba55f, from the upstream repo.

The above shows that xz compresses both faster than lzip (by about 5%)
and better (by 916 KiB, or ~1.5%).

It also shows that lrzip compresses extremely well, saving over 9 MiB
(more than 16%) compared to xz run with -9e.
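
The percentages quoted above can be rechecked from the table with a few
lines of shell and awk.  This is only a rough sketch: the helper names
(to_seconds, pct_less, kib_to_mib) are made up for this illustration, and
the inputs are simply the sizes and times copied from the table.

```shell
#!/bin/sh
# Hypothetical helpers for rechecking the figures in the table above.

# Convert an m:ss.ss elapsed time to seconds, e.g. 6:16.61 -> 376.61
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Percentage by which NEW ($2) is smaller than OLD ($1), one decimal place.
pct_less() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Convert KiB to MiB, two decimal places.
kib_to_mib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1024 }'
}

# xz vs lzip: compression time, then compressed size
pct_less "$(to_seconds 6:16.61)" "$(to_seconds 5:55.78)"
pct_less 59556 58640

# lrzip vs xz: size saved in MiB, then as a percentage
kib_to_mib $((58640 - 48876))
pct_less 58640 48876
```

Run as written, this prints 5.5, 1.5, 9.54 and 16.7, which is consistent
with the rounded figures quoted above.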

More importantly, what about decompression speed?
The compression happens relatively rarely, by the person who prepares
a release, but then many people download and decompress the result.

(the following xz and lzip times are each best-of-3)

    $ env time xz -dc gcc-4.7.0.tar.xz > /dev/null

    $ env time --f=%e lzip -dc gcc-4.7.0.tar.lz > /dev/null

    $ env time --f=%e bzip2 -dc gcc-4.7.0.tar.bz2 > /dev/null

    $ ./lrzip -d -o - gcc-4.7.0.tar.lrz > /dev/null
    3:36.12 (note, that's over 3.5 *minutes* to decompress on a 12-core system)

That shows another reason to prefer xz over lzip.
xz decompresses this tarball in 28% less time than lzip.
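
For anyone wanting to repeat the "best-of-3" measurement, a small loop
along these lines should work.  It is a sketch, not the exact procedure
used for the numbers in this message: it assumes GNU time is installed
(for its -f %e format option), and min_time and best_of are hypothetical
helper names.

```shell
#!/bin/sh
# Hypothetical helpers; best_of assumes GNU time is installed (for -f %e).

# Print the numerically smallest of the values read from stdin, one per line.
min_time() {
    awk 'NR == 1 || $1 + 0 < min { min = $1 + 0; best = $1 }
         END { if (NR > 0) print best }'
}

# Run a command N times, discard its output, and print the best
# (lowest) elapsed wall-clock time in seconds.
best_of() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # time writes its report to stderr; keep only its last line
        env time -f %e "$@" 2>&1 >/dev/null | tail -n 1
        i=$((i + 1))
    done | min_time
}

# Example (not run here):
#   best_of 3 xz -dc gcc-4.7.0.tar.xz
#   best_of 3 lzip -dc gcc-4.7.0.tar.lz
```

Taking the minimum rather than the mean reduces the effect of a cold
page cache or other background activity on the first run.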
