This is the mail archive of the mailing list for the GCC project.


Fw: xz instead of bzip2

Begin forwarded message:

Date: Mon, 5 Jun 2017 20:17:57 -0300
From: Matias Fonzo <>
To: R0b0t1 <>
Subject: Re: xz instead of bzip2

On Mon, 5 Jun 2017 15:44:25 -0500
R0b0t1 <> wrote:

> On Mon, Jun 5, 2017 at 1:08 PM, Matias Fonzo <>
> wrote:  
> > Dear GCC developers,
> >
> > What is happening here?
> >
> > "Weekly snapshots now use xz compression [2017-05-24]
> >     ...instead of bzip2."
> >
> > Are you aware that a better implementation / format exists for this
> > purpose?
> >
> >
> >
> > Review of xz:
> >
> >
> >
> >
> > Please do the right thing.
> >    
> Hello,
> That article is rather interesting but unfortunately it does not
> compare and contrast. It only lists facts of XZ but does so mostly in
> a vacuum, so I can't really tell whether or not those things are
> actually bad without trusting the author. Typically I would have no
> trouble doing this and would want to follow up on the author's claims,
> but in this case (the article is associated with lzip) I'm more wary
> of spending time doing that.
> In my experience the XZ format is a good fit for releases: compression
> takes longer than decompression (but gives a better ratio), and a
> release is compressed once and decompressed many times. Section 2.7 is
> kind of interesting in that case, as it claims that LZMA2 has a lower
> compression ceiling than LZMA. I don't know how far typical
> compression strays from the ideal.
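
For anyone who wants to check the compression-versus-decompression trade-off described above on their own data, here is a minimal sketch using Python's standard bz2 and lzma modules. The function names (measure, compare) and the sample input are made up for illustration, and both compressors run at their library defaults. Results depend heavily on the input and the presets, so this is a way to take your own measurement, not a benchmark.

```python
import bz2
import lzma
import time


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the ratio."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    unpacked = decompress(packed)
    t2 = time.perf_counter()
    if unpacked != data:
        raise ValueError(f"{name}: round trip did not reproduce the input")
    return {
        "name": name,
        "ratio": len(data) / len(packed),  # higher means smaller output
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def compare(data):
    """Run the same input through bzip2 and xz at library defaults."""
    return [
        measure("bzip2", bz2.compress, bz2.decompress, data),
        measure(
            "xz",
            lambda d: lzma.compress(d, format=lzma.FORMAT_XZ),
            lzma.decompress,
            data,
        ),
    ]


if __name__ == "__main__":
    # Artificial, highly repetitive sample; substitute a real tarball's bytes.
    sample = b"int main(void) { return 0; }\n" * 50000
    for row in compare(sample):
        print(
            f'{row["name"]:>5}: ratio {row["ratio"]:.1f}, '
            f'compress {row["compress_s"]:.4f}s, '
            f'decompress {row["decompress_s"]:.4f}s'
        )
```

Reading a real snapshot tarball into `sample` gives figures closer to the release case than the synthetic input does.
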
> 2.9 is fairly minor but I do agree with it. It and the later sections
> lead me to my next point: most of the discussion seems to be on the
> failure of XZ's format to provide proper error checking. However, in
> practice I have not found anyone who relies on the error checking
> provided by compressed data formats, and have had it suggested that
> doing so is an exercise in futility. That most software projects
> provide digests of their source archives seems to agree with this.
> Trying to provide error checking within the archive itself falls prey
> to the Two Generals' Problem
> ('_Problem) and, based on the
> lzip author's argument, error checking should probably be removed from
> both formats.
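
The practice mentioned above, checking a downloaded archive against a digest published separately by the project, can be sketched as follows. This assumes the project publishes a SHA-256 hex digest; the function names and the file name in the usage comment are hypothetical, and nothing here is specific to how GCC publishes its snapshots.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare the file's digest with the published one.

    Whitespace and letter case in the published value are ignored.
    """
    return sha256_of(path) == published_hex.strip().lower()


# Hypothetical usage; the file name and digest are placeholders:
#   ok = matches_published_digest("snapshot.tar.xz", "<published sha256 hex>")
```

This check is independent of the container format, so it works the same whether the archive is .tar.bz2, .tar.xz, or .tar.lz.
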
> While I'm not sure this should reflect on the author, I am not able to
> understand his conclusion in 2.10, 2.10.3, and 2.10.4. It doesn't seem
> to be explained how he is detecting false positives (the given
> formulas describe what he is doing with the false positive rate after
> it is found). Thus to me it seems entirely likely he might be
> interpreting things backwards; CRC algorithms actually pass through a
> large number of errors compared to digest algorithms, so figure 3
> makes no sense. Besides that I can't really find out how it is
> relevant to the main argument.
> Mr. Fonzo, if you think this is worth the consideration of the GCC
> developers, you might consider contacting the author of that web page
> so that he is able to explain it. However that is still predicated on
> one of the developers wanting to entertain a discussion.
> R0b0t1.  

Attachment: pgpUx4O_aXbgg.pgp
Description: OpenPGP digital signature
