This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/60418] [4.9 Regression] 435.gromacs in SPEC CPU 2006 is miscompiled
- From: "hjl.tools at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 06 Mar 2014 21:56:41 +0000
- Auto-submitted: auto-generated
- References: <bug-60418-4 at http dot gcc dot gnu dot org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418
--- Comment #10 from H.J. Lu <hjl.tools at gmail dot com> ---
The sources have many FP loops containing code like:
rsq11 = dx11*dx11+dy11*dy11+dz11*dz11;
When they are compiled with
-O3 -funroll-loops -ffast-math -fwhole-program -flto=jobserver
-fuse-linker-plugin
the LTO input IR contains statements like:
powmult_241 = dy11_71 * dy11_71;
powmult_240 = dz11_72 * dz11_72;
_699 = powmult_240 + powmult_80;
rsq11_77 = _699 + powmult_241;
During the final LTO link, lto1 repeatedly removes a loop preheader
in one pass and adds it back in the next pass. Each removal/re-add
cycle changes the statements to:
powmult_213 = dy11_71 * dy11_71;
_75 = powmult_213 + powmult_80;
powmult_244 = dz11_72 * dz11_72;
rsq11_77 = _75 + powmult_244;
Each such reordering may change the FP result slightly, because
floating-point addition is not associative. These small differences
can accumulate to such a degree that the end result is outside
the tolerance.
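
To illustrate why the order of the additions matters, here is a minimal
sketch in C that mirrors the two orderings shown in the IR dumps above.
The function names and the input values are my own choices for
illustration, not taken from gromacs; the inputs are picked so that the
rounding of the intermediate sum differs between the two orders. The
intermediates are stored in volatile floats so that each step is rounded
to single precision regardless of optimization level or FMA contraction.

```c
#include <stdio.h>

/* Order from the first dump: (dz*dz + dx*dx) + dy*dy.
   Hypothetical helper, for illustration only. */
float rsq_order_a(float dx, float dy, float dz)
{
    volatile float px = dx * dx;   /* plays the role of powmult_80  */
    volatile float py = dy * dy;   /* plays the role of powmult_241 */
    volatile float pz = dz * dz;   /* plays the role of powmult_240 */
    volatile float t = pz + px;    /* _699 */
    volatile float r = t + py;     /* rsq11_77 */
    return r;
}

/* Order from the second dump: (dy*dy + dx*dx) + dz*dz.
   Hypothetical helper, for illustration only. */
float rsq_order_b(float dx, float dy, float dz)
{
    volatile float px = dx * dx;   /* plays the role of powmult_80  */
    volatile float py = dy * dy;   /* plays the role of powmult_213 */
    volatile float pz = dz * dz;   /* plays the role of powmult_244 */
    volatile float t = py + px;    /* _75 */
    volatile float r = t + pz;     /* rsq11_77 */
    return r;
}

/* Print both results with enough digits to see a one-ulp difference. */
void show_orders(float dx, float dy, float dz)
{
    printf("order a: %.9g\n", rsq_order_a(dx, dy, dz));
    printf("order b: %.9g\n", rsq_order_b(dx, dy, dz));
}
```

Calling show_orders(1.0f, 0x1.8p-12f, 0x1p-12f) on an IEEE 754 target with
round-to-nearest-even should print two values that differ by one unit in
the last place. A single such difference is harmless, but in a long
simulation loop the differences can compound.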