Created attachment 24342 [details]
preprocessed source

[forwarded from http://bugs.debian.org/627084]

Seen with r174102 from the 4.6 branch; works with r173903 from the trunk.

Steps to reproduce:

    wget 'http://pari.math.u-bordeaux.fr/~bill/pari-2.4.3.12000.tar.gz'
    tar xf pari-2.4.3.12000.tar.gz
    cd pari-2.4.3.alpha
    ./Configure
    make gp
    make bench

Result: all tests in the test suite fail.

Cause: the function pari_init_parser() in the file src/language/parsec.h is miscompiled. (This file is included by src/language/parse.y.) If you replace line 43:

    s_node.n=OPnboperator;

with

    parsestate_reset();

(which does the same thing), then all tests pass.

It seems that the issue is that the function stack_alloc() is not inlined correctly, which causes pari_tree to be NULL (or maybe the call to pari_inline inside stack_alloc() is not inlined correctly).

The command line used is:

    gcc-4.6 -c -O3 -Wall -fno-strict-aliasing -fomit-frame-pointer -I. -I../src/headers -fPIC -o parse.o ../src/language/parse.c

It also happens with -O2, but not with -O3 -fno-inline. It works fine with gcc 4.3, 4.4 and 4.5.
stack_init computes &pari_tree - &s_node, which is undefined because the two are unrelated objects; stack_alloc then re-computes one of them via stack_base. That's broken as well. Not sure if this eventually causes the issue, but the code is certainly full of C implementation details that you can't express in standard C.
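To illustrate the point about the pointer difference, here is a minimal sketch of a growable-array descriptor that stays within standard C. It is not PARI's code: the names gstack, gstack_init and gstack_new are hypothetical, and the layout is an assumption made for illustration. Instead of remembering the distance between two unrelated objects, the descriptor keeps a real pointer to the caller's array pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor (not PARI's actual type). It stores the address
 * of the caller's array pointer, never a difference between the addresses
 * of two unrelated objects. */
typedef struct {
    void **data;       /* address of the caller's "void *" array pointer */
    size_t elem_size;  /* size of one element in bytes */
    size_t used;       /* number of elements handed out so far */
    size_t capacity;   /* number of elements currently allocated */
} gstack;

static void gstack_init(gstack *s, size_t elem_size, void **data)
{
    assert(elem_size > 0);
    s->data = data;
    s->elem_size = elem_size;
    s->used = 0;
    s->capacity = 0;
    *data = NULL;      /* nothing allocated yet */
}

/* Reserve one more element and return its index,
 * or (size_t)-1 if the allocation fails. */
static size_t gstack_new(gstack *s)
{
    if (s->used == s->capacity) {
        size_t new_cap = s->capacity ? 2 * s->capacity : 4;
        void *p = realloc(*s->data, new_cap * s->elem_size);
        if (p == NULL)
            return (size_t)-1;
        *s->data = p;  /* update the caller's pointer through a valid pointer */
        s->capacity = new_cap;
    }
    return s->used++;
}
```

The caller's pointer is deliberately declared as void * and cast at the point of use (for example ((int *)tree)[i]), so the sketch does not depend on -fno-strict-aliasing either.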
Maybe related to PR49330.
It is triggered by revision 158045: http://gcc.gnu.org/ml/gcc-cvs/2010-04/msg00148.html
GCC 4.6.1 is being released.
(In reply to comment #4)
> GCC 4.6.1 is being released.

I see a similar bug with both gcc 4.6.0 and 4.6.1. In the library Crypto++ (v. 5.6.1, http://www.cryptopp.com/), the algorithm Salsa20 produces wrong outputs when compiling with -O, -O2 or -O3, but the bug disappears as soon as -fno-inline is added. The output is always correct with gcc 4.5.x. These results were obtained on a 64-bit Linux platform (Ubuntu 11.04). I cannot post the source code that produces the errors; I do not have a reduced test case.
This PR lacks an executable testcase for easy verification of the bug. Thus, can people try with the fix for PR49651 installed?
the pari tests still fail
Created attachment 24790 [details] test case with Salsa20 in Crypto++
(In reply to comment #8)
> Created attachment 24790 [details]
> test case with Salsa20 in Crypto++

Sorry about my partial comment. I ran the test case against the gcc 4.6.1 sources plus the patch for PR49651 (applied to tree-ssa-structalias.c as found at http://gcc.gnu.org/viewcvs?view=revision&revision=176274); it still does not work.
Can you attach preprocessed source of the Salsa20 testcase please?
Created attachment 24793 [details] the preprocessed source of salsa20 from Crypto++ with gcc 4.5.1, option -O2
Created attachment 24794 [details] the preprocessed source of Salsa20 from Crypto++, with gcc 4.6.0, option -O2
(In reply to comment #12)
> Created attachment 24794 [details]
> the preprocessed source of Salsa20 from Crypto++, with gcc 4.6.0, option -O2

I just discovered that the bug is present only when Crypto++ is compiled with NDEBUG defined, which is not the case in the preprocessed files above. I will re-post updated files (output of the whole compilation of the test case with -save-temps).
Created attachment 24796 [details]
full testcase source with required files from Crypto++ 5.6.1 and build command

The (slightly modified) testcase with Crypto++ 5.6.1, this time self-contained. All files except gcc_pr49140.cpp are unmodified from Crypto++. The build command is in build.sh, with option -save-temps.
On Tue, 19 Jul 2011, grokbrsm at free dot fr wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49140
>
> --- Comment #12 from Sébastien Kunz-Jacques <grokbrsm at free dot fr> 2011-07-19 16:14:29 UTC ---
> Created attachment 24794 [details]
>   --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24794
> the preprocessed source of Salsa20 from Crypto++, with gcc 4.6.0, option -O2

Hm, unfortunately these don't seem to be self-contained (they fail to link).
Confirmed. Works with -O0, fails with -O[12] at least. Still fails on the 4.6 branch. Compiling salsa.cpp with -O1 is enough to trigger the error, compiling salsa.cpp with -O0 is enough to mitigate it.
(In reply to comment #16)
> Confirmed. Works with -O0, fails with -O[12] at least. Still fails on the
> 4.6 branch.
>
> Compiling salsa.cpp with -O1 is enough to trigger the error, compiling
> salsa.cpp with -O0 is enough to mitigate it.

Yes, the wrong code is most probably generated in the method Salsa20_Policy::OperateKeystream of salsa.cpp.
The inline asm in that function is invalid:

    :
    : "r" (m_rounds), "r" (input), "r" (iterationCount), "r" (m_state.data()), "r" (output), "r" (workspace.m_ptr)
    : "%eax", "%edx", "memory", "cc", "%xmm0", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "%xmm5", "%xmm6", "%xmm7", "%xmm8", "%xmm9", "%xmm10", "%xmm11",

It tells the compiler that the asm only reads the 6 input registers, while it actually modifies 3 of them; e.g. the asm string contains:

    "add %1" ", " "1*16" ";"
    "sub %2" ", " "4" ";"
    "add %4" ", " "1*16" ";"

GCC may assume that it will find the old content in the register after the inline asm, and will find something completely different there. For the inputs that are clobbered, the pattern should use something like:

    void *dummy1;
    ...
    asm volatile ("..." : "=r" (dummy1) : "0" (input_value));

to tell the compiler that the register no longer holds the input value afterwards.
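A minimal sketch of the two usual ways to declare a modified operand correctly, assuming GCC-style extended asm on x86-64 in the default AT&T syntax (the Crypto++ code uses Intel syntax). The function names advance16 and count_after_asm are hypothetical, and a plain C fallback is provided for other targets so the file still compiles there.

```c
#include <assert.h>
#include <stddef.h>

/* Advance a pointer by 16 bytes inside inline asm.
 * "+r" declares the operand as both read and written, so the compiler
 * knows the register holds a new value after the asm. */
static const unsigned char *advance16(const unsigned char *p)
{
    assert(p != NULL);
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile ("add $16, %0" : "+r" (p) : : "cc");
#else
    p += 16;  /* portable fallback for other targets */
#endif
    return p;
}

/* Dummy-output form from the comment above: the input is tied to the
 * output register with the matching constraint "0". The asm may then
 * modify that register freely; the compiler keeps its own copy of
 * `count` if the original value is still needed afterwards. */
static size_t count_after_asm(size_t count)
{
    size_t scratch;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile ("sub $4, %0" : "=r" (scratch) : "0" (count) : "cc");
#else
    scratch = count - 4;  /* portable fallback for other targets */
#endif
    /* `count` is still the original value here; `scratch` is count - 4. */
    return count - scratch;
}
```

Calling count_after_asm(10) should give 4, because the compiler still sees the original count after the asm. With a plain "r" input that the asm modifies behind the compiler's back, that is exactly the assumption that breaks.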
And that isn't the only bug in it. The inline asm also performs calls (which modify the bytes right below the stack pointer), but when this function is compiled with -O1 and above (and without -fno-inline), the function itself makes no function calls, so it happily uses the red zone, and the calls embedded in the inline asm clobber the red zone. Adding "sub rsp, 128;" and "add rsp, 128;" around the whole inline asm content (well, inside the Intel-syntax part) fixes the testcase (of course, as written in the previous comment, the asm is still broken).
Ah, the crypto++ comments were just hijacking an unrelated bug for which no details have been provided. Please don't do this.
(In reply to comment #20)
> Ah, the crypto++ comments were just hijacking an unrelated bug for which no
> details have been provided. Please don't do this.

Well, the symptoms looked similar: the error also disappears with -fno-inline, so it matches the bug report header pretty well. I'll file a bug report for Crypto++.
Waiting for a testcase.
Created attachment 25516 [details]
Small test case with invalid code exhibiting the problem

Here's a small test case with invalid code showing the problem with several gcc versions, going back at least to 4.5. Compiling with -fno-tree-pta makes it behave as "expected".

I do not believe the compiler is at fault here. PARI is clearly full of undefined behaviours, which they really ought to fix rather than complaining that doing so would change the ABI and blaming the compiler, which is only doing a good job of following the spec.
GCC 4.6.2 is being released.
GCC 4.6.3 is being released.
The 4.6 branch has been closed, fixed in GCC 4.7.0.