The following code is wrongly compiled on a mips64el target when using -fexpensive-optimizations: #include <inttypes.h> #include <stdio.h> int compare(uint64_t state, uint32_t *last, uint8_t buf) { if (*last == ((state | buf) & 0xFFFFFFFF)) { printf("B %"PRIx64" %"PRIx32"\n", state, *last); return 0; } return 1; } It produces the following code: 00000000000009a0 <compare>: 9a0: 30c200ff andi v0,a2,0xff 9a4: 8ca60000 lw a2,0(a1) 9a8: 00441025 or v0,v0,a0 9ac: 10460004 beq v0,a2,9c0 <compare+0x20> 9b0: 24020001 li v0,1 9b4: 03e00008 jr ra 9b8: 00000000 nop 9bc: 00000000 nop ... Note how the comparison is done incorrectly, dropping the & 0xFFFFFFFF and sign-extending the value when loading *last. Using -fno-expensive-optimizations produces the following valid code: 00000000000009a0 <compare>: 9a0: 30c200ff andi v0,a2,0xff 9a4: 8ca60000 lw a2,0(a1) 9a8: 00441025 or v0,v0,a0 9ac: 7cc3f803 dext v1,a2,0x0,0x20 9b0: 7c42f803 dext v0,v0,0x0,0x20 9b4: 10620004 beq v1,v0,9c8 <compare+0x28> 9b8: 24020001 li v0,1 9bc: 03e00008 jr ra 9c0: 00000000 nop 9c4: 00000000 nop 9c8: 67bdfff0 daddiu sp Here both values are zero extended to 32-bit before the comparison. Alternatively, when removing the line with the printf, GCC also produces correct code, which is more optimized: 0000000000000960 <compare>: 960: 30c600ff andi a2,a2,0xff 964: 00c42025 or a0,a2,a0 968: 9ca20000 lwu v0,0(a1) 96c: 7c86f803 dext a2,a0,0x0,0x20 970: 00c21026 xor v0,a2,v0 974: 03e00008 jr ra 978: 0002102b sltu v0,zero,v0 97c: 00000000 nop Note that at least gcc 4.8.5, 4.9.3 and 5.2.1 are affected.
Can you try the patch in https://gcc.gnu.org/ml/gcc-patches/2012-05/msg00401.html ? I never got around to updating it for the comments.
Thanks for the patch. I have just tested it, and it indeed fixes the issue from the testcase. I am currently trying to build ffmpeg, from which the testcase has been extracted to see if all the issues are really fixed.
The ffmpeg testsuite is successful on mips64el-linux-gnu with this patch applied.
Author: sje Date: Fri Oct 23 15:56:15 2015 New Revision: 229259 URL: https://gcc.gnu.org/viewcvs?rev=229259&root=gcc&view=rev Log: 2015-10-23 Steve Ellcey <sellcey@imgtec.com> Andrew Pinski <apinski@cavium.com> PR rtl-optimization/67736 * combine.c (simplify_comparison): Use gen_lowpart_or_truncate instead of gen_lowpart. Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c
Author: sje Date: Fri Oct 23 15:58:33 2015 New Revision: 229260 URL: https://gcc.gnu.org/viewcvs?rev=229260&root=gcc&view=rev Log: 2015-10-23 Steve Ellcey <sellcey@imgtec.com> Andrew Pinski <apinski@cavium.com> PR rtl-optimization/67736 * gcc.dg/torture/pr67736.c: New test. * gcc.dg/combine-subregs.c: New test. Added: trunk/gcc/testsuite/gcc.dg/combine-subregs.c trunk/gcc/testsuite/gcc.dg/torture/pr67736.c Modified: trunk/gcc/testsuite/ChangeLog
Author: sje Date: Mon Nov 2 21:04:33 2015 New Revision: 229677 URL: https://gcc.gnu.org/viewcvs?rev=229677&root=gcc&view=rev Log: 2015-11-02 Steve Ellcey <sellcey@imgtec.com> Backport from mainline 2015-10-23 Steve Ellcey <sellcey@imgtec.com> Andrew Pinski <apinski@cavium.com> PR rtl-optimization/67736 * combine.c (simplify_comparison): Use gen_lowpart_or_truncate instead of gen_lowpart. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/combine.c
Author: sje Date: Mon Nov 2 21:08:02 2015 New Revision: 229678 URL: https://gcc.gnu.org/viewcvs?rev=229678&root=gcc&view=rev Log: 2015-11-02 Steve Ellcey <sellcey@imgtec.com> 2015-10-23 Steve Ellcey <sellcey@imgtec.com> Andrew Pinski <apinski@cavium.com> PR rtl-optimization/67736 * gcc.dg/torture/pr67736.c: New test. * gcc.dg/combine-subregs.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.dg/combine-subregs.c branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr67736.c Modified: branches/gcc-5-branch/gcc/testsuite/ChangeLog
Patch checked in on ToT for 6.0 and on 5.* branch.