extern "C" void abort(void); unsigned bar(void) { return 32768;} void foobar (unsigned u) { if (u != (32768 | 65536)) abort (); } const char *name (void) { return "QPopupMenu"; } int main() { unsigned nStyle = bar (); const char *pClassName = name (); if (__builtin_strcmp ("QPopupMenu", pClassName) == 0) { bool is_enabled = bool (nStyle & 1); if (nStyle & 32768) nStyle |= 65536; foobar(nStyle); } return 0; } is miscompiled at -O1 and -O2 on ppc32 (ppc64 is ok).
SVN head r117970 fails as well as current 4.1 head.
Shorter testcase for gcc.c-torture/execute: extern void abort(void); unsigned int bar(void) { return 32768; } int main() { unsigned int nStyle = bar (); if (nStyle & 32768) nStyle |= 65536; if (nStyle != (32768 | 65536)) abort (); return 0; }
combine combines (insn 10 8 11 2 (parallel [ (set (reg:SI 121) (and:SI (reg/v:SI 119 [ nStyle ]) (const_int 32768 [0x8000]))) (clobber (scratch:CC)) ]) 131 {andsi3} (insn_list:REG_DEP_TRUE 8 (nil)) (expr_list:REG_UNUSED (scratch:CC) (nil))) (insn 11 10 12 2 (set (reg:CC 122) (compare:CC (reg:SI 121) (const_int 0 [0x0]))) 456 {*cmpsi_internal1} (insn_list:REG_DEP_TRUE 10 (nil)) (expr_list:REG_DEAD (reg:SI 121) (nil))) to (note 10 8 11 2 NOTE_INSN_DELETED) (insn 11 10 12 2 (parallel [ (set (reg:CC 122) (compare:CC (zero_extract:SI (reg/v:SI 119 [ nStyle ]) (const_int 1 [0x1]) (const_int 16 [0x10])) (const_int 0 [0x0]))) (clobber (scratch:SI)) ]) 157 {*extzvsi_internal1} (insn_list:REG_DEP_TRUE 8 (nil)) (expr_list:REG_UNUSED (scratch:SI) (nil))) then ifcvt comes along and produces IF-THEN block found, pass 1, start block 2 [insn 5], then 3 [14], join 4 [17] Conversion succeeded on pass 1. IF-THEN block found, pass 1, start block 2 [insn 5], then 5 [24], join 6 [28] IF-CASE-2 found, start 2, else 6 deleted 1 dead insns 3 possible IF blocks searched. 1 IF blocks converted. 2 true changes made. (insn 8 7 10 2 (set (reg/v:SI 119 [ nStyle ]) (reg:SI 3 3)) 321 {*movsi_internal1} (insn_list:REG_DEP_TRUE 7 (nil)) (expr_list:REG_DEAD (reg:SI 3 3) (nil))) (note 10 8 20 2 NOTE_INSN_DELETED) (insn 20 10 21 2 (set (reg:SI 123) (const_int 98304 [0x18000])) 321 {*movsi_internal1} (nil) (nil)) split then splits the constant load of reg:SI 123 (!?) so we end up with main: stwu %r1,-16(%r1) mflr %r0 stw %r0,20(%r1) bl bar lis %r0,0x1 ori %r0,%r0,32768 cmpw %cr7,%r3,%r0 beq+ %cr7,.L6 bl abort .L6: li %r3,0 lwz %r0,20(%r1) mtlr %r0 addi %r1,%r1,16 blr and indeed, disabling if-conversion fixes the problem.
/* Optimize away "if (x & C) x |= C" and similar bit manipulation transformations. */ static int noce_try_bitop (struct noce_if_info *if_info) { ... /* ??? We could also handle AND here. */ if (GET_CODE (cond) == ZERO_EXTRACT) { if (XEXP (cond, 1) != const1_rtx || GET_CODE (XEXP (cond, 2)) != CONST_INT || ! rtx_equal_p (x, XEXP (cond, 0))) return FALSE; bitnum = INTVAL (XEXP (cond, 2)); mode = GET_MODE (x); if (bitnum >= HOST_BITS_PER_WIDE_INT) return FALSE; ... which is wrong for little-endian bitfieds ... /* if ((x & C) == 0) x |= C; is transformed to x |= C. */ /* if ((x & C) != 0) x |= C; is transformed to nothing. */ if (GET_CODE (a) == IOR) result = (code == NE) ? a : NULL_RTX; else if (code == NE) { /* if ((x & C) == 0) x ^= C; is transformed to x |= C. */ result = gen_int_mode ((HOST_WIDE_INT) 1 << bitnum, mode); result = simplify_gen_binary (IOR, mode, x, result); } else { /* if ((x & C) != 0) x ^= C; is transformed to x &= ~C. */ result = gen_int_mode (~((HOST_WIDE_INT) 1 << bitnum), mode); result = simplify_gen_binary (AND, mode, x, result); }
In case someone is still doubtful. :-)
Note that we still cannot do this transformation on the tree-level. There were some patches for that though: http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00390.html
Just from looking at various places which handle ZERO_EXTRACT this seems to by used highly inconsistent. E.g.: rtlanal:nonzero_bits1: Doesn't look at BITS_BIG_ENDIAN or BYTES_BIG_ENDIAN at all, but does use the bitpos to generate a mask. combine.c:find_split_point: When the destination is a zero_extract, it does adjust the bitpos when BITS_BIG_ENDIAN, but it doesn't look at BYTES_BIT_ENDIAN at all. ifcvt.c:noce_emit_move_insn: Does look at both BITS_BIG_ENDIAN _and_ BYTES_BIG_ENDIAN, and only adjusts the bitpos when both are different. A comment indicates that this is required because store_bit_field() handles the bitpos even more inconsistently :-/ So ifcvt.c:noce_try_bitop (where we think the bug is) actually uses the same method as rtlanal:nonzero_bits1, so either both are wrong or both are right. From the documentation I would be inclined to think that both are wrong. But in that case I bet there are even more places which need a carefull look :-/
At least this patch fixes the bug at hand, but I'm sceptical if by chance or for real: Index: ifcvt.c =================================================================== --- ifcvt.c (revision 118648) +++ ifcvt.c (working copy) @@ -1942,6 +1942,8 @@ noce_try_bitop (struct noce_if_info *if_ return FALSE; bitnum = INTVAL (XEXP (cond, 2)); mode = GET_MODE (x); + if (BITS_BIG_ENDIAN) + bitnum = GET_MODE_BITSIZE (mode) - 1 - bitnum; if (bitnum >= HOST_BITS_PER_WIDE_INT) return FALSE; }
(In reply to comment #7) > combine.c:find_split_point: When the destination is a zero_extract, it does > adjust the bitpos when BITS_BIG_ENDIAN, but it doesn't look at > BYTES_BIT_ENDIAN at all. I think this is ok, because when zero/sign_extact operates on memory, its mode must be a single-byte integer mode.
Yes, I think all uses in combine.c are okay. In addition also the occurrence in rtlanal.c is okay, as it doesn't use the bitpos, but the width in bits to generate the mask, I just misread that part. I now looked at all other occurrences of ZERO_EXTRACT in gcc/*.c (not the machdep files or at other languages, though), and I didn't find any other which were clearly wrong.
This is a regression because I know that code is new, Roger added it ....
PR 9814.
Grr. Yep, Michael's BITS_BIG_ENDIAN test looks to be the correct thing. The effect of BITS_BIG_ENDIAN on sign_extract and zero_extract is documented in rtl.texi. Is anyone bootstrapping and regression testing this change or shall I do the honors. In fact, I'll pre-approve it.
I have started a bootstrap & regtest on {i386,ia64,ppc,ppc64,s390,s390x,x86_64}-linux on the 4.1 branch.
The patch fails bootstrap in stage2 for ppc (only). stage2/xgcc -Bstage2/ -B/usr/powerpc64-suse-linux/bin/ -c -O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -U_FORTIFY_SOURCE -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wold-style-definition -Wmissing-format-attribute -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include ../../gcc/except.c -o except.o ../../gcc/df.c: In function 'df_analyze_subcfg': ../../gcc/df.c:2550: internal compiler error: Segmentation fault Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://bugs.opensuse.org> for instructions. make[2]: *** [df.o] Error 1 or stage2/xgcc -Bstage2/ -B/usr/powerpc64-suse-linux/bin/ -c -O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -U_FORTIFY_SOURCE -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wold-style-definition -Wmissing-format-attribute -Wno-error -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -DSTANDARD_STARTFILE_PREFIX=\"../../../\" -DSTANDARD_EXEC_PREFIX=\"/usr/lib/gcc/\" -DSTANDARD_LIBEXEC_PREFIX=\"/usr/lib/gcc/\" -DDEFAULT_TARGET_VERSION=\"4.1.2\" -DDEFAULT_TARGET_MACHINE=\"powerpc64-suse-linux\" -DSTANDARD_BINDIR_PREFIX=\"/usr/bin/\" -DTOOLDIR_BASE_PREFIX=\"../../../../\" `test "X${SHLIB_LINK}" = "X" || test "yes" != "yes" || echo "-DENABLE_SHARED_LIBGCC"` `test "X${SHLIB_MULTILIB}" = "X" || echo "-DNO_SHARED_LIBGCC_MULTILIB"` \ -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include ../../gcc/java/jvspec.c -o jvspec.o) ../../gcc/java/jvspec.c:61: warning: string length '834' is greater than the length '509' ISO C89 compilers are required to support ../../gcc/java/jcf-write.c: In function 'write_classfile': ../../gcc/java/jcf-write.c:3558: internal compiler error: Segmentation fault Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://bugs.opensuse.org> for instructions.
(In reply to comment #15) > The patch fails bootstrap in stage2 for ppc (only). This bootstraps just fine for me on the mainline with powerpc-darwin.
(In reply to comment #16) > (In reply to comment #15) > > The patch fails bootstrap in stage2 for ppc (only). > This bootstraps just fine for me on the mainline with powerpc-darwin. And there were no regressions.
Subject: Bug 29797 Author: sayle Date: Mon Nov 13 00:41:53 2006 New Revision: 118740 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118740 Log: 2006-11-12 Michael Matz <matz@suse.de> Roger Sayle <roger@eyesopen.com> PR rtl-optimization/29797 * ifcvt.c (noce_try_bitop): Correct calculation of bitnum on BITS_BIG_ENDIAN targets. * gcc.c-torture/execute/pr29797-1.c: New test case. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/ifcvt.c trunk/gcc/testsuite/ChangeLog
Bootstrapping / regtesting a backported patch (to 4.1) is fine on ppc.
Subject: Bug 29797 Author: rguenth Date: Wed Nov 15 08:11:59 2006 New Revision: 118846 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118846 Log: 2006-11-15 Richard Guenther <rguenther@suse.de> Backport from mainline: 2006-11-12 Michael Matz <matz@suse.de> Roger Sayle <roger@eyesopen.com> PR rtl-optimization/29797 * ifcvt.c (noce_try_bitop): Correct calculation of bitnum on BITS_BIG_ENDIAN targets. * gcc.c-torture/execute/pr29797-1.c: New test case. Added: branches/gcc-4_1-branch/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c - copied unchanged from r118740, trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c Modified: branches/gcc-4_1-branch/gcc/ChangeLog branches/gcc-4_1-branch/gcc/ifcvt.c branches/gcc-4_1-branch/gcc/testsuite/ChangeLog
Subject: Bug 29797 Author: rguenth Date: Wed Nov 15 08:37:38 2006 New Revision: 118847 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118847 Log: 2006-11-15 Richard Guenther <rguenther@suse.de> Backport from mainline: 2006-11-12 Michael Matz <matz@suse.de> Roger Sayle <roger@eyesopen.com> PR rtl-optimization/29797 * ifcvt.c (noce_try_bitop): Correct calculation of bitnum on BITS_BIG_ENDIAN targets. * gcc.c-torture/execute/pr29797-1.c: New test case. Added: branches/gcc-4_2-branch/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c - copied unchanged from r118740, trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c Modified: branches/gcc-4_2-branch/gcc/ChangeLog branches/gcc-4_2-branch/gcc/ifcvt.c branches/gcc-4_2-branch/gcc/testsuite/ChangeLog
And 4.2. Fixed.