Bug 29797

Summary: [4.1/4.2 Regression] Miscompiles bit test / set in OpenOffice
Product: gcc Reporter: Richard Biener <rguenth>
Component: rtl-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: blocker CC: dje, ebotcazou, fang, gcc-bugs, matz, pinskia, sayle
Priority: P1 Keywords: wrong-code
Version: 4.1.2   
Target Milestone: 4.1.2   
Host: Target: powerpc32-*-*
Build: Known to work: 4.3.0
Known to fail: 4.1.2 Last reconfirmed: 2006-11-10 15:01:30
Bug Depends on:    
Bug Blocks: 9814    

Description Richard Biener 2006-11-10 12:59:35 UTC
extern "C" void abort(void);
unsigned bar(void) { return 32768;}
void foobar (unsigned u) { if (u != (32768 | 65536)) abort (); }
const char *name (void) { return "QPopupMenu"; }
int main()
{
  unsigned nStyle = bar ();
  const char *pClassName = name ();
  if (__builtin_strcmp ("QPopupMenu", pClassName) == 0)
   {
    bool is_enabled = bool (nStyle & 1);
    if (nStyle & 32768)
      nStyle |= 65536;
    foobar(nStyle);
   }
  return 0;
}

is miscompiled at -O1 and -O2 on ppc32 (ppc64 is ok).
Comment 1 Richard Biener 2006-11-10 13:01:11 UTC
SVN head r117970 fails as well as current 4.1 head.
Comment 2 Richard Biener 2006-11-10 13:10:57 UTC
Shorter testcase for gcc.c-torture/execute:

extern  void abort(void);
unsigned int bar(void) { return 32768; }
int main()
{
  unsigned int nStyle = bar ();
  if (nStyle & 32768)
    nStyle |= 65536;
  if (nStyle != (32768 | 65536))
    abort ();
  return 0;
}
Comment 3 Richard Biener 2006-11-10 13:28:02 UTC
combine combines

(insn 10 8 11 2 (parallel [
            (set (reg:SI 121)
                (and:SI (reg/v:SI 119 [ nStyle ])
                    (const_int 32768 [0x8000])))
            (clobber (scratch:CC))
        ]) 131 {andsi3} (insn_list:REG_DEP_TRUE 8 (nil))
    (expr_list:REG_UNUSED (scratch:CC)
        (nil)))

(insn 11 10 12 2 (set (reg:CC 122)
        (compare:CC (reg:SI 121)
            (const_int 0 [0x0]))) 456 {*cmpsi_internal1} (insn_list:REG_DEP_TRUE 10 (nil))
    (expr_list:REG_DEAD (reg:SI 121)
        (nil)))

to

(note 10 8 11 2 NOTE_INSN_DELETED)

(insn 11 10 12 2 (parallel [
            (set (reg:CC 122)
                (compare:CC (zero_extract:SI (reg/v:SI 119 [ nStyle ])
                        (const_int 1 [0x1])
                        (const_int 16 [0x10]))
                    (const_int 0 [0x0])))
            (clobber (scratch:SI))
        ]) 157 {*extzvsi_internal1} (insn_list:REG_DEP_TRUE 8 (nil))
    (expr_list:REG_UNUSED (scratch:SI)
        (nil)))

then ifcvt comes along and produces

IF-THEN block found, pass 1, start block 2 [insn 5], then 3 [14], join 4 [17]
Conversion succeeded on pass 1.

IF-THEN block found, pass 1, start block 2 [insn 5], then 5 [24], join 6 [28]

IF-CASE-2 found, start 2, else 6
deleted 1 dead insns

3 possible IF blocks searched.
1 IF blocks converted.
2 true changes made.

(insn 8 7 10 2 (set (reg/v:SI 119 [ nStyle ])
        (reg:SI 3 3)) 321 {*movsi_internal1} (insn_list:REG_DEP_TRUE 7 (nil))
    (expr_list:REG_DEAD (reg:SI 3 3)
        (nil)))

(note 10 8 20 2 NOTE_INSN_DELETED)

(insn 20 10 21 2 (set (reg:SI 123)
        (const_int 98304 [0x18000])) 321 {*movsi_internal1} (nil)
    (nil))

split then splits the constant load of reg:SI 123 (!?) so we end up with

main:
        stwu %r1,-16(%r1)
        mflr %r0
        stw %r0,20(%r1)
        bl bar
        lis %r0,0x1
        ori %r0,%r0,32768
        cmpw %cr7,%r3,%r0
        beq+ %cr7,.L6
        bl abort
.L6:
        li %r3,0
        lwz %r0,20(%r1)
        mtlr %r0
        addi %r1,%r1,16
        blr

and indeed, disabling if-conversion fixes the problem.
Comment 4 Richard Biener 2006-11-10 14:35:17 UTC
/* Optimize away "if (x & C) x |= C" and similar bit manipulation
   transformations.  */

static int
noce_try_bitop (struct noce_if_info *if_info)
{
...
  /* ??? We could also handle AND here.  */
  if (GET_CODE (cond) == ZERO_EXTRACT)
    {
      if (XEXP (cond, 1) != const1_rtx
          || GET_CODE (XEXP (cond, 2)) != CONST_INT
          || ! rtx_equal_p (x, XEXP (cond, 0)))
        return FALSE;
      bitnum = INTVAL (XEXP (cond, 2));
      mode = GET_MODE (x);
      if (bitnum >= HOST_BITS_PER_WIDE_INT) 
        return FALSE;
... which is wrong for little-endian bitfieds ...
      /* if ((x & C) == 0) x |= C; is transformed to x |= C.   */
      /* if ((x & C) != 0) x |= C; is transformed to nothing.  */
      if (GET_CODE (a) == IOR)
        result = (code == NE) ? a : NULL_RTX;
      else if (code == NE)
        {
          /* if ((x & C) == 0) x ^= C; is transformed to x |= C.   */
          result = gen_int_mode ((HOST_WIDE_INT) 1 << bitnum, mode);
          result = simplify_gen_binary (IOR, mode, x, result);
        }
      else
        {
          /* if ((x & C) != 0) x ^= C; is transformed to x &= ~C.  */
          result = gen_int_mode (~((HOST_WIDE_INT) 1 << bitnum), mode);
          result = simplify_gen_binary (AND, mode, x, result);
        }
Comment 5 Eric Botcazou 2006-11-10 15:01:30 UTC
In case someone is still doubtful. :-)
Comment 6 Richard Biener 2006-11-10 15:45:46 UTC
Note that we still cannot do this transformation on the tree-level.  There were some patches for that though:
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00390.html
Comment 7 Michael Matz 2006-11-10 15:47:32 UTC
Just from looking at various places which handle ZERO_EXTRACT this seems to
by used highly inconsistent.  E.g.:

rtlanal:nonzero_bits1: Doesn't look at BITS_BIG_ENDIAN or BYTES_BIG_ENDIAN at
all, but does use the bitpos to generate a mask.

combine.c:find_split_point: When the destination is a zero_extract, it does
adjust the bitpos when BITS_BIG_ENDIAN, but it doesn't look at
BYTES_BIT_ENDIAN at all.

ifcvt.c:noce_emit_move_insn: Does look at both BITS_BIG_ENDIAN _and_
BYTES_BIG_ENDIAN, and only adjusts the bitpos when both are different.
A comment indicates that this is required because store_bit_field() handles
the bitpos even more inconsistently :-/

So ifcvt.c:noce_try_bitop (where we think the bug is) actually uses the same
method as rtlanal:nonzero_bits1, so either both are wrong or both are right.
From the documentation I would be inclined to think that both are wrong.  But
in that case I bet there are even more places which need a carefull look :-/
Comment 8 Michael Matz 2006-11-10 15:51:47 UTC
At least this patch fixes the bug at hand, but I'm sceptical if by chance or
for real:

Index: ifcvt.c
===================================================================
--- ifcvt.c     (revision 118648)
+++ ifcvt.c     (working copy)
@@ -1942,6 +1942,8 @@ noce_try_bitop (struct noce_if_info *if_
        return FALSE;
       bitnum = INTVAL (XEXP (cond, 2));
       mode = GET_MODE (x);
+      if (BITS_BIG_ENDIAN)
+       bitnum = GET_MODE_BITSIZE (mode) - 1 - bitnum;
       if (bitnum >= HOST_BITS_PER_WIDE_INT)
        return FALSE;
     }
Comment 9 Andreas Schwab 2006-11-10 16:27:28 UTC
(In reply to comment #7)
> combine.c:find_split_point: When the destination is a zero_extract, it does
> adjust the bitpos when BITS_BIG_ENDIAN, but it doesn't look at
> BYTES_BIT_ENDIAN at all.

I think this is ok, because when zero/sign_extact operates on memory, its mode must be a single-byte integer mode.
Comment 10 Michael Matz 2006-11-10 16:48:21 UTC
Yes, I think all uses in combine.c are okay.  In addition also the occurrence
in rtlanal.c is okay, as it doesn't use the bitpos, but the width in bits
to generate the mask, I just misread that part.  I now looked at all other
occurrences of ZERO_EXTRACT in gcc/*.c (not the machdep files or at other
languages, though), and I didn't find any other which were clearly wrong.
Comment 11 Andrew Pinski 2006-11-10 17:28:06 UTC
This is a regression because I know that code is new, Roger added it ....
Comment 12 Andrew Pinski 2006-11-10 17:31:05 UTC
PR 9814.
Comment 13 roger 2006-11-10 19:41:39 UTC
Grr.  Yep, Michael's BITS_BIG_ENDIAN test looks to be the correct thing.  The
effect of BITS_BIG_ENDIAN on sign_extract and zero_extract is documented in
rtl.texi.  Is anyone bootstrapping and regression testing this change or shall
I do the honors.  In fact, I'll pre-approve it.
Comment 14 Richard Biener 2006-11-11 12:16:42 UTC
I have started a bootstrap & regtest on {i386,ia64,ppc,ppc64,s390,s390x,x86_64}-linux on the 4.1 branch.
Comment 15 Richard Biener 2006-11-11 14:30:28 UTC
The patch fails bootstrap in stage2 for ppc (only).

stage2/xgcc -Bstage2/ -B/usr/powerpc64-suse-linux/bin/ -c   -O2
-fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -U_FORTIFY_SOURCE -DIN_GCC
-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic
-Wno-long-long -Wno-variadic-macros -Wold-style-definition
-Wmissing-format-attribute     -DHAVE_CONFIG_H -I. -I. -I../../gcc
-I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include
../../gcc/except.c -o except.o
../../gcc/df.c: In function 'df_analyze_subcfg':
../../gcc/df.c:2550: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugs.opensuse.org> for instructions.
make[2]: *** [df.o] Error 1

or

        stage2/xgcc -Bstage2/ -B/usr/powerpc64-suse-linux/bin/ -c   -O2
-fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -U_FORTIFY_SOURCE -DIN_GCC
-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic
-Wno-long-long -Wno-variadic-macros -Wold-style-definition
-Wmissing-format-attribute   -Wno-error  -DHAVE_CONFIG_H -I. -I. -I../../gcc
-I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include
-DSTANDARD_STARTFILE_PREFIX=\"../../../\"
-DSTANDARD_EXEC_PREFIX=\"/usr/lib/gcc/\"
-DSTANDARD_LIBEXEC_PREFIX=\"/usr/lib/gcc/\"
-DDEFAULT_TARGET_VERSION=\"4.1.2\"
-DDEFAULT_TARGET_MACHINE=\"powerpc64-suse-linux\"
-DSTANDARD_BINDIR_PREFIX=\"/usr/bin/\"
-DTOOLDIR_BASE_PREFIX=\"../../../../\"  `test "X${SHLIB_LINK}" = "X" || test
"yes" != "yes" || echo "-DENABLE_SHARED_LIBGCC"` `test "X${SHLIB_MULTILIB}"
= "X" || echo "-DNO_SHARED_LIBGCC_MULTILIB"` \
                -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include
-I../../gcc/../libcpp/include  ../../gcc/java/jvspec.c -o jvspec.o)
../../gcc/java/jvspec.c:61: warning: string length '834' is greater than the
length '509' ISO C89 compilers are required to support
../../gcc/java/jcf-write.c: In function 'write_classfile':
../../gcc/java/jcf-write.c:3558: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugs.opensuse.org> for instructions.
Comment 16 Andrew Pinski 2006-11-11 20:11:10 UTC
(In reply to comment #15)
> The patch fails bootstrap in stage2 for ppc (only).
This bootstraps just fine for me on the mainline with powerpc-darwin.
Comment 17 Andrew Pinski 2006-11-12 08:10:26 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > The patch fails bootstrap in stage2 for ppc (only).
> This bootstraps just fine for me on the mainline with powerpc-darwin.
And there were no regressions.
Comment 18 Roger Sayle 2006-11-13 00:42:05 UTC
Subject: Bug 29797

Author: sayle
Date: Mon Nov 13 00:41:53 2006
New Revision: 118740

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118740
Log:
2006-11-12  Michael Matz  <matz@suse.de>
	    Roger Sayle  <roger@eyesopen.com>

	PR rtl-optimization/29797
	* ifcvt.c (noce_try_bitop): Correct calculation of bitnum on
	BITS_BIG_ENDIAN targets.

	* gcc.c-torture/execute/pr29797-1.c: New test case.


Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ifcvt.c
    trunk/gcc/testsuite/ChangeLog

Comment 19 Richard Biener 2006-11-15 07:40:49 UTC
Bootstrapping / regtesting a backported patch (to 4.1) is fine on ppc.
Comment 20 Richard Biener 2006-11-15 08:12:11 UTC
Subject: Bug 29797

Author: rguenth
Date: Wed Nov 15 08:11:59 2006
New Revision: 118846

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118846
Log:
2006-11-15  Richard Guenther  <rguenther@suse.de>

	Backport from mainline:
	2006-11-12  Michael Matz  <matz@suse.de>
	            Roger Sayle  <roger@eyesopen.com>

        PR rtl-optimization/29797
        * ifcvt.c (noce_try_bitop): Correct calculation of bitnum on
        BITS_BIG_ENDIAN targets.

        * gcc.c-torture/execute/pr29797-1.c: New test case.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c
      - copied unchanged from r118740, trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/ifcvt.c
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

Comment 21 Richard Biener 2006-11-15 08:37:50 UTC
Subject: Bug 29797

Author: rguenth
Date: Wed Nov 15 08:37:38 2006
New Revision: 118847

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118847
Log:
2006-11-15  Richard Guenther  <rguenther@suse.de>

	Backport from mainline:
	2006-11-12  Michael Matz  <matz@suse.de>
        	    Roger Sayle  <roger@eyesopen.com>

        PR rtl-optimization/29797
        * ifcvt.c (noce_try_bitop): Correct calculation of bitnum on
        BITS_BIG_ENDIAN targets.

        * gcc.c-torture/execute/pr29797-1.c: New test case.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c
      - copied unchanged from r118740, trunk/gcc/testsuite/gcc.c-torture/execute/pr29797-1.c
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/ifcvt.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

Comment 22 Richard Biener 2006-11-15 08:37:59 UTC
And 4.2.

Fixed.