Bug 27761 - combine miscompiles
Summary: combine miscompiles
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.1.1
: P3 normal
Target Milestone: 4.1.2
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-05-24 18:16 UTC by Michael Matz
Modified: 2006-12-13 07:06 UTC (History)
3 users (show)

See Also:
Host: ia64-linux-gnu
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-06-10 02:49:49


Attachments
possible patch (425 bytes, patch)
2006-05-24 18:22 UTC, Michael Matz
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Matz 2006-05-24 18:16:49 UTC
This testcase is miscompiled by g++ -O2 on ia64:

-----------------------------------------------
typedef long int intptr_t;
struct A {
  void* mem1;
  void* mem2;
};
static int off1() { return (intptr_t)(&(((struct A*)0x0)->mem1) ); }
static int off2() { return (intptr_t)(&(((struct A*)0x0)->mem2) ); }
int func( int length)
{
    int off = off2() + 8;
    if( length )
      off = off1() + 4;
    if( 4 == (off & 4) )
      off += 4;
    return off;
}
extern "C" void abort (void);
int main(void) {
 if (func(0) == 0)
   abort();
}
-----------------------------------------------

The two functions must be a bit complicated that not already the tree
optimizers optimize everything to a constant.  For the same reason
__builtin_offsetof also hides the bug.  The C compiler also optimizes those
two functions too early, so the code as it goes into the combiner is
already Okay.  But with g++ you can see the error in combine.  The problem
is, that the following instructions:

(insn 60 12 31 0 (set (reg/v:SI 341 [ off ])
        (if_then_else:SI (ne (reg:BI 344)
                (const_int 0 [0x0]))
            (const_int 4 [0x4])
            (const_int 16 [0x10]))) 249 {*cmovsi_internal}
        (expr_list:REG_DEAD (reg:BI 344)

(insn 31 60 32 0 (set (reg:DI 345)
        (and:DI (subreg:DI (reg/v:SI 341 [ off ]) 0)
            (const_int 4 [0x4]))) 228 {anddi3} 

(insn 32 31 33 0 (set (reg:SI 346)
        (subreg:SI (reg:DI 345) 0)) 4 {*movsi_internal}
    (expr_list:REG_DEAD (reg:DI 345)

(insn 33 32 34 0 (set (reg:BI 347)
        (eq:BI (reg:SI 346)
            (const_int 0 [0x0]))) 233 {*cmpsi_normal}
    (expr_list:REG_DEAD (reg:SI 346)

in endeffect are combined into:

(insn 33 32 34 0 (set (reg/v:SI 341 [ off ])
        (if_then_else:SI (ne (reg:BI 344)
                (const_int 0 [0x0]))
            (const_int 4 [0x4])
            (const_int 0 [0x0]))) 249 {*cmovsi_internal} 

Note especially that pseudo 341 was not dead in the first sequence and it's
value is indeed used later (it's simply returned).  Without that combination
r341 will have values {4,16}, after that {4,0}.  What the combiner
tries to do, is to combine that conditional set and that (and 4), which indeed
would correctly result in the above {4,0} conditional set.

The problem is, that an instruction is created which modifies the r341,
which it shouldn't.  I debugged this a bit and it's not a simple problem
of combine not noticing that r341 is necessary later (it recognizes that),
but instead it's an RTL sharing problem in subst().

Reading the code a bit I think combine.c has multiple RTL sharing problems,
but here is the issue at hand:

try_combine() is called with these instructions in the call which finally
does the invalid transformation:
i1: (insn 60 12 31 0 (set (reg/v:SI 341 [ off ])
        (if_then_else:SI (ne (reg:BI 344)
                (const_int 0 [0x0]))
            (const_int 4 [0x4])
            (const_int 16 [0x10])))
i2: (insn 33 32 34 0 (set (reg:BI 347)
        (eq:BI (zero_extract:DI (subreg:DI (reg/v:SI 341 [ off ]) 0)
                (const_int 1 [0x1])
                (const_int 2 [0x2]))
            (const_int 0 [0x0])))
i3: (jump_insn 34 33 36 0 (set (pc)
        (if_then_else (ne (reg:BI 347)
                (const_int 0 [0x0]))
            (label_ref 39)
            (pc))) 242 {*br_true}

In the course of it's action to modify i3, it will somewhen have modified
i1 (!) into:

(insn 60 12 31 0 (set (reg/v:SI 341 [ off ])
        (if_then_else:SI (ne (reg:BI 344)
                (const_int 0 [0x0]))
            (const_int 4 [0x4])
            (const_int 0 [0x0])))

This is because i1src is given uncopied to subst().  Anyway, i1src
starts as expected as 
  (if_then_else:SI (ne (reg:BI 344)
        (const_int 0 [0x0]))
    (const_int 4 [0x4])
    (const_int 16 [0x10]))
Then, in line 2216 it will be used in a subst call:

  /* If we already got a failure, don't try to do more.  Otherwise,
     try to substitute in I1 if we have it.  */

  if (i1 && GET_CODE (newpat) != CLOBBER)
    {
      /* Before we can do this substitution, we must redo the test done
         above (see detailed comments there) that ensures  that I1DEST
         isn't mentioned in any SETs in NEWPAT that are field assignments.  */

      if (! combinable_i3pat (NULL_RTX, &newpat, i1dest, NULL_RTX,
                              0, (rtx*) 0))
        {
          undo_all ();
          return 0;
        }

      n_occurrences = 0;
      subst_low_cuid = INSN_CUID (i1);
      newpat = subst (newpat, i1dest, i1src, 0, 0);
      substed_i1 = 1;
    }

What combine tries to do, is also to fold i1 into the resulting instruction
which it is in the process of construction.  i1src still points right into
the i1 pattern, i.e. is not a copy of it.  The problem is, that in some
situations subst() not only changes the pattern where it substitutes something
into, but also the TO argument.  I'm fairly sure that this was not as
initially designed.  But the code in subst does not prevent this.  We have:

subst (x, from, to)
   (walk over XEXPs of x)
      if (COMBINE_RTX_EQUAL_P (XEXP (x, i), from))
         ....
         new = (unique_copy && n_occurrences ? copy_rtx (to) : to);
      SUBST (XEXP (x, i), new);
   x = combine_simplify_rtx (x);

Note how the 'to' parameter is directly set into the 'x' expression.  Even
if unique_copy is set this will be prevented only for the second and further
substitutions.  This means, that now 'x' and 'i1' (from the caller
try_combine) share the i1src expression.  Now combine_simplify_rtx() comes
(some levels up in this recursive invocation of subst())
and tramples all over x, in the process changing something inside the
i1src, thereby also changing i1 invalidly.  Boom.

Now, I have looked at the other calls of subst() and I don't see how they
in turn prevent such thing from happening.  If they don't use the
non-problematic pc_rtx they happily us i2src, i1src or 'from' (in simplify_if_then_else()) potentially changing i2, i1 or something entirely
different behind the back.

There are some copy_rtx calls but they don't seem to have to do with this
special behaviour of 'subst'.  Especially the copying of i2pat seems unrelated
and actually is protecting against wanted side effects of subst.
Comment 1 Andrew Pinski 2006-05-24 18:17:52 UTC
(intptr_t)(&(((struct A*)0x0)->mem1) );

That is undefined.
Comment 2 Andrew Pinski 2006-05-24 18:19:23 UTC
> For the same reason __builtin_offsetof also hides the bug.  
Because it gets lowered correctly to a constant while 
"(intptr_t)(&(((struct A*)0x0)->mem1) )" is not a constant.

Why is someone using that method of offsetof?
Comment 3 Michael Matz 2006-05-24 18:22:11 UTC
Created attachment 11508 [details]
possible patch

I was wrong about simplify_if_then_else, it uses pc_rtx in all subst calls.
But that leaves three others which use their TO argument unprotected.  The
patch here solves the miscompilation for me.  As of yet, untested otherwise.
Comment 4 Michael Matz 2006-05-24 18:26:11 UTC
Andrew, I know that this is undefined.  Let us ignore this issue for this bug.
I tried some time to come up with two nicer functions which still get inlined
but are not optimized too early to hide the bug, but failed.  Probably
the testcase can be made using __builtin_offset, and then switching off
certain tree-optimizer passes, but I didn't try.  Debugging combine was
painful enough :-)
Comment 5 Jim Wilson 2006-06-10 02:49:49 UTC
If a combination is successful, we will delete i1 and i2, so it doesn't matter if they changed accidentally.

If a combination fails, then we go through undobuf and revert all changes, so it doesn't matter if i1 or i2 changed accidentally.  They will be restored to their original values when we are done.

The only case where it matters is if added_sets_2 or added_sets_1 is true.  In this case, we will re-add the patterns from i1 and/or i2 after the combination.  So in this case, we need to make sure we still have the original patterns.  We already make a copy of the i2 pattern for this purpose.  i2pat is a copy, and is only used if added_sets_2 is true.  So we just need to do the same for i1pat.

I agree the comment before the i2pat is a bit confusing.  It looks like the copy was originally added to fix an obscure problem, but it also happens to fix this one too.

It looks like there is an unnecessary gen_rtx_SET call, as i2pat will only be used if added_sets_2 is true.  So the code setting i2pat should be moved inside the added_sets_2 if statement.  The new i1pat code should work the same way.

The actual modification of the if_then_else rtl happens inside force_to_mode, as called from simplify_and_const_int.  See the uses of SUBST in the if_then_else case in force_to_mode.  This problem could be fixed if we generated new rtl here instead of using SUBST, but I don't think that is helpful, as there are other places that also call SUBST.  It is safer to make the i1pat copy.
Comment 6 Eric Botcazou 2006-09-13 08:44:15 UTC
Please indicate the version(s) of the compiler, whether it's a regression, etc.
Comment 7 Jakub Jelinek 2006-12-12 15:03:55 UTC
Subject: Bug 27761

Author: jakub
Date: Tue Dec 12 15:03:39 2006
New Revision: 119785

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119785
Log:
	PR rtl-optimization/27761
	* combine.c (try_combine): Don't create a useless garbage SET
	if PATTERN (i2) is a PARALLEL.  If added_sets_1, save
	PATTERN (i1) resp. SET from i1src to i1dest in i1pat
	and use it to prevent accidental modification of i1src.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c

Comment 8 Jakub Jelinek 2006-12-12 15:05:31 UTC
Subject: Bug 27761

Author: jakub
Date: Tue Dec 12 15:05:08 2006
New Revision: 119786

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119786
Log:
	PR rtl-optimization/27761
	* combine.c (try_combine): Don't create a useless garbage SET
	if PATTERN (i2) is a PARALLEL.  If added_sets_1, save
	PATTERN (i1) resp. SET from i1src to i1dest in i1pat
	and use it to prevent accidental modification of i1src.

Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/combine.c

Comment 9 Jakub Jelinek 2006-12-12 15:07:44 UTC
Subject: Bug 27761

Author: jakub
Date: Tue Dec 12 15:07:23 2006
New Revision: 119787

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119787
Log:
	PR rtl-optimization/27761
	* combine.c (try_combine): Don't create a useless garbage SET
	if PATTERN (i2) is a PARALLEL.  If added_sets_1, save
	PATTERN (i1) resp. SET from i1src to i1dest in i1pat
	and use it to prevent accidental modification of i1src.

Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/combine.c

Comment 10 Andrew Pinski 2006-12-13 07:06:03 UTC
Fixed.