This testcase is miscompiled by g++ -O2 on ia64: ----------------------------------------------- typedef long int intptr_t; struct A { void* mem1; void* mem2; }; static int off1() { return (intptr_t)(&(((struct A*)0x0)->mem1) ); } static int off2() { return (intptr_t)(&(((struct A*)0x0)->mem2) ); } int func( int length) { int off = off2() + 8; if( length ) off = off1() + 4; if( 4 == (off & 4) ) off += 4; return off; } extern "C" void abort (void); int main(void) { if (func(0) == 0) abort(); } ----------------------------------------------- The two functions must be a bit complicated that not already the tree optimizers optimize everything to a constant. For the same reason __builtin_offsetof also hides the bug. The C compiler also optimizes those two functions too early, so the code as it goes into the combiner is already Okay. But with g++ you can see the error in combine. The problem is, that the following instructions: (insn 60 12 31 0 (set (reg/v:SI 341 [ off ]) (if_then_else:SI (ne (reg:BI 344) (const_int 0 [0x0])) (const_int 4 [0x4]) (const_int 16 [0x10]))) 249 {*cmovsi_internal} (expr_list:REG_DEAD (reg:BI 344) (insn 31 60 32 0 (set (reg:DI 345) (and:DI (subreg:DI (reg/v:SI 341 [ off ]) 0) (const_int 4 [0x4]))) 228 {anddi3} (insn 32 31 33 0 (set (reg:SI 346) (subreg:SI (reg:DI 345) 0)) 4 {*movsi_internal} (expr_list:REG_DEAD (reg:DI 345) (insn 33 32 34 0 (set (reg:BI 347) (eq:BI (reg:SI 346) (const_int 0 [0x0]))) 233 {*cmpsi_normal} (expr_list:REG_DEAD (reg:SI 346) in endeffect are combined into: (insn 33 32 34 0 (set (reg/v:SI 341 [ off ]) (if_then_else:SI (ne (reg:BI 344) (const_int 0 [0x0])) (const_int 4 [0x4]) (const_int 0 [0x0]))) 249 {*cmovsi_internal} Note especially that pseudo 341 was not dead in the first sequence and it's value is indeed used later (it's simply returned). Without that combination r341 will have values {4,16}, after that {4,0}. What the combiner tries to do, is to combine that conditional set and that (and 4), which indeed would correctly result in the above {4,0} conditional set. The problem is, that an instruction is created which modifies the r341, which it shouldn't. I debugged this a bit and it's not a simple problem of combine not noticing that r341 is necessary later (it recognizes that), but instead it's an RTL sharing problem in subst(). Reading the code a bit I think combine.c has multiple RTL sharing problems, but here is the issue at hand: try_combine() is called with these instructions in the call which finally does the invalid transformation: i1: (insn 60 12 31 0 (set (reg/v:SI 341 [ off ]) (if_then_else:SI (ne (reg:BI 344) (const_int 0 [0x0])) (const_int 4 [0x4]) (const_int 16 [0x10]))) i2: (insn 33 32 34 0 (set (reg:BI 347) (eq:BI (zero_extract:DI (subreg:DI (reg/v:SI 341 [ off ]) 0) (const_int 1 [0x1]) (const_int 2 [0x2])) (const_int 0 [0x0]))) i3: (jump_insn 34 33 36 0 (set (pc) (if_then_else (ne (reg:BI 347) (const_int 0 [0x0])) (label_ref 39) (pc))) 242 {*br_true} In the course of it's action to modify i3, it will somewhen have modified i1 (!) into: (insn 60 12 31 0 (set (reg/v:SI 341 [ off ]) (if_then_else:SI (ne (reg:BI 344) (const_int 0 [0x0])) (const_int 4 [0x4]) (const_int 0 [0x0]))) This is because i1src is given uncopied to subst(). Anyway, i1src starts as expected as (if_then_else:SI (ne (reg:BI 344) (const_int 0 [0x0])) (const_int 4 [0x4]) (const_int 16 [0x10])) Then, in line 2216 it will be used in a subst call: /* If we already got a failure, don't try to do more. Otherwise, try to substitute in I1 if we have it. */ if (i1 && GET_CODE (newpat) != CLOBBER) { /* Before we can do this substitution, we must redo the test done above (see detailed comments there) that ensures that I1DEST isn't mentioned in any SETs in NEWPAT that are field assignments. */ if (! combinable_i3pat (NULL_RTX, &newpat, i1dest, NULL_RTX, 0, (rtx*) 0)) { undo_all (); return 0; } n_occurrences = 0; subst_low_cuid = INSN_CUID (i1); newpat = subst (newpat, i1dest, i1src, 0, 0); substed_i1 = 1; } What combine tries to do, is also to fold i1 into the resulting instruction which it is in the process of construction. i1src still points right into the i1 pattern, i.e. is not a copy of it. The problem is, that in some situations subst() not only changes the pattern where it substitutes something into, but also the TO argument. I'm fairly sure that this was not as initially designed. But the code in subst does not prevent this. We have: subst (x, from, to) (walk over XEXPs of x) if (COMBINE_RTX_EQUAL_P (XEXP (x, i), from)) .... new = (unique_copy && n_occurrences ? copy_rtx (to) : to); SUBST (XEXP (x, i), new); x = combine_simplify_rtx (x); Note how the 'to' parameter is directly set into the 'x' expression. Even if unique_copy is set this will be prevented only for the second and further substitutions. This means, that now 'x' and 'i1' (from the caller try_combine) share the i1src expression. Now combine_simplify_rtx() comes (some levels up in this recursive invocation of subst()) and tramples all over x, in the process changing something inside the i1src, thereby also changing i1 invalidly. Boom. Now, I have looked at the other calls of subst() and I don't see how they in turn prevent such thing from happening. If they don't use the non-problematic pc_rtx they happily us i2src, i1src or 'from' (in simplify_if_then_else()) potentially changing i2, i1 or something entirely different behind the back. There are some copy_rtx calls but they don't seem to have to do with this special behaviour of 'subst'. Especially the copying of i2pat seems unrelated and actually is protecting against wanted side effects of subst.
(intptr_t)(&(((struct A*)0x0)->mem1) ); That is undefined.
> For the same reason __builtin_offsetof also hides the bug. Because it gets lowered correctly to a constant while "(intptr_t)(&(((struct A*)0x0)->mem1) )" is not a constant. Why is someone using that method of offsetof?
Created attachment 11508 [details] possible patch I was wrong about simplify_if_then_else, it uses pc_rtx in all subst calls. But that leaves three others which use their TO argument unprotected. The patch here solves the miscompilation for me. As of yet, untested otherwise.
Andrew, I know that this is undefined. Let us ignore this issue for this bug. I tried some time to come up with two nicer functions which still get inlined but are not optimized too early to hide the bug, but failed. Probably the testcase can be made using __builtin_offset, and then switching off certain tree-optimizer passes, but I didn't try. Debugging combine was painful enough :-)
If a combination is successful, we will delete i1 and i2, so it doesn't matter if they changed accidentally. If a combination fails, then we go through undobuf and revert all changes, so it doesn't matter if i1 or i2 changed accidentally. They will be restored to their original values when we are done. The only case where it matters is if added_sets_2 or added_sets_1 is true. In this case, we will re-add the patterns from i1 and/or i2 after the combination. So in this case, we need to make sure we still have the original patterns. We already make a copy of the i2 pattern for this purpose. i2pat is a copy, and is only used if added_sets_2 is true. So we just need to do the same for i1pat. I agree the comment before the i2pat is a bit confusing. It looks like the copy was originally added to fix an obscure problem, but it also happens to fix this one too. It looks like there is an unnecessary gen_rtx_SET call, as i2pat will only be used if added_sets_2 is true. So the code setting i2pat should be moved inside the added_sets_2 if statement. The new i1pat code should work the same way. The actual modification of the if_then_else rtl happens inside force_to_mode, as called from simplify_and_const_int. See the uses of SUBST in the if_then_else case in force_to_mode. This problem could be fixed if we generated new rtl here instead of using SUBST, but I don't think that is helpful, as there are other places that also call SUBST. It is safer to make the i1pat copy.
Please indicate the version(s) of the compiler, whether it's a regression, etc.
Subject: Bug 27761 Author: jakub Date: Tue Dec 12 15:03:39 2006 New Revision: 119785 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119785 Log: PR rtl-optimization/27761 * combine.c (try_combine): Don't create a useless garbage SET if PATTERN (i2) is a PARALLEL. If added_sets_1, save PATTERN (i1) resp. SET from i1src to i1dest in i1pat and use it to prevent accidental modification of i1src. Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c
Subject: Bug 27761 Author: jakub Date: Tue Dec 12 15:05:08 2006 New Revision: 119786 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119786 Log: PR rtl-optimization/27761 * combine.c (try_combine): Don't create a useless garbage SET if PATTERN (i2) is a PARALLEL. If added_sets_1, save PATTERN (i1) resp. SET from i1src to i1dest in i1pat and use it to prevent accidental modification of i1src. Modified: branches/gcc-4_2-branch/gcc/ChangeLog branches/gcc-4_2-branch/gcc/combine.c
Subject: Bug 27761 Author: jakub Date: Tue Dec 12 15:07:23 2006 New Revision: 119787 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119787 Log: PR rtl-optimization/27761 * combine.c (try_combine): Don't create a useless garbage SET if PATTERN (i2) is a PARALLEL. If added_sets_1, save PATTERN (i1) resp. SET from i1src to i1dest in i1pat and use it to prevent accidental modification of i1src. Modified: branches/gcc-4_1-branch/gcc/ChangeLog branches/gcc-4_1-branch/gcc/combine.c
Fixed.