This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFA:] Fix PR target/18701, address adjustment error with paradoxicalmem subregs. (Disallow them, I say!)


Actually small and almost trivial patches below.  Analysis
follows.  For this bug, the cut-down version of
gcc.c-torture/execute/20040218-1.c looks like this:

short int yb (short int *y) __attribute__ ((__noinline__));
short int yb (short int *y)
{
  short int xx = *y & 255;
  return xx + y[1];
}
int main (void)
{
  short int yw[] = {-1, 16000};

  if (yb (yw) != 16255)
    abort ();
  exit (0);
}

Compiled with -O2 for mmix-knuth-mmixware, in 16.life, we have
this dump for yb, modulo non-INSN_P:s (pruned for clarity):

(insn 6 9 7 0 x.i:2 (set (reg/v/f:DI 269 [ y ])
        (reg:DI 0 $0 [ y ])) 3 {movdi} (nil)
    (expr_list:REG_DEAD (reg:DI 0 $0 [ y ])
        (nil)))

(insn 11 7 12 0 x.i:2 (set (reg:HI 271)
        (mem:HI (reg/v/f:DI 269 [ y ]) [3 S2 A16])) 1 {movhi} (insn_list:REG_DEP_TRUE 6 (nil))
    (nil))

(insn 12 11 13 0 x.i:2 (set (reg:DI 272)
        (and:DI (subreg:DI (reg:HI 271) 0)
            (const_int 255 [0xff]))) 15 {anddi3} (insn_list:REG_DEP_TRUE 11 (nil))
    (expr_list:REG_DEAD (reg:HI 271)
        (nil)))

(insn 13 12 15 0 x.i:2 (set (reg:HI 273)
        (mem:HI (plus:DI (reg/v/f:DI 269 [ y ])
                (const_int 2 [0x2])) [3 S2 A16])) 1 {movhi} (nil)
    (expr_list:REG_DEAD (reg/v/f:DI 269 [ y ])
        (nil)))

(insn 15 13 16 0 x.i:2 (set (reg:DI 275)
        (plus:DI (reg:DI 272)
            (subreg:DI (reg:HI 273) 0))) 7 {adddi3} (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_TRUE 13 (nil)))
    (expr_list:REG_DEAD (reg:DI 272)
        (expr_list:REG_DEAD (reg:HI 273)
            (nil))))

(insn 16 15 17 0 x.i:2 (set (reg:HI 276)
        (subreg:HI (reg:DI 275) 6)) 1 {movhi} (insn_list:REG_DEP_TRUE 15 (nil))
    (expr_list:REG_DEAD (reg:DI 275)
        (nil)))

(insn 17 16 18 0 x.i:2 (set (reg:SI 277)
        (subreg:SI (reg:HI 276) 0)) 2 {movsi} (insn_list:REG_DEP_TRUE 16 (nil))
    (expr_list:REG_DEAD (reg:HI 276)
        (nil)))

(insn 18 17 20 0 x.i:2 (set (reg:DI 278)
        (ashift:DI (subreg:DI (reg:SI 277) 0)
            (const_int 16 [0x10]))) 22 {ashldi3} (insn_list:REG_DEP_TRUE 17 (nil))
    (expr_list:REG_DEAD (reg:SI 277)
        (nil)))

(insn 20 18 21 0 x.i:2 (set (reg:DI 279)
        (ashift:DI (reg:DI 278)
            (const_int 32 [0x20]))) 22 {ashldi3} (insn_list:REG_DEP_TRUE 18 (nil))
    (expr_list:REG_DEAD (reg:DI 278)
        (nil)))

(insn 21 20 22 0 x.i:2 (set (reg:DI 279)
        (ashiftrt:DI (reg:DI 279)
            (const_int 32 [0x20]))) 23 {ashrdi3} (insn_list:REG_DEP_TRUE 20 (nil))
    (nil))

(insn 22 21 27 0 x.i:2 (set (reg:DI 281)
        (ashiftrt:DI (reg:DI 279)
            (const_int 16 [0x10]))) 23 {ashrdi3} (insn_list:REG_DEP_TRUE 21 (nil))
    (expr_list:REG_DEAD (reg:DI 279)
        (nil)))

(insn 30 28 36 0 x.i:5 (set (reg/i:SI 0 $0 [ <result> ])
        (subreg:SI (reg:DI 281) 4)) 2 {movsi} (insn_list:REG_DEP_TRUE 22 (nil))
    (expr_list:REG_DEAD (reg:DI 281)
        (nil)))

(insn 36 30 0 0 x.i:5 (use (reg/i:SI 0 $0 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 30 (nil))
    (nil))

Presumably the SImode comes from implicit conversion to int,
which looks a bit strange, because we already have DImode
operations and HImode truncation.  Note the two memory accesses,
corresponding to y[0] and y[1] (with y[] in HImode).

In 17.combine, we then have this, with non-INSN_P:s pruned:

(insn 6 9 7 0 x.i:2 (set (reg/v/f:DI 269 [ y ])
        (reg:DI 0 $0 [ y ])) 3 {movdi} (nil)
    (expr_list:REG_DEAD (reg:DI 0 $0 [ y ])
        (nil)))

(insn 11 7 12 0 x.i:2 (set (reg:HI 271)
        (mem:HI (reg/v/f:DI 269 [ y ]) [3 S2 A16])) 1 {movhi} (insn_list:REG_DEP_TRUE 6 (nil))
    (nil))

(insn 12 11 13 0 x.i:2 (set (reg:DI 272)
        (and:DI (subreg:DI (reg:HI 271) 0)
            (const_int 255 [0xff]))) 15 {anddi3} (insn_list:REG_DEP_TRUE 11 (nil))
    (expr_list:REG_DEAD (reg:HI 271)
        (nil)))

(insn 17 16 18 0 x.i:2 (set (subreg:DI (reg:SI 277) 0)
        (plus:DI (reg:DI 272)
            (subreg:DI (mem:SI (plus:DI (reg/v/f:DI 269 [ y ])
                        (const_int 6 [0x6])) [3 S4 A16]) 0))) 7 {adddi3} (insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg/v/f:DI 269 [ y ])
        (expr_list:REG_DEAD (reg:DI 272)
            (nil))))

(insn 20 18 21 0 x.i:2 (set (reg:DI 279)
        (ashift:DI (subreg:DI (reg:SI 277) 0)
            (const_int 48 [0x30]))) 22 {ashldi3} (insn_list:REG_DEP_TRUE 17 (nil))
    (expr_list:REG_DEAD (reg:SI 277)
        (nil)))

(insn 30 28 36 0 x.i:5 (set (reg:DI 0 $0 [ <result> ])
        (ashiftrt:DI (reg:DI 279)
            (const_int 48 [0x30]))) 23 {ashrdi3} (insn_list:REG_DEP_TRUE 20 (nil))
    (expr_list:REG_DEAD (reg:DI 279)
        (nil)))

(insn 36 30 0 0 x.i:5 (use (reg/i:SI 0 $0 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 30 (nil))
    (nil))

(Most pesky shifts gone.  Nice.)  Here, we have the same y[0]
access, but also a *(int *) &(y + 3) (and two insns to
presumably shift the bits right) instead of the y[1].  Oops!
Not only is the address miscalculated, but it's needlessly
accessed in a wider mode.  A pointer is passed and GCC shouldn't
arbitrarily extend the access beyond the end of the object,
right?  It can do that for stack accesses, but here, it doesn't
know that the pointer is safe, because it's a parameter to the
function, and GCC knows nothing about the origin.  I don't think
pointers must be volatile for pointer accesses not to be
*widened* for arbitrary pointers.

Two erroneous transformations are done in combine (try_combine):
y1 = (nil)
y2 =
(insn 13 12 15 0 (set (reg:HI 273)
        (mem:HI (plus:DI (reg/v/f:DI 269 [ y ])
                (const_int 2 [0x2])) [3 S2 A16])) 1 {movhi} (nil)
    (expr_list:REG_DEAD (reg/v/f:DI 269 [ y ])
        (nil)))
y3 =
(insn 15 13 16 0 (set (reg:DI 275)
        (plus:DI (reg:DI 272)
            (subreg:DI (reg:HI 273) 0))) 7 {adddi3} (insn_list:REG_DEP_TRUE 12
(insn_list:REG_DEP_TRUE 13 (nil)))
    (expr_list:REG_DEAD (reg:DI 272)
        (expr_list:REG_DEAD (reg:HI 273)
            (nil))))

into:
(insn 15 13 16 0 (set (reg:DI 275)
        (plus:DI (reg:DI 272)
            (subreg:DI (mem:HI (plus:DI (reg/v/f:DI 269 [ y ])
                        (const_int 2 [0x2])) [3 S2 A16]) 0))) 7 {adddi3}
(insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg/v/f:DI 269 [ y ])
        (expr_list:REG_DEAD (reg:DI 272)
            (nil))))

The paradoxical subreg here is allowed by general_operand before reload is
completed, for targets without scheduling.  Highly suspect: it can't be
valid, because the address isn't valid for wider than HImode and the
offset is wrong if that'd actually be a DImode access.  The only way it
can be valid is if the subreg is supposed to be treated like a register,
with absolutely no tampering with the address or access.  But that's not
true, because we later have this transformation (again, try_combine):
i1 = (nil)
i2 =
(insn 16 15 17 0 (set (subreg:DI (reg:HI 276) 0)
        (plus:DI (reg:DI 272)
            (subreg:DI (mem:HI (plus:DI (reg/v/f:DI 269 [ y ])
                        (const_int 2 [0x2])) [3 S2 A16]) 0))) 7 {adddi3}
(insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg:DI 272)
        (expr_list:REG_DEAD (reg/v/f:DI 269 [ y ])
            (nil))))
i3 =
(insn 17 16 18 0 (set (reg:SI 277)
        (subreg:SI (reg:HI 276) 0)) 2 {movsi} (insn_list:REG_DEP_TRUE 16 (nil))
    (expr_list:REG_DEAD (reg:HI 276)
        (nil)))

into (insn 17) in 17.combine.
The problem seems to be with combine_simplify_rtx, which further mutilates:
(subreg:SI (subreg:HI (plus:DI (reg:DI 272)
            (subreg:DI (mem:HI (plus:DI (reg/v/f:DI 269 [ y ])
                        (const_int 2 [0x2])) [3 S2 A16]) 0)) 6) 0)
into:
(plus:DI (reg:DI 272)
    (subreg:DI (mem:SI (plus:DI (reg/v/f:DI 269 [ y ])
                (const_int 6 [0x6])) [3 S4 A16]) 0))

I'm submitting two patches: one patch to correct the address
calculation.  That works by itself.  I also submit another to
disallow paradoxical (i.e. widening) mem subregs in
combine_simplify_rtx similar to volatile accesses and
mode-dependent addresses.  That works too and seems valid given
the if-statement I patched, but apparently the bug fixed in the
first patch is the cause of the actual widened access seen above
(SImode instead of HImode).

I would like to submit a third patch.  I suggest to get rid of
the difference in handling of paradoxical subregs with regards
to INSN_SCHEDULING (actually, make the !INSN_SCHEDULING the
same).  Though I suspect it's too late in stage 3 for that last
one.  All primary and secondary targets are INSN_SCHEDULING, so
perhaps that's still ok.  For an example of what's going on, see
the #ifdef in recog.c:general_operand as mentioned above.  This
is what I moaned about in
<URL:http://gcc.gnu.org/ml/gcc/2004-12/msg00818.html> but got no
comments yet.

For each patch individually: built cross to mmix-knuth-mmixware,
fixing all test-cases in PR target/18701, no regressions.
No regressions on host i686-pc-linux-gnu (FC2).

Ok to commit?

First, with extra context for ease of review:

	PR target/18701
	* combine.c (gen_lowpart_for_combine): Re-read the mode of
	possibly-subreg-stripped x before doing offset calculations for
	address adjustment.

Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.463
diff -p -c -c -2 -5 -p -r1.463 combine.c
*** combine.c	15 Dec 2004 01:02:55 -0000	1.463
--- combine.c	20 Dec 2004 13:38:45 -0000
*************** gen_lowpart_for_combine (enum machine_mo
*** 9351,9400 ****
--- 9351,9407 ----

    /* X might be a paradoxical (subreg (mem)).  In that case, gen_lowpart
       won't know what to do.  So we will strip off the SUBREG here and
       process normally.  */
    if (GET_CODE (x) == SUBREG && MEM_P (SUBREG_REG (x)))
      {
        x = SUBREG_REG (x);
        if (GET_MODE (x) == omode)
  	return x;
      }

    result = gen_lowpart_common (omode, x);

  #ifdef CANNOT_CHANGE_MODE_CLASS
    if (result != 0 && GET_CODE (result) == SUBREG)
      record_subregs_of_mode (result);
  #endif

    if (result)
      return result;

    if (MEM_P (x))
      {
        int offset = 0;

+       /* It's likely we had a paradoxical subreg stripped off before
+ 	 the gen_lowpart_common call above, in which case we have to
+ 	 re-read the mode of x to get the offset for the new address
+ 	 right.  */
+       imode = GET_MODE (x);
+       isize = GET_MODE_SIZE (imode);
+
        /* Refuse to work on a volatile memory ref or one with a mode-dependent
  	 address.  */
        if (MEM_VOLATILE_P (x) || mode_dependent_address_p (XEXP (x, 0)))
  	goto fail;

        /* If we want to refer to something bigger than the original memref,
  	 generate a paradoxical subreg instead.  That will force a reload
  	 of the original memref X.  */
        if (isize < osize)
  	return gen_rtx_SUBREG (omode, x, 0);

        if (WORDS_BIG_ENDIAN)
  	offset = MAX (isize, UNITS_PER_WORD) - MAX (osize, UNITS_PER_WORD);

        /* Adjust the address so that the address-after-the-data is unchanged. */
        if (BYTES_BIG_ENDIAN)
  	offset -= MIN (UNITS_PER_WORD, osize) - MIN (UNITS_PER_WORD, isize);

        return adjust_address_nv (x, omode, offset);
      }

    /* If X is a comparison operator, rewrite it in a new mode.  This
       probably won't match, but may allow further simplifications.  */
    else if (COMPARISON_P (x))
      return gen_rtx_fmt_ee (GET_CODE (x), omode, XEXP (x, 0), XEXP (x, 1));


Second, admittedly a bit simplistic, but seems valid looking at the other
things disallowed in that context:

	PR target/18701
	* combine.c (combine_simplify_rtx): Do not allow paradoxical
	subregs of MEM.

Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.463
diff -p -c -u -p -r1.463 combine.c
--- combine.c	15 Dec 2004 01:02:55 -0000	1.463
+++ combine.c	20 Dec 2004 05:23:55 -0000
@@ -3983,10 +3983,14 @@ combine_simplify_rtx (rtx x, enum machin
       }

       /* Don't change the mode of the MEM if that would change the meaning
-	 of the address.  */
+	 of the address.  Similarly, don't allow widening, as that may
+	 access memory outside the defined object or using an address
+	 that is invalid for a wider mode.  */
       if (MEM_P (SUBREG_REG (x))
 	  && (MEM_VOLATILE_P (SUBREG_REG (x))
-	      || mode_dependent_address_p (XEXP (SUBREG_REG (x), 0))))
+	      || mode_dependent_address_p (XEXP (SUBREG_REG (x), 0))
+	      || (GET_MODE_SIZE (mode)
+		  > GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))))))
 	return gen_rtx_CLOBBER (mode, const0_rtx);

       /* Note that we cannot do any narrowing for non-constants since

brgds, H-P
PS. If paradoxical subregs are outlawed, only outlaws will have paradoxical subregs.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]