RFA: Don't ICE on out-of-range array accesses

Richard Sandiford rsandifo@redhat.com
Wed Jan 26 23:14:00 GMT 2005


This patch fixes the MIPS n32 gcc.dg/noncompile/920507-1.c failures.
The testcase used to look like this before tree-ssa:

    void
    x(void)
    {
     register int *a asm("unknown_register");  /* { dg-error "invalid register" } */
     int *v[1] = {a};
    }

but tree-ssa changed it as follows (presumably to avoid a warning):

    int *
    x(void)
    {
     register int *a asm("unknown_register");  /* { dg-error "invalid register" } */
     int *v[1] = {a};
     return v[1];
    }

Notice the out-of-bounds index in the return statement.  I'm not sure
whether this was deliberate or not, but it does expose a 4.0 regression,
since even the modified testcase passes with 3.4.

The problem is that elements of register arrays are accessed using
store_bit_field and extract_bit_field, but neither is guaranteed to
handle out-of-bounds register accesses in a graceful way.  The patch
below adds early-exit paths for this case.

Note that the problem doesn't trigger for the above testcase on most
targets because pointers are usually word-sized.  The out-of-range
bitfield will then start on a word boundary and will itself be
word-sized, so extract_bit_field will handle it with:

  if (((bitsize >= BITS_PER_WORD && bitsize == GET_MODE_BITSIZE (mode)
	&& bitpos % BITS_PER_WORD == 0)
       || [...]
      && ((!MEM_P (op0)
	   && TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (mode),
				     GET_MODE_BITSIZE (GET_MODE (op0)))
	   && GET_MODE_SIZE (mode1) != 0
	   && byte_offset % GET_MODE_SIZE (mode1) == 0)
          || [...]))
    {
      [...]
      return op0;
    }

The function therefore returns op0 (which is OK, since the whole thing
has undefined behaviour).

The extraction ICE happens when accessing subword fields, so the
testcase above only triggers it when pointers are smaller than a word.
However, the testcase below fails on all MIPS targets I've tried,
and I imagine it would fail for other backends too.

The trickiest thing about the patch is making sure that SUBREGs are
handled correctly.  It's tricky not because the right behaviour is
hard to fathom, but because the current SUBREG code is very odd...

  - First of all, store_bit_field starts with:

      while (GET_CODE (op0) == SUBREG)
        {
          /* [...] */
          offset += (SUBREG_BYTE (op0) / UNITS_PER_WORD);
          /* We used to adjust BITPOS here, but now we do the whole adjustment
             right after the loop.  */
          op0 = SUBREG_REG (op0);
        }

    but as far as I can tell, we do no such thing.  The code above sets op0
    to the SUBREG_REG and nothing else in the function ever checks whether
    the original op0 (i.e., str_rtx) was a SUBREG or not.

    And besides, _what_ loop?  Lots of cases aren't handled with a
    loop at all.

    I'm sure this all makes sense if you do the archaeology, but it
    looks very bogus now.

  - Each function uses three variables to store the bit position.
    The original, caller-supplied bit position is "bitnum", from which
    we get "bitpos" (== bitnum % unit) and "offset" (== bitnum / unit).

    However, the SUBREG handling in store_bit_field only adjusts "offset"
    and the extract_bit_field code only adjusts "offset" and "bitpos".
    Both versions leave "bitnum" alone.  This seems wrong since later
    code does use "bitnum" as well as the two variables derived from it.

The patch tries to clean this up by making the SUBREG code adjust
"bitnum" and by moving the calculation of "bitpos" and "offset" later
in the function, just before the point at which they are used.

Bootstrapped & regression tested on mips-sgi-irix6.5.  OK to install?

Richard


	* expmed.c (store_bit_field): Make the SUBREG code adjust bitnum.
	Set bitpos and offset later in the function.  Do nothing if the
	target is a register and if the bitfield lies completely outside
	that register.
	(extract_bit_field): Make the same SUBREG, bitpos and offset changes
	here.  Return an uninitialised register if the source value is stored
	in a register and the bitfield lies completely outside that register.

Index: expmed.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expmed.c,v
retrieving revision 1.216
diff -u -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.216 expmed.c
--- expmed.c	24 Jan 2005 11:59:40 -0000	1.216
+++ expmed.c	26 Jan 2005 17:16:54 -0000
@@ -337,8 +337,7 @@ store_bit_field (rtx str_rtx, unsigned H
 {
   unsigned int unit
     = (MEM_P (str_rtx)) ? BITS_PER_UNIT : BITS_PER_WORD;
-  unsigned HOST_WIDE_INT offset = bitnum / unit;
-  unsigned HOST_WIDE_INT bitpos = bitnum % unit;
+  unsigned HOST_WIDE_INT offset, bitpos;
   rtx op0 = str_rtx;
   int byte_offset;
   rtx orig_value;
@@ -352,12 +351,16 @@ store_bit_field (rtx str_rtx, unsigned H
 	 meaningful at a much higher level; when structures are copied
 	 between memory and regs, the higher-numbered regs
 	 always get higher addresses.  */
-      offset += (SUBREG_BYTE (op0) / UNITS_PER_WORD);
-      /* We used to adjust BITPOS here, but now we do the whole adjustment
-	 right after the loop.  */
+      bitnum += SUBREG_BYTE (op0) * BITS_PER_UNIT;
       op0 = SUBREG_REG (op0);
     }
 
+  /* No action is needed if the target is a register and if the field
+     lies completely outside that register.  This can occur if the source
+     code contains an out-of-bounds access to a small array.  */
+  if (REG_P (op0) && bitnum >= GET_MODE_BITSIZE (GET_MODE (op0)))
+    return value;
+
   /* Use vec_set patterns for inserting parts of vectors whenever
      available.  */
   if (VECTOR_MODE_P (GET_MODE (op0))
@@ -419,6 +422,8 @@ store_bit_field (rtx str_rtx, unsigned H
      done with a simple store.  For targets that support fast unaligned
      memory, any naturally sized, unit aligned field can be done directly.  */
 
+  offset = bitnum / unit;
+  bitpos = bitnum % unit;
   byte_offset = (bitnum % BITS_PER_WORD) / BITS_PER_UNIT
                 + (offset * UNITS_PER_WORD);
 
@@ -1064,8 +1069,7 @@ extract_bit_field (rtx str_rtx, unsigned
 {
   unsigned int unit
     = (MEM_P (str_rtx)) ? BITS_PER_UNIT : BITS_PER_WORD;
-  unsigned HOST_WIDE_INT offset = bitnum / unit;
-  unsigned HOST_WIDE_INT bitpos = bitnum % unit;
+  unsigned HOST_WIDE_INT offset, bitpos;
   rtx op0 = str_rtx;
   rtx spec_target = target;
   rtx spec_target_subreg = 0;
@@ -1080,15 +1084,16 @@ extract_bit_field (rtx str_rtx, unsigned
 
   while (GET_CODE (op0) == SUBREG)
     {
-      bitpos += SUBREG_BYTE (op0) * BITS_PER_UNIT;
-      if (bitpos > unit)
-	{
-	  offset += (bitpos / unit);
-	  bitpos %= unit;
-	}
+      bitnum += SUBREG_BYTE (op0) * BITS_PER_UNIT;
       op0 = SUBREG_REG (op0);
     }
 
+  /* If we have an out-of-bounds access to a register, just return an
+     uninitialised register of the required mode.  This can occur if the
+     source code contains an out-of-bounds access to a small array.  */
+  if (REG_P (op0) && bitnum >= GET_MODE_BITSIZE (GET_MODE (op0)))
+    return gen_reg_rtx (tmode);
+
   if (REG_P (op0)
       && mode == GET_MODE (op0)
       && bitnum == 0
@@ -1182,6 +1187,8 @@ extract_bit_field (rtx str_rtx, unsigned
      can also be extracted with a SUBREG.  For this, we need the
      byte offset of the value in op0.  */
 
+  bitpos = bitnum % unit;
+  offset = bitnum / unit;
   byte_offset = bitpos / BITS_PER_UNIT + offset * UNITS_PER_WORD;
 
   /* If OP0 is a register, BITPOS must count within a word.
diff -u /dev/null testsuite/gcc.c-torture/compile/20050125-1.c
--- /dev/null	Fri Apr 23 00:21:55 2004
+++ testsuite/gcc.c-torture/compile/20050125-1.c	Wed Jan 26 17:09:42 2005
@@ -0,0 +1,8 @@
+unsigned short foo (int n)
+{
+  unsigned short u[1] = { 1 };
+  u[0] = 0;
+  u[1] = 1;
+  u[2] = 2;
+  return u[0] + u[1] + u[2];
+}



More information about the Gcc-patches mailing list