[Bug tree-optimization/53636] New: SLP may create invalid unaligned memory accesses

uweigand at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Jun 11 17:22:00 GMT 2012


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53636

             Bug #: 53636
           Summary: SLP may create invalid unaligned memory accesses
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: uweigand@gcc.gnu.org


The following test case:

void test (unsigned char *dst)
{
 short tmp[11 * 8], *tptr;
 int i;

 fill (tmp);

 tptr = tmp;
 for (i = 0; i < 8; i++)
   {
     dst[0] = (-tptr[0] + 9 * tptr[0 + 1] + 9 * tptr[0 + 2] - tptr[0 + 3]) >> 7;
     dst[1] = (-tptr[1] + 9 * tptr[1 + 1] + 9 * tptr[1 + 2] - tptr[1 + 3]) >> 7;
     dst[2] = (-tptr[2] + 9 * tptr[2 + 1] + 9 * tptr[2 + 2] - tptr[2 + 3]) >> 7;
     dst[3] = (-tptr[3] + 9 * tptr[3 + 1] + 9 * tptr[3 + 2] - tptr[3 + 3]) >> 7;
     dst[4] = (-tptr[4] + 9 * tptr[4 + 1] + 9 * tptr[4 + 2] - tptr[4 + 3]) >> 7;
     dst[5] = (-tptr[5] + 9 * tptr[5 + 1] + 9 * tptr[5 + 2] - tptr[5 + 3]) >> 7;
     dst[6] = (-tptr[6] + 9 * tptr[6 + 1] + 9 * tptr[6 + 2] - tptr[6 + 3]) >> 7;
     dst[7] = (-tptr[7] + 9 * tptr[7 + 1] + 9 * tptr[7 + 2] - tptr[7 + 3]) >> 7;

     dst += 8;
     tptr += 11;
   }
}

when built on ARM with -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -O
-ftree-vectorize, results in code that uses a VLDR instruction to access
unaligned memory, which causes a bus error at runtime.

The problem seems to be that the check in vect_compute_data_ref_alignment is
not enough for SLP.  Even though SLP only considers a basic block, the data-ref
analysis still looks at innermost loops to compute scalar evolutions.  This
results in concluding that the access "tptr[0]" is based on "tmp", which is
aligned to 8 bytes, using a step of 22 bytes.
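
To make the arithmetic concrete: with a base aligned to 8 bytes and a step of
22 bytes (11 shorts of 2 bytes each), the byte offset of "tptr[0]" on iteration
i is 22 * i, so its misalignment relative to 8 bytes cycles through 0, 6, 4, 2.
The following is only an illustrative sketch; the helper names are made up for
this example and are not part of GCC.

```c
#include <stdio.h>

/* Byte misalignment of an access on a given loop iteration, assuming the
   base address is itself aligned to ALIGN bytes.  Hypothetical helper,
   not part of GCC.  */
unsigned
misalignment_at (unsigned iter, unsigned step_bytes, unsigned align)
{
  return (iter * step_bytes) % align;
}

/* Print the misalignment of tptr[0] for each of the 8 iterations of the
   test case.  The step is 11 * sizeof (short), i.e. 22 bytes where short
   is 2 bytes wide; the base is assumed to be 8-byte aligned.  */
void
print_misalignments (void)
{
  unsigned step = 11 * (unsigned) sizeof (short);
  for (unsigned i = 0; i < 8; i++)
    printf ("iteration %u: misaligned by %u bytes\n",
            i, misalignment_at (i, step, 8));
}
```

Only iterations 0 and 4 start on an 8-byte boundary; an aligned vector load of
"tptr[0..3]" on any other iteration touches a misaligned address.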

The alignment check currently only verifies that the *base* is aligned.  This is OK
if we're actually vectorizing the loop.  But in the SLP case, we really need to
verify instead that the access is aligned on *every* iteration through the loop
...
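
One way to state the stronger condition, as a rough sketch only: for a loop
that runs more than once, the access is aligned on every iteration exactly when
the base is aligned and the step is a multiple of the required alignment.  The
function below is a hypothetical illustration of that condition, not the actual
GCC check and not a proposed patch.

```c
#include <stdbool.h>

/* Hypothetical predicate (not GCC code): returns true if an access whose
   base has misalignment BASE_MISALIGN bytes and which advances by
   STEP_BYTES per iteration of the enclosing loop is aligned to ALIGN
   bytes on every iteration.  Assumes the loop runs at least twice.  */
bool
access_always_aligned (unsigned base_misalign, unsigned step_bytes,
                       unsigned align)
{
  return base_misalign % align == 0 && step_bytes % align == 0;
}
```

For the test case above, access_always_aligned (0, 22, 8) is false, since 22 is
not a multiple of 8, even though the base alone passes the alignment check.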


