[PATCH] Fix PR78189

Richard Biener rguenther@suse.de
Thu Nov 10 08:34:00 GMT 2016


On Wed, 9 Nov 2016, Christophe Lyon wrote:

> On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
> >>
> >>> Hi Richard,
> >>>
> >>>
> >>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> >>> >
> >>> > The following fixes an oversight when computing alignment in the
> >>> > vectorizer.
> >>> >
> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >>> >
> >>> > Richard.
> >>> >
> >>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
> >>> >
> >>> >         PR tree-optimization/78189
> >>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> >>> >         alignment computation.
> >>> >
> >>> >         * g++.dg/torture/pr78189.C: New testcase.
> >>> >
> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> >>> > ===================================================================
> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> >>> > @@ -0,0 +1,41 @@
> >>> > +/* { dg-do run } */
> >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> >>> > +
> >>> > +#include <cstddef>
> >>> > +
> >>> > +struct A
> >>> > +{
> >>> > +  void * a;
> >>> > +  void * b;
> >>> > +};
> >>> > +
> >>> > +struct alignas(16) B
> >>> > +{
> >>> > +  void * pad;
> >>> > +  void * misaligned;
> >>> > +  void * pad2;
> >>> > +
> >>> > +  A a;
> >>> > +
> >>> > +  void Null();
> >>> > +};
> >>> > +
> >>> > +void B::Null()
> >>> > +{
> >>> > +  a.a = nullptr;
> >>> > +  a.b = nullptr;
> >>> > +}
> >>> > +
> >>> > +void __attribute__((noinline,noclone))
> >>> > +NullB(void * misalignedPtr)
> >>> > +{
> >>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> >>> > +  b->Null();
> >>> > +}
> >>> > +
> >>> > +int main()
> >>> > +{
> >>> > +  B b;
> >>> > +  NullB(&b.misaligned);
> >>> > +  return 0;
> >>> > +}
> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >>> > index 9346cfe..b03cb1e 100644
> >>> > --- gcc/tree-vect-data-refs.c
> >>> > +++ gcc/tree-vect-data-refs.c
> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >>> >    base = ref;
> >>> >    while (handled_component_p (base))
> >>> >      base = TREE_OPERAND (base, 0);
> >>> > +  unsigned int base_alignment;
> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> >>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> >>> > +     adjust things to make base_alignment valid as the alignment of
> >>> > +     DR_BASE_ADDRESS.  */
> >>> >    if (TREE_CODE (base) == MEM_REF)
> >>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> >>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> >>> > -  unsigned int base_alignment = get_object_alignment (base);
> >>> > +    {
> >>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> >>> > +      base_bitpos &= (base_alignment - 1);
> >>> > +    }
> >>> > +  if (base_bitpos != 0)
> >>> > +    base_alignment = base_bitpos & -base_bitpos;
> >>> > +  /* Also look at the alignment of the base address DR analysis
> >>> > +     computed.  */
> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> >>> > +  if (base_addr_alignment > base_alignment)
> >>> > +    base_alignment = base_addr_alignment;
> >>> >
> >>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> >>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
> >>>
> >>> Since you committed this patch (r241892), I'm seeing execution failures:
> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> >>>   gcc.dg/vect/pr40074.c execution test
> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> >>> --with-fpu=neon-fp16
> >>> (using qemu as simulator)
> >>
> >> The difference is that we now vectorize the testcase with versioning
> >> for alignment (but it should never execute the vectorized variant).
> >> I need arm peoples help to understand what is wrong.
> > Hi All,
> > I will look at it.
> >
> 
> Hi,
> 
> This is causing new regressions on armeb:
>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 2 loops" 1
>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
> "vectorized 2 loops" 1

It's actually an improvement as armeb can't do unaligned loads.
Before the patch we versioned both loops for alignment (they
can't be possibly both aligned by luck).  After the patch we
align 'arr' and thus the first loop becomes known-aligned and
vectorized while the 2nd one is now known unaligned (and thus
not vectorized because we can't do unaligned accesses).

I suppose the following will fix it.  Can you test that on
some arm variants please?

It works ok on x86_64/i?86 so please commit if it fixes the issue
for you.

Thanks,
Richard.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
index 52fdcf6..3752600 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
@@ -71,5 +71,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
-  
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */



More information about the Gcc-patches mailing list