This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, ARM] Misaligned access support for ARM Neon


Hi,

On Mon, 21 Dec 2009 12:20:12 +0000
Paul Brook <paul@codesourcery.com> wrote:

> On Friday 18 December 2009, Julian Brown wrote:
> > This is a version of the patch which doesn't attempt to resolve the
> > discrepancy between vector copies and vectorizing loads/stores
> > (thus is only intended to work in little-endian mode, leaving
> > big-endian mode as an open problem). So, vldr/vstr etc. will still
> > be used for aligned accesses, and any issues with adding semantics
> > to movmisalign<mode> are sidestepped.
> 
> I don't think this is correct. The original patch contained two hooks:
> [snip]
> - Add movmisalign. Either ignore the fact that packed structures
> break, or add yet annother hook for "misaligned vectors must be at
> least {-this-} aligned". This will not work for big-endian vectors,
> and will go away once we implement array load support.

This is a new version of the patch, which adds movmisalign patterns
for little-endian NEON, and uses a new (since the last version of the
patch was posted) target hook (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to
describe the alignments supported by NEON.

I've retained the check_effective_target_vect_element_align predicate
from an earlier version of this patch -- to inform the testsuite that
the target supports vectors aligned to the natural alignment of their
elements, rather than (check_effective_target_vect_hw_misalign)
vectors aligned to arbitrary alignments. Test cases (other than
gcc.dg/vect/vect-align-1.c and gcc.dg/vect/vect-align-2.c which use
packed accesses) have been adjusted to use the new predicate.

I've attempted to minimise the disruption to the test results this
patch produces. It seems that with the default vector width (64 bits),
several test cases (e.g. gcc.dg/vect/vect-outer-5.c) do not work
properly as-is. For that test for example, both outer loops get
vectorised for NEON. The second loop looks like this:

  /* Outer-loop 2: Not vectorizable because of dependence distance. */
  for (i = 0; i < 4; i++)
    {
      s = 0;
      for (j=0; j<N; j+=4)
        s += C[j];
      B[i+3] = B[i] + s;
    }

I believe this is only unvectorizable if the width of the vectors
used is four elements (floats in this case) or greater. NEON supports
such vectors in the vectorizer if the command-line option
-mvectorize-with-neon-quad is given: so, for this test (and several
others), I've added:

/* { dg-add-options quad_vectors } */

Which adds that option to the test invocation, making the behaviour
more similar to other targets, albeit in a NEON-specific way (I don't
know of any other CPUs which support two vector widths in quite the same
way as NEON). A (trickier) alternative might be to pass information
about the vector width for the target down to the test harness somehow,
and adjust the tests accordingly.

With these tweaks, test results for gcc/vect.exp change as follows:

New FAIL: default/gcc.sum:gcc.dg/vect/vect-109.c scan-tree-dump-times vect "Vectorizing an unaligned access" 10
New FAIL: default/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1

FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

The new failures are due to apparently-missing patterns in the NEON
backend.

A full test run is still in progress. OK to apply, assuming that is
successful?

Thanks,

Julian

ChangeLog

    gcc/
    * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
    (TARGET_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
    (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * gcc.dg/vect/vect-50.c: Use vect_element_align instead of
    vect_hw_misalign.
    * gcc.dg/vect/vect-33.c: Likewise.
    * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
    * gcc.dg/vect/vect-42.c: Likewise.
    * gcc.dg/vect/vect-outer-5.c: Likewise.
    * gcc.dg/vect/vect-44.c: Likewise.
    * gcc.dg/vect/vect-70.c: Likewise.
    * gcc.dg/vect/vect-28.c: Likewise.
    * gcc.dg/vect/vect-109.c: Likewise.
    * gcc.dg/vect/vect-91.c: Likewise.
    * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
    * gcc.dg/vect/vect-95.c: Likewise.
    * gcc.dg/vect/vect-87.c: Likewise.
    * gcc.dg/vect/vect-96.c: Likewise.
    * gcc.dg/vect/vect-multitypes-1.c: Likewise.
    * gcc.dg/vect/vect-88.c: Likewise.
    * gcc.dg/vect/pr25413.c: Likewise.
    * gfortran.dg/vect/vect-2.f90: Likewise.
    * gfortran.dg/vect/vect-3.f90: Likewise.
    * gfortran.dg/vect/vect-4.f90: Likewise.
    * gfortran.dg/vect/vect-5.f90: Likewise.
    * gcc.dg/vect/slp-3.c: Use quad-word vectors when available.
    * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
    * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New.
    (add_options_for_quad_vectors): New.
Index: gcc/testsuite/gcc.dg/vect/vect-50.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-50.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-50.c	(working copy)
@@ -62,8 +62,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_element_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_element_align } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-33.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-33.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-33.c	(working copy)
@@ -40,5 +40,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(working copy)
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-42.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-42.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-42.c	(working copy)
@@ -64,7 +64,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-outer-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdio.h>
 #include <stdarg.h>
Index: gcc/testsuite/gcc.dg/vect/vect-44.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-44.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-44.c	(working copy)
@@ -68,5 +68,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-70.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-70.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-70.c	(working copy)
@@ -65,5 +65,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-28.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-28.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-28.c	(working copy)
@@ -41,5 +41,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-109.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-109.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-109.c	(working copy)
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-91.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-91.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-91.c	(working copy)
@@ -60,5 +60,5 @@ main3 ()
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-95.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-95.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-95.c	(working copy)
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-87.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-87.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-87.c	(working copy)
@@ -52,5 +52,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-96.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-96.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-96.c	(working copy)
@@ -45,5 +45,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -78,11 +79,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-88.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-88.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-88.c	(working copy)
@@ -52,5 +52,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/slp-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-3.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/slp-3.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include <stdio.h>
Index: gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr25413.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr25413.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/pr25413.c	(working copy)
@@ -33,7 +33,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vector_alignment_reachable_for_64bit } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -85,11 +86,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 159030)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -1575,6 +1575,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2411,7 +2423,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2546,6 +2558,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
@@ -3103,6 +3134,16 @@ proc add_options_for_bind_pic_locally { 
     return $flags
 }
 
+# Add to FLAGS the flags needed to enable 128-bit vectors.
+
+proc add_options_for_quad_vectors { flags } {
+    if [is-effective-target arm_neon_ok] {
+	return "$flags -mvectorize-with-neon-quad"
+    }
+
+    return $flags
+}
+
 # Return 1 if the target provides a full C99 runtime.
 
 proc check_effective_target_c99_runtime { } {
Index: gcc/testsuite/gfortran.dg/vect/vect-2.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-2.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-2.f90	(working copy)
@@ -18,5 +18,5 @@ END
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_hw_misalign } } } } } } 
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_element_align } } } } } } 
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-3.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-3.f90	(working copy)
@@ -7,8 +7,8 @@ Y = Y + A * X
 END
 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable}} } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || { ! vector_alignment_reachable} } } } }
 
Index: gcc/testsuite/gfortran.dg/vect/vect-4.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-4.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-4.f90	(working copy)
@@ -12,6 +12,6 @@ END
 ! { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { scan-tree-dump-times "accesses have the same alignment." 1 "vect" } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-5.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-5.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-5.f90	(working copy)
@@ -39,5 +39,5 @@
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 159030)
+++ gcc/config/arm/arm.c	(working copy)
@@ -224,6 +224,10 @@ static void arm_asm_trampoline_template 
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
 static rtx arm_pic_static_addr (rtx orig, rtx reg);
+static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
+						     const_tree type,
+						     int misalignment,
+						     bool is_packed);
 
 
 /* Table of machine attributes.  */
@@ -507,6 +511,10 @@ static const struct attribute_spec arm_a
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
+  arm_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Obstack for minipool constant handling.  */
@@ -8606,7 +8614,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15574,6 +15583,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15581,7 +15592,13 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21690,4 +21707,29 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+static bool
+arm_builtin_support_vector_misalignment (enum machine_mode mode,
+					 const_tree type, int misalignment,
+					 bool is_packed)
+{
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
+
+      if (is_packed)
+        return align == 1;
+
+      /* If the misalignment is unknown, we should be able to handle the access
+	 so long as it is not to a member of a packed data structure.  */
+      if (misalignment == -1)
+        return true;
+
+      /* This is probably always true.  */
+      return (misalignment % align) == 0;
+    }
+  
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md	(revision 159030)
+++ gcc/config/arm/neon.md	(working copy)
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -674,6 +675,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]