[PATCH v2] Target-specific limits on vector alignment

Richard Guenther richard.guenther@gmail.com
Mon Jul 30 11:12:00 GMT 2012


On Fri, Jul 27, 2012 at 5:24 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Richard Guenther wrote:
>> On Mon, Jun 11, 2012 at 5:25 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>> > On 11/06/12 15:53, Richard Guenther wrote:
>> >> The type argument or the size argument looks redundant.
>> >
>> > Technically, yes, we could get rid of "tree_low_cst (TYPE_SIZE (type)"
>> > and calculate it inside the alignment function if it was needed.
>> > However, it seemed likely that most targets would need that number one
>> > way or another, such that passing it would be helpful.
>>
>> Well, you don't need it in stor-layout and targets might think the value
>> may be completely unrelated to the type ...
>>
>> >> Note that we still can have such vector "properly" aligned, thus the
>> >> vectorizer would need to use build_aligned_type also if it knows the
>> >> type is aligned, not only when thinks it is misaligned.  You basically
>> >> change the alignment of the "default" vector type.
>> >
>> > I'm not sure I follow...
>>
>> I say that a large vector may be still aligned, so the vectorizer when
>> creating vector memory references has to use a non-default aligned vector
>> type when the vector is aligned.  It won't do that at the moment.
>
> Richard (Earnshaw) has asked me to take over working on this patch now.
>
> I've now made the change requested above and removed the size argument.
> The target is now simply asked to return the required alignment for the
> given vector type.  I've also added a check for the case where the
> target provides both an alignment and a mode for a vector type, but
> the mode actually requires bigger alignment than the type.  This is
> simply rejected (the target can fix this by reporting a different
> type alignment or changing the mode alignment).
>
> I've not made any attempts to have the vectorizer register larger
> alignments than the one returned by the target hook.  It's not
> clear to me when this would be useful (at least on ARM) ...
>
> I've also run the testsuite, and this actually uncovered to bugs in
> the vectorizer where it made an implicit assumption that vector types
> must always be naturally aligned:
>
> - In vect_update_misalignment_for_peel, the code used the vector size
>   instead of the required alignment in order to bound misalignment
>   values -- leading to a misalignment value bigger than the underlying
>   alignment requirement of the vector type, causing an ICE later on
>
> - In vect_do_peeling_for_loop_bound, the code divided the vector type
>   alignment by the number of elements in order to arrive at the element
>   size ... this returns a wrong value if the alignment is less than the
>   vector size, causing incorrect code to be generated
>
>   (This routine also had some confusion between size and alignment in
>   comments and variable names, which I've fixed as well.)
>
> Finally, two test cases still failed spuriously:
>
> - gcc.dg/align-2.c actually checked that vector types are naturally
>   aligned
>
> - gcc.dg/vect/slp-25.c checked that we needed to perform peeling for
>   alignment, which we actually don't need any more if vector types
>   have a lesser alignment requirement in the first place
>
> I've added a new effective target flag to check whether the target
> requires natural alignment for vector types, and disabled those two
> tests if it doesn't.
>
> With those changes, I've completed testing with no regressions on
> arm-linux-gnueabi.
>
> OK for mainline?

Ok.  Please add to the documentation that the default vector alignment
has to be a power-of-two multiple of the default vector element alignment.
You probably want to double-check vector_alignment_reachable_p as well
which checks whether vector alignment can be reached by peeling off
scalar iterations.

Thanks,
Richard.

> Bye,
> Ulrich
>
>
> ChangeLog:
>
>         * target.def (vector_alignment): New target hook.
>         * doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
>         * doc/tm.texi: Regenerate.
>         * targhooks.c (default_vector_alignment): New function.
>         * targhooks.h (default_vector_alignment): Add prototype.
>         * stor-layout.c (layout_type): Use targetm.vector_alignment.
>         * config/arm/arm.c (arm_vector_alignment): New function.
>         (TARGET_VECTOR_ALIGNMENT): Define.
>
>         * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use
>         vector type alignment instead of size.
>         * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Use
>         element type size directly instead of computing it from alignment.
>         Fix variable naming and comment.
>
> testsuite/ChangeLog:
>
>         * lib/target-supports.exp
>         (check_effective_target_vect_natural_alignment): New function.
>         * gcc.dg/align-2.c: Only run on targets with natural alignment
>         of vector types.
>         * gcc.dg/vect/slp-25.c: Adjust tests for targets without natural
>         alignment of vector types.
>
>
> Index: gcc/target.def
> ===================================================================
> *** gcc/target.def      (revision 189809)
> --- gcc/target.def      (working copy)
> *************** DEFHOOK
> *** 1659,1664 ****
> --- 1659,1672 ----
>    bool, (enum machine_mode mode),
>    hook_bool_mode_false)
>
> + DEFHOOK
> + (vector_alignment,
> +  "This hook can be used to define the alignment for a vector of type\n\
> + @var{type}, in order to comply with a platform ABI.  The default is to\n\
> + require natural alignment for vector types.",
> +  HOST_WIDE_INT, (const_tree type),
> +  default_vector_alignment)
> +
>   /* True if we should try to use a scalar mode to represent an array,
>      overriding the usual MAX_FIXED_MODE limit.  */
>   DEFHOOK
> Index: gcc/doc/tm.texi
> ===================================================================
> *** gcc/doc/tm.texi     (revision 189809)
> --- gcc/doc/tm.texi     (working copy)
> *************** make it all fit in fewer cache lines.
> *** 1107,1112 ****
> --- 1107,1118 ----
>   If the value of this macro has a type, it should be an unsigned type.
>   @end defmac
>
> + @deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTOR_ALIGNMENT (const_tree @var{type})
> + This hook can be used to define the alignment for a vector of type
> + @var{type}, in order to comply with a platform ABI.  The default is to
> + require natural alignment for vector types.
> + @end deftypefn
> +
>   @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
>   If defined, a C expression to compute the alignment for stack slot.
>   @var{type} is the data type, @var{mode} is the widest mode available,
> Index: gcc/doc/tm.texi.in
> ===================================================================
> *** gcc/doc/tm.texi.in  (revision 189809)
> --- gcc/doc/tm.texi.in  (working copy)
> *************** make it all fit in fewer cache lines.
> *** 1091,1096 ****
> --- 1091,1098 ----
>   If the value of this macro has a type, it should be an unsigned type.
>   @end defmac
>
> + @hook TARGET_VECTOR_ALIGNMENT
> +
>   @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
>   If defined, a C expression to compute the alignment for stack slot.
>   @var{type} is the data type, @var{mode} is the widest mode available,
> Index: gcc/targhooks.c
> ===================================================================
> *** gcc/targhooks.c     (revision 189809)
> --- gcc/targhooks.c     (working copy)
> *************** tree default_mangle_decl_assembler_name
> *** 945,950 ****
> --- 945,957 ----
>      return id;
>   }
>
> + /* Default to natural alignment for vector types.  */
> + HOST_WIDE_INT
> + default_vector_alignment (const_tree type)
> + {
> +   return tree_low_cst (TYPE_SIZE (type), 0);
> + }
> +
>   bool
>   default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
>   {
> Index: gcc/targhooks.h
> ===================================================================
> *** gcc/targhooks.h     (revision 189809)
> --- gcc/targhooks.h     (working copy)
> *************** extern int default_builtin_vectorization
> *** 83,88 ****
> --- 83,90 ----
>
>   extern tree default_builtin_reciprocal (unsigned int, bool, bool);
>
> + extern HOST_WIDE_INT default_vector_alignment (const_tree);
> +
>   extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
>   extern bool
>   default_builtin_support_vector_misalignment (enum machine_mode mode,
> Index: gcc/stor-layout.c
> ===================================================================
> *** gcc/stor-layout.c   (revision 189809)
> --- gcc/stor-layout.c   (working copy)
> *************** layout_type (tree type)
> *** 2131,2139 ****
>         TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
>                                             bitsize_int (nunits));
>
> !       /* Always naturally align vectors.  This prevents ABI changes
> !          depending on whether or not native vector modes are supported.  */
> !       TYPE_ALIGN (type) = tree_low_cst (TYPE_SIZE (type), 0);
>           break;
>         }
>
> --- 2131,2147 ----
>         TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
>                                             bitsize_int (nunits));
>
> !       /* For vector types, we do not default to the mode's alignment.
> !          Instead, query a target hook, defaulting to natural alignment.
> !          This prevents ABI changes depending on whether or not native
> !          vector modes are supported.  */
> !       TYPE_ALIGN (type) = targetm.vector_alignment (type);
> !
> !       /* However, if the underlying mode requires a bigger alignment than
> !          what the target hook provides, we cannot use the mode.  For now,
> !          simply reject that case.  */
> !       gcc_assert (TYPE_ALIGN (type)
> !                   >= GET_MODE_ALIGNMENT (TYPE_MODE (type)));
>           break;
>         }
>
> Index: gcc/config/arm/arm.c
> ===================================================================
> *** gcc/config/arm/arm.c        (revision 189809)
> --- gcc/config/arm/arm.c        (working copy)
> *************** static bool arm_array_mode_supported_p (
> *** 254,259 ****
> --- 254,260 ----
>                                         unsigned HOST_WIDE_INT);
>   static enum machine_mode arm_preferred_simd_mode (enum machine_mode);
>   static bool arm_class_likely_spilled_p (reg_class_t);
> + static HOST_WIDE_INT arm_vector_alignment (const_tree type);
>   static bool arm_vector_alignment_reachable (const_tree type, bool is_packed);
>   static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
>                                                      const_tree type,
> *************** static const struct attribute_spec arm_a
> *** 602,607 ****
> --- 603,611 ----
>   #undef TARGET_CLASS_LIKELY_SPILLED_P
>   #define TARGET_CLASS_LIKELY_SPILLED_P arm_class_likely_spilled_p
>
> + #undef TARGET_VECTOR_ALIGNMENT
> + #define TARGET_VECTOR_ALIGNMENT arm_vector_alignment
> +
>   #undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
>   #define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE \
>     arm_vector_alignment_reachable
> *************** arm_have_conditional_execution (void)
> *** 25051,25056 ****
> --- 25055,25072 ----
>     return !TARGET_THUMB1;
>   }
>
> + /* The AAPCS sets the maximum alignment of a vector to 64 bits.  */
> + static HOST_WIDE_INT
> + arm_vector_alignment (const_tree type)
> + {
> +   HOST_WIDE_INT align = tree_low_cst (TYPE_SIZE (type), 0);
> +
> +   if (TARGET_AAPCS_BASED)
> +     align = MIN (align, 64);
> +
> +   return align;
> + }
> +
>   static unsigned int
>   arm_autovectorize_vector_sizes (void)
>   {
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> *** gcc/tree-vect-data-refs.c   (revision 189809)
> --- gcc/tree-vect-data-refs.c   (working copy)
> *************** vect_update_misalignment_for_peel (struc
> *** 1059,1065 ****
>         int misal = DR_MISALIGNMENT (dr);
>         tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>         misal += negative ? -npeel * dr_size : npeel * dr_size;
> !       misal &= GET_MODE_SIZE (TYPE_MODE (vectype)) - 1;
>         SET_DR_MISALIGNMENT (dr, misal);
>         return;
>       }
> --- 1059,1065 ----
>         int misal = DR_MISALIGNMENT (dr);
>         tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>         misal += negative ? -npeel * dr_size : npeel * dr_size;
> !       misal &= (TYPE_ALIGN (vectype) / BITS_PER_UNIT) - 1;
>         SET_DR_MISALIGNMENT (dr, misal);
>         return;
>       }
> Index: gcc/tree-vect-loop-manip.c
> ===================================================================
> *** gcc/tree-vect-loop-manip.c  (revision 189809)
> --- gcc/tree-vect-loop-manip.c  (working copy)
> *************** vect_do_peeling_for_loop_bound (loop_vec
> *** 1937,1943 ****
>      If the misalignment of DR is known at compile time:
>        addr_mis = int mis = DR_MISALIGNMENT (dr);
>      Else, compute address misalignment in bytes:
> !      addr_mis = addr & (vectype_size - 1)
>
>      prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
>
> --- 1937,1943 ----
>      If the misalignment of DR is known at compile time:
>        addr_mis = int mis = DR_MISALIGNMENT (dr);
>      Else, compute address misalignment in bytes:
> !      addr_mis = addr & (vectype_align - 1)
>
>      prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
>
> *************** vect_gen_niters_for_prolog_loop (loop_ve
> *** 1991,1999 ****
>         tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
>                                                 &new_stmts, offset, loop);
>         tree type = unsigned_type_for (TREE_TYPE (start_addr));
> !       tree vectype_size_minus_1 = build_int_cst (type, vectype_align - 1);
> !       tree elem_size_log =
> !         build_int_cst (type, exact_log2 (vectype_align/nelements));
>         tree nelements_minus_1 = build_int_cst (type, nelements - 1);
>         tree nelements_tree = build_int_cst (type, nelements);
>         tree byte_misalign;
> --- 1991,2000 ----
>         tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
>                                                 &new_stmts, offset, loop);
>         tree type = unsigned_type_for (TREE_TYPE (start_addr));
> !       tree vectype_align_minus_1 = build_int_cst (type, vectype_align - 1);
> !       HOST_WIDE_INT elem_size =
> !                 int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
> !       tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
>         tree nelements_minus_1 = build_int_cst (type, nelements - 1);
>         tree nelements_tree = build_int_cst (type, nelements);
>         tree byte_misalign;
> *************** vect_gen_niters_for_prolog_loop (loop_ve
> *** 2002,2011 ****
>         new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
>         gcc_assert (!new_bb);
>
> !       /* Create:  byte_misalign = addr & (vectype_size - 1)  */
>         byte_misalign =
>           fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
> !                      vectype_size_minus_1);
>
>         /* Create:  elem_misalign = byte_misalign / element_size  */
>         elem_misalign =
> --- 2003,2012 ----
>         new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
>         gcc_assert (!new_bb);
>
> !       /* Create:  byte_misalign = addr & (vectype_align - 1)  */
>         byte_misalign =
>           fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
> !                      vectype_align_minus_1);
>
>         /* Create:  elem_misalign = byte_misalign / element_size  */
>         elem_misalign =
> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> *** gcc/testsuite/lib/target-supports.exp       (revision 189809)
> --- gcc/testsuite/lib/target-supports.exp       (working copy)
> *************** proc check_effective_target_natural_alig
> *** 3379,3384 ****
> --- 3379,3404 ----
>       return $et_natural_alignment_64_saved
>   }
>
> + # Return 1 if all vector types are naturally aligned (aligned to their
> + # type-size), 0 otherwise.
> + #
> + # This won't change for different subtargets so cache the result.
> +
> + proc check_effective_target_vect_natural_alignment { } {
> +     global et_vect_natural_alignment
> +
> +     if [info exists et_vect_natural_alignment_saved] {
> +         verbose "check_effective_target_vect_natural_alignment: using cached result" 2
> +     } else {
> +         set et_vect_natural_alignment_saved 1
> +         if { [check_effective_target_arm_eabi] } {
> +             set et_vect_natural_alignment_saved 0
> +         }
> +     }
> +     verbose "check_effective_target_vect_natural_alignment: returning $et_vect_natural_alignment_saved" 2
> +     return $et_vect_natural_alignment_saved
> + }
> +
>   # Return 1 if vector alignment (for types of size 32 bit or less) is reachable, 0 otherwise.
>   #
>   # This won't change for different subtargets so cache the result.
> Index: gcc/testsuite/gcc.dg/align-2.c
> ===================================================================
> *** gcc/testsuite/gcc.dg/align-2.c      (revision 189809)
> --- gcc/testsuite/gcc.dg/align-2.c      (working copy)
> ***************
> *** 1,5 ****
>   /* PR 17962 */
> ! /* { dg-do compile } */
>   /* { dg-options "" } */
>
>   typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
> --- 1,5 ----
>   /* PR 17962 */
> ! /* { dg-do compile { target vect_natural_alignment } } */
>   /* { dg-options "" } */
>
>   typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
> Index: gcc/testsuite/gcc.dg/vect/slp-25.c
> ===================================================================
> *** gcc/testsuite/gcc.dg/vect/slp-25.c  (revision 189809)
> --- gcc/testsuite/gcc.dg/vect/slp-25.c  (working copy)
> *************** int main (void)
> *** 56,60 ****
>
>   /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
>   /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
> ! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
>   /* { dg-final { cleanup-tree-dump "vect" } } */
> --- 56,60 ----
>
>   /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
>   /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
> ! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || { ! vect_natural_alignment } } } } } */
>   /* { dg-final { cleanup-tree-dump "vect" } } */
>
> --
>   Dr. Ulrich Weigand
>   GNU Toolchain for Linux on System z and Cell BE
>   Ulrich.Weigand@de.ibm.com
>



More information about the Gcc-patches mailing list