[PATCH v2] Target-specific limits on vector alignment
Richard Guenther
richard.guenther@gmail.com
Mon Jul 30 11:12:00 GMT 2012
On Fri, Jul 27, 2012 at 5:24 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Richard Guenther wrote:
>> On Mon, Jun 11, 2012 at 5:25 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>> > On 11/06/12 15:53, Richard Guenther wrote:
>> >> The type argument or the size argument looks redundant.
>> >
>> > Technically, yes, we could get rid of "tree_low_cst (TYPE_SIZE (type)"
>> > and calculate it inside the alignment function if it was needed.
>> > However, it seemed likely that most targets would need that number one
>> > way or another, such that passing it would be helpful.
>>
>> Well, you don't need it in stor-layout and targets might think the value
>> may be completely unrelated to the type ...
>>
>> >> Note that we still can have such vector "properly" aligned, thus the
>> >> vectorizer would need to use build_aligned_type also if it knows the
>> >> type is aligned, not only when thinks it is misaligned. You basically
>> >> change the alignment of the "default" vector type.
>> >
>> > I'm not sure I follow...
>>
>> I say that a large vector may be still aligned, so the vectorizer when
>> creating vector memory references has to use a non-default aligned vector
>> type when the vector is aligned. It won't do that at the moment.
>
> Richard (Earnshaw) has asked me to take over working on this patch now.
>
> I've now made the change requested above and removed the size argument.
> The target is now simply asked to return the required alignment for the
> given vector type. I've also added a check for the case where the
> target provides both an alignment and a mode for a vector type, but
> the mode actually requires bigger alignment than the type. This is
> simply rejected (the target can fix this by reporting a different
> type alignment or changing the mode alignment).
>
> I've not made any attempts to have the vectorizer register larger
> alignments than the one returned by the target hook. It's not
> clear to me when this would be useful (at least on ARM) ...
>
> I've also run the testsuite, and this actually uncovered to bugs in
> the vectorizer where it made an implicit assumption that vector types
> must always be naturally aligned:
>
> - In vect_update_misalignment_for_peel, the code used the vector size
> instead of the required alignment in order to bound misalignment
> values -- leading to a misalignment value bigger than the underlying
> alignment requirement of the vector type, causing an ICE later on
>
> - In vect_do_peeling_for_loop_bound, the code divided the vector type
> alignment by the number of elements in order to arrive at the element
> size ... this returns a wrong value if the alignment is less than the
> vector size, causing incorrect code to be generated
>
> (This routine also had some confusion between size and alignment in
> comments and variable names, which I've fixed as well.)
>
> Finally, two test cases still failed spuriously:
>
> - gcc.dg/align-2.c actually checked that vector types are naturally
> aligned
>
> - gcc.dg/vect/slp-25.c checked that we needed to perform peeling for
> alignment, which we actually don't need any more if vector types
> have a lesser alignment requirement in the first place
>
> I've added a new effective target flag to check whether the target
> requires natural alignment for vector types, and disabled those two
> tests if it doesn't.
>
> With those changes, I've completed testing with no regressions on
> arm-linux-gnueabi.
>
> OK for mainline?
Ok. Please add to the documentation that the default vector alignment
has to be a power-of-two multiple of the default vector element alignment.
You probably want to double-check vector_alignment_reachable_p as well
which checks whether vector alignment can be reached by peeling off
scalar iterations.
Thanks,
Richard.
> Bye,
> Ulrich
>
>
> ChangeLog:
>
> * target.def (vector_alignment): New target hook.
> * doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
> * doc/tm.texi: Regenerate.
> * targhooks.c (default_vector_alignment): New function.
> * targhooks.h (default_vector_alignment): Add prototype.
> * stor-layout.c (layout_type): Use targetm.vector_alignment.
> * config/arm/arm.c (arm_vector_alignment): New function.
> (TARGET_VECTOR_ALIGNMENT): Define.
>
> * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use
> vector type alignment instead of size.
> * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Use
> element type size directly instead of computing it from alignment.
> Fix variable naming and comment.
>
> testsuite/ChangeLog:
>
> * lib/target-supports.exp
> (check_effective_target_vect_natural_alignment): New function.
> * gcc.dg/align-2.c: Only run on targets with natural alignment
> of vector types.
> * gcc.dg/vect/slp-25.c: Adjust tests for targets without natural
> alignment of vector types.
>
>
> Index: gcc/target.def
> ===================================================================
> *** gcc/target.def (revision 189809)
> --- gcc/target.def (working copy)
> *************** DEFHOOK
> *** 1659,1664 ****
> --- 1659,1672 ----
> bool, (enum machine_mode mode),
> hook_bool_mode_false)
>
> + DEFHOOK
> + (vector_alignment,
> + "This hook can be used to define the alignment for a vector of type\n\
> + @var{type}, in order to comply with a platform ABI. The default is to\n\
> + require natural alignment for vector types.",
> + HOST_WIDE_INT, (const_tree type),
> + default_vector_alignment)
> +
> /* True if we should try to use a scalar mode to represent an array,
> overriding the usual MAX_FIXED_MODE limit. */
> DEFHOOK
> Index: gcc/doc/tm.texi
> ===================================================================
> *** gcc/doc/tm.texi (revision 189809)
> --- gcc/doc/tm.texi (working copy)
> *************** make it all fit in fewer cache lines.
> *** 1107,1112 ****
> --- 1107,1118 ----
> If the value of this macro has a type, it should be an unsigned type.
> @end defmac
>
> + @deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTOR_ALIGNMENT (const_tree @var{type})
> + This hook can be used to define the alignment for a vector of type
> + @var{type}, in order to comply with a platform ABI. The default is to
> + require natural alignment for vector types.
> + @end deftypefn
> +
> @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
> If defined, a C expression to compute the alignment for stack slot.
> @var{type} is the data type, @var{mode} is the widest mode available,
> Index: gcc/doc/tm.texi.in
> ===================================================================
> *** gcc/doc/tm.texi.in (revision 189809)
> --- gcc/doc/tm.texi.in (working copy)
> *************** make it all fit in fewer cache lines.
> *** 1091,1096 ****
> --- 1091,1098 ----
> If the value of this macro has a type, it should be an unsigned type.
> @end defmac
>
> + @hook TARGET_VECTOR_ALIGNMENT
> +
> @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
> If defined, a C expression to compute the alignment for stack slot.
> @var{type} is the data type, @var{mode} is the widest mode available,
> Index: gcc/targhooks.c
> ===================================================================
> *** gcc/targhooks.c (revision 189809)
> --- gcc/targhooks.c (working copy)
> *************** tree default_mangle_decl_assembler_name
> *** 945,950 ****
> --- 945,957 ----
> return id;
> }
>
> + /* Default to natural alignment for vector types. */
> + HOST_WIDE_INT
> + default_vector_alignment (const_tree type)
> + {
> + return tree_low_cst (TYPE_SIZE (type), 0);
> + }
> +
> bool
> default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
> {
> Index: gcc/targhooks.h
> ===================================================================
> *** gcc/targhooks.h (revision 189809)
> --- gcc/targhooks.h (working copy)
> *************** extern int default_builtin_vectorization
> *** 83,88 ****
> --- 83,90 ----
>
> extern tree default_builtin_reciprocal (unsigned int, bool, bool);
>
> + extern HOST_WIDE_INT default_vector_alignment (const_tree);
> +
> extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
> extern bool
> default_builtin_support_vector_misalignment (enum machine_mode mode,
> Index: gcc/stor-layout.c
> ===================================================================
> *** gcc/stor-layout.c (revision 189809)
> --- gcc/stor-layout.c (working copy)
> *************** layout_type (tree type)
> *** 2131,2139 ****
> TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
> bitsize_int (nunits));
>
> ! /* Always naturally align vectors. This prevents ABI changes
> ! depending on whether or not native vector modes are supported. */
> ! TYPE_ALIGN (type) = tree_low_cst (TYPE_SIZE (type), 0);
> break;
> }
>
> --- 2131,2147 ----
> TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
> bitsize_int (nunits));
>
> ! /* For vector types, we do not default to the mode's alignment.
> ! Instead, query a target hook, defaulting to natural alignment.
> ! This prevents ABI changes depending on whether or not native
> ! vector modes are supported. */
> ! TYPE_ALIGN (type) = targetm.vector_alignment (type);
> !
> ! /* However, if the underlying mode requires a bigger alignment than
> ! what the target hook provides, we cannot use the mode. For now,
> ! simply reject that case. */
> ! gcc_assert (TYPE_ALIGN (type)
> ! >= GET_MODE_ALIGNMENT (TYPE_MODE (type)));
> break;
> }
>
> Index: gcc/config/arm/arm.c
> ===================================================================
> *** gcc/config/arm/arm.c (revision 189809)
> --- gcc/config/arm/arm.c (working copy)
> *************** static bool arm_array_mode_supported_p (
> *** 254,259 ****
> --- 254,260 ----
> unsigned HOST_WIDE_INT);
> static enum machine_mode arm_preferred_simd_mode (enum machine_mode);
> static bool arm_class_likely_spilled_p (reg_class_t);
> + static HOST_WIDE_INT arm_vector_alignment (const_tree type);
> static bool arm_vector_alignment_reachable (const_tree type, bool is_packed);
> static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
> const_tree type,
> *************** static const struct attribute_spec arm_a
> *** 602,607 ****
> --- 603,611 ----
> #undef TARGET_CLASS_LIKELY_SPILLED_P
> #define TARGET_CLASS_LIKELY_SPILLED_P arm_class_likely_spilled_p
>
> + #undef TARGET_VECTOR_ALIGNMENT
> + #define TARGET_VECTOR_ALIGNMENT arm_vector_alignment
> +
> #undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
> #define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE \
> arm_vector_alignment_reachable
> *************** arm_have_conditional_execution (void)
> *** 25051,25056 ****
> --- 25055,25072 ----
> return !TARGET_THUMB1;
> }
>
> + /* The AAPCS sets the maximum alignment of a vector to 64 bits. */
> + static HOST_WIDE_INT
> + arm_vector_alignment (const_tree type)
> + {
> + HOST_WIDE_INT align = tree_low_cst (TYPE_SIZE (type), 0);
> +
> + if (TARGET_AAPCS_BASED)
> + align = MIN (align, 64);
> +
> + return align;
> + }
> +
> static unsigned int
> arm_autovectorize_vector_sizes (void)
> {
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> *** gcc/tree-vect-data-refs.c (revision 189809)
> --- gcc/tree-vect-data-refs.c (working copy)
> *************** vect_update_misalignment_for_peel (struc
> *** 1059,1065 ****
> int misal = DR_MISALIGNMENT (dr);
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> misal += negative ? -npeel * dr_size : npeel * dr_size;
> ! misal &= GET_MODE_SIZE (TYPE_MODE (vectype)) - 1;
> SET_DR_MISALIGNMENT (dr, misal);
> return;
> }
> --- 1059,1065 ----
> int misal = DR_MISALIGNMENT (dr);
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> misal += negative ? -npeel * dr_size : npeel * dr_size;
> ! misal &= (TYPE_ALIGN (vectype) / BITS_PER_UNIT) - 1;
> SET_DR_MISALIGNMENT (dr, misal);
> return;
> }
> Index: gcc/tree-vect-loop-manip.c
> ===================================================================
> *** gcc/tree-vect-loop-manip.c (revision 189809)
> --- gcc/tree-vect-loop-manip.c (working copy)
> *************** vect_do_peeling_for_loop_bound (loop_vec
> *** 1937,1943 ****
> If the misalignment of DR is known at compile time:
> addr_mis = int mis = DR_MISALIGNMENT (dr);
> Else, compute address misalignment in bytes:
> ! addr_mis = addr & (vectype_size - 1)
>
> prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
>
> --- 1937,1943 ----
> If the misalignment of DR is known at compile time:
> addr_mis = int mis = DR_MISALIGNMENT (dr);
> Else, compute address misalignment in bytes:
> ! addr_mis = addr & (vectype_align - 1)
>
> prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
>
> *************** vect_gen_niters_for_prolog_loop (loop_ve
> *** 1991,1999 ****
> tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
> &new_stmts, offset, loop);
> tree type = unsigned_type_for (TREE_TYPE (start_addr));
> ! tree vectype_size_minus_1 = build_int_cst (type, vectype_align - 1);
> ! tree elem_size_log =
> ! build_int_cst (type, exact_log2 (vectype_align/nelements));
> tree nelements_minus_1 = build_int_cst (type, nelements - 1);
> tree nelements_tree = build_int_cst (type, nelements);
> tree byte_misalign;
> --- 1991,2000 ----
> tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
> &new_stmts, offset, loop);
> tree type = unsigned_type_for (TREE_TYPE (start_addr));
> ! tree vectype_align_minus_1 = build_int_cst (type, vectype_align - 1);
> ! HOST_WIDE_INT elem_size =
> ! int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
> ! tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
> tree nelements_minus_1 = build_int_cst (type, nelements - 1);
> tree nelements_tree = build_int_cst (type, nelements);
> tree byte_misalign;
> *************** vect_gen_niters_for_prolog_loop (loop_ve
> *** 2002,2011 ****
> new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
> gcc_assert (!new_bb);
>
> ! /* Create: byte_misalign = addr & (vectype_size - 1) */
> byte_misalign =
> fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
> ! vectype_size_minus_1);
>
> /* Create: elem_misalign = byte_misalign / element_size */
> elem_misalign =
> --- 2003,2012 ----
> new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
> gcc_assert (!new_bb);
>
> ! /* Create: byte_misalign = addr & (vectype_align - 1) */
> byte_misalign =
> fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
> ! vectype_align_minus_1);
>
> /* Create: elem_misalign = byte_misalign / element_size */
> elem_misalign =
> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> *** gcc/testsuite/lib/target-supports.exp (revision 189809)
> --- gcc/testsuite/lib/target-supports.exp (working copy)
> *************** proc check_effective_target_natural_alig
> *** 3379,3384 ****
> --- 3379,3404 ----
> return $et_natural_alignment_64_saved
> }
>
> + # Return 1 if all vector types are naturally aligned (aligned to their
> + # type-size), 0 otherwise.
> + #
> + # This won't change for different subtargets so cache the result.
> +
> + proc check_effective_target_vect_natural_alignment { } {
> + global et_vect_natural_alignment
> +
> + if [info exists et_vect_natural_alignment_saved] {
> + verbose "check_effective_target_vect_natural_alignment: using cached result" 2
> + } else {
> + set et_vect_natural_alignment_saved 1
> + if { [check_effective_target_arm_eabi] } {
> + set et_vect_natural_alignment_saved 0
> + }
> + }
> + verbose "check_effective_target_vect_natural_alignment: returning $et_vect_natural_alignment_saved" 2
> + return $et_vect_natural_alignment_saved
> + }
> +
> # Return 1 if vector alignment (for types of size 32 bit or less) is reachable, 0 otherwise.
> #
> # This won't change for different subtargets so cache the result.
> Index: gcc/testsuite/gcc.dg/align-2.c
> ===================================================================
> *** gcc/testsuite/gcc.dg/align-2.c (revision 189809)
> --- gcc/testsuite/gcc.dg/align-2.c (working copy)
> ***************
> *** 1,5 ****
> /* PR 17962 */
> ! /* { dg-do compile } */
> /* { dg-options "" } */
>
> typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
> --- 1,5 ----
> /* PR 17962 */
> ! /* { dg-do compile { target vect_natural_alignment } } */
> /* { dg-options "" } */
>
> typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
> Index: gcc/testsuite/gcc.dg/vect/slp-25.c
> ===================================================================
> *** gcc/testsuite/gcc.dg/vect/slp-25.c (revision 189809)
> --- gcc/testsuite/gcc.dg/vect/slp-25.c (working copy)
> *************** int main (void)
> *** 56,60 ****
>
> /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
> ! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
> /* { dg-final { cleanup-tree-dump "vect" } } */
> --- 56,60 ----
>
> /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
> ! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || { ! vect_natural_alignment } } } } } */
> /* { dg-final { cleanup-tree-dump "vect" } } */
>
> --
> Dr. Ulrich Weigand
> GNU Toolchain for Linux on System z and Cell BE
> Ulrich.Weigand@de.ibm.com
>
More information about the Gcc-patches
mailing list