This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[autovect] [patch] improve realignment in outer-loop vectorization


Currently outer-loop vectorization does not support the optimized
realignment scheme. The unoptimized scheme generates two vector loads + a
shuffle in each iteration. The optimized scheme is basically predictive
commoning: we do one extra vector load before the loop and only one vector
load (+ shuffle) in each iteration, reusing the result of the vector load
from the previous iteration. When the realignment is done for loads inside
an inner-loop nested in the outer-loop that is being vectorized, we need to
check that the misalignment remains fixed throughout the execution of the
inner-loop. We also need to be careful about where to do the extra vector
load: before the inner-loop (but inside the outer-loop), or outside the
outer-loop. This depends on whether the memory access has evolution in the
outer-loop (if it does, the address changes with every outer-loop
iteration, so the load has to stay inside the outer-loop). This patch adds
this support.

Bootstrapped with vectorization enabled, and tested on the vectorizer
testcases, on powerpc-linux and i386-linux. Committed to autovect branch.

dorit

        * tree-vectorizer.c (vect_supportable_dr_alignment): Allow using the
        optimized realignment scheme for outer-loop vectorization. Add
        documentation.
        * tree-vect-transform.c (vect_create_data_ref_ptr): Replace the unused
        BSI function argument with a new function argument - at_loop.
        Simplified the condition that determines STEP. Allow generating the
        optimized realignment scheme for outer-loop vectorization (only_init
        can now be TRUE also when vectorizing outer-loops).
        (vect_create_addr_base_for_vector_ref): The function argument in_loop
        renamed to loop and is not optional anymore (for better readability).
        Updated the function documentation accordingly. Fixed the computation
        of the step.
        (vectorizable_store): Call vect_create_data_ref_ptr with loop instead
        of bsi.
        (vect_setup_realignment): Returns another value - at_loop. Allow
        generating the optimized realignment scheme for outer-loop
        vectorization. Added documentation.
        (vectorizable_load): Allow using the optimized realignment scheme for
        outer-loop vectorization.  Call vect_setup_realignment with additional
        argument at_loop.  Call vect_create_data_ref_ptr with at_loop instead
        of bsi.  Fix 80-column overflow.  Rename PHI_STMT to PHI.
        (vect_gen_niters_for_prolog_loop): Call
        vect_create_addr_base_for_vector_ref with additional argument loop.
        (vect_create_cond_for_align_checks): Likewise.

(See attached file: misalignment_fix.july10)


