This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Currently outer-loop vectorization does not support the optimized realignment scheme. Optimized realignment means that instead of generating two vector loads + a shuffle in each iteration, we basically do predictive commoning and only do one extra vector load before the loop, and only one vector load (+ shuffle) in each iteration, and we reuse the result of the vector load from the previous iteration.

When the realignment is done for loads inside the inner-loop nested in the outer-loop, we need to be careful to check that the misalignment remains fixed throughout the execution of the inner-loop, and we also need to be careful about where to do the extra vector load - before the inner-loop (but inside the outer-loop), or outside the outer-loop (depends on whether the memory access has evolution in the outer-loop). This patch adds this support.

Bootstrapped with vectorization enabled, and tested on the vectorizer testcases, on powerpc-linux and i386-linux.

Committed to autovect branch.

dorit

* tree-vectorizer.c (vect_supportable_dr_alignment): Allow using the optimized realignment scheme for outer-loop vectorization. Add documentation.
* tree-vect-transform.c (vect_create_data_ref_ptr): Replace the unused BSI function argument with a new function argument - at_loop. Simplified the condition that determines STEP. Allow generating the optimized realignment scheme for outer-loop vectorization (only_init can now be TRUE also when vectorizing outer-loops).
(vect_create_addr_base_for_vector_ref): The function argument in_loop renamed to loop and is not optional anymore (for better readability). Updated the function documentation accordingly. Fixed the computation of the step.
(vectorizable_store): Call vect_create_data_ref_ptr with loop instead of bsi.
(vect_setup_realignment): Returns another value - at_loop. Allow generating the optimized realignment scheme for outer-loop vectorization. Added documentation.
(vectorizable_load): Allow using the optimized realignment scheme for outer-loop vectorization. Call vect_setup_realignment with additional argument at_loop. Call vect_create_data_ref_ptr with at_loop instead of bsi. Fix 80-column overflow. Rename PHI_STMT to PHI.
(vect_gen_niters_for_prolog_loop): Call vect_create_addr_base_for_vector_ref with additional argument loop.
(vect_create_cond_for_align_checks): Likewise.

(See attached file: misalignment_fix.july10)
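For readers unfamiliar with the two realignment schemes contrasted in the message, the following is a minimal scalar simulation in C. It is not code from the patch or from GCC: the names VF, vec_t, aligned_load, realign, copy_naive and copy_optimized are all hypothetical, and the "vector" is just a small array of ints. The sketch assumes the misalignment stays fixed for the whole loop, and that the source buffer is padded so the last aligned load stays in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VF 4 /* elements per simulated vector (an assumption for this sketch) */

typedef struct { int e[VF]; } vec_t;

/* Simulated aligned vector load; IDX must be a multiple of VF.
   LOADS counts how many vector loads were issued.  */
static vec_t aligned_load(const int *mem, size_t idx, unsigned *loads)
{
  vec_t v;
  assert(idx % VF == 0);
  memcpy(v.e, mem + idx, sizeof v.e);
  ++*loads;
  return v;
}

/* Simulated shuffle: take VF elements starting at offset MIS from the
   concatenation of LO and HI.  */
static vec_t realign(vec_t lo, vec_t hi, size_t mis)
{
  vec_t r;
  for (size_t i = 0; i < VF; i++)
    r.e[i] = (mis + i < VF) ? lo.e[mis + i] : hi.e[mis + i - VF];
  return r;
}

/* Unoptimized scheme: two aligned loads + a shuffle in every iteration.
   Copies NVEC vectors starting at the (possibly misaligned) element
   index START and returns the number of vector loads issued.  */
static unsigned copy_naive(const int *mem, size_t start, size_t nvec, int *out)
{
  unsigned loads = 0;
  size_t mis = start % VF, base = start - mis;
  for (size_t i = 0; i < nvec; i++) {
    vec_t lo = aligned_load(mem, base + i * VF, &loads);
    vec_t hi = aligned_load(mem, base + (i + 1) * VF, &loads);
    vec_t v = realign(lo, hi, mis);
    memcpy(out + i * VF, v.e, sizeof v.e);
  }
  return loads;
}

/* Optimized scheme: one extra load before the loop, then one load
   (+ shuffle) per iteration, reusing the previous iteration's load.  */
static unsigned copy_optimized(const int *mem, size_t start, size_t nvec,
                               int *out)
{
  unsigned loads = 0;
  size_t mis = start % VF, base = start - mis;
  vec_t prev = aligned_load(mem, base, &loads); /* done before the loop */
  for (size_t i = 0; i < nvec; i++) {
    vec_t next = aligned_load(mem, base + (i + 1) * VF, &loads);
    vec_t v = realign(prev, next, mis);
    memcpy(out + i * VF, v.e, sizeof v.e);
    prev = next; /* carried into the next iteration */
  }
  return loads;
}
```

In this simulation the unoptimized variant issues 2*N loads for N vectors and the optimized one issues N+1. The placement question raised in the message corresponds to where the initial `prev` load would go when such a loop is nested: before the inner-loop on every outer iteration, or once outside the outer-loop.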
Attachment:
misalignment_fix.july10
Description: Binary data