This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug tree-optimization/52252] An opportunity for x86 gcc vectorizer (gain up to 3 times)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52252

--- Comment #2 from Stupachenko Evgeny <evstupac at gmail dot com> 2012-02-29 12:32:20 UTC ---
The difference of 2 dumps from

Arm: gcc -O3 -mfpu=neon test.c -S -ftree-vectorizer-verbose=12
X86: gcc -O3 -m32 -msse3 test.c -S -ftree-vectorizer-verbose=12

Starts at:

For Arm (can use vec_load_lanes):

6: === vect_make_slp_decision === 
6: === vect_detect_hybrid_slp ===
6: === vect_analyze_loop_operations ===
6: examining phi: in_35 = PHI <in_22(7), in_5(D)(4)>

ââ

6: can use vec_load_lanes<CI><V16QI> 
6: vect_model_load_cost: unaligned supported by hardware. 
6: vect_model_load_cost: inside_cost = 2, outside_cost = 0 .

For x86 (no array mode for V16QI[3]):

6: === vect_make_slp_decision === 
6: === vect_detect_hybrid_slp === 
6: === vect_analyze_loop_operations === 
6: examining phi: in_35 = PHI <in_22(7), in_5(D)(4)> 

.ââ

6: no array mode for V16QI[3] 
6: the size of the group of strided accesses is not a power of 2 
6: not vectorized: relevant stmt not supported: r_8 = *in_35; 

As I mentioned before, there is an ability for x86 to handle this (Arm can
shuffle than loads, x86 can use pshufb).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]