[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark
rth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Jun 12 18:55:00 GMT 2012
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533
--- Comment #12 from Richard Henderson <rth at gcc dot gnu.org> 2012-06-12 18:54:24 UTC ---
(In reply to comment #10)
> But maybe allowing const_vector in (some of) the define_insn_and_split would
> be the way to go ...
Maybe. It certainly would ease some of the simplifications.
At the moment I don't think we can go from
  mem -> const -> simplify -> const -> newmem
On the other hand, for this particular test case, where all
of the vector_cst elements are the same and have a reasonably
small number of bits set, it would be great to be able to
leverage synth_mult.
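
To illustrate the idea, here is a minimal scalar sketch of multiplying by a
constant with only shifts and adds, one per set bit of the constant. It is not
GCC's synth_mult, which also considers subtractions and factoring to find a
cheaper sequence. The function names mul_shift_add and mul_v4si_by_const are
hypothetical, and the four-element array merely stands in for a V4SImode vector.

```c
#include <stdint.h>

/* Multiply x by the constant c using only shifts and adds: one
   shift-and-add per set bit of c.  Arithmetic wraps modulo 2^32.
   (Illustrative only; not the algorithm GCC's synth_mult uses.) */
static uint32_t mul_shift_add(uint32_t x, uint32_t c)
{
    uint32_t acc = 0;
    for (unsigned bit = 0; bit < 32; bit++)
        if (c & (UINT32_C(1) << bit))
            acc += x << bit;
    return acc;
}

/* Apply the same constant to each of four 32-bit lanes, standing in
   for a V4SImode multiply by a vector whose elements are all equal. */
static void mul_v4si_by_const(uint32_t out[4], const uint32_t in[4],
                              uint32_t c)
{
    for (int i = 0; i < 4; i++)
        out[i] = mul_shift_add(in[i], c);
}
```

The fewer bits set in the constant, the fewer shift/add pairs are needed, which
is why a constant with a small population count is the attractive case.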
The main complexity for sse2_mulv4si3 is due to the fact that
we have to decompose the operation into V8HImode multiplies,
whereas if we decompose the multiply into shifts and adds,
those all stay in V4SImode.
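
As a rough scalar illustration of why building a 32-bit multiply out of 16-bit
pieces is awkward, the sketch below computes the low 32 bits of a product from
16x16 partial products. It shows only the arithmetic identity, not the
instruction sequence the sse2_mulv4si3 pattern actually emits; the function
name mul32_via_16 is hypothetical.

```c
#include <stdint.h>

/* Low 32 bits of a * b, using only products of 16-bit halves.
   With a = ah*2^16 + al and b = bh*2^16 + bl:
     a*b mod 2^32 = al*bl + ((al*bh + ah*bl) << 16)
   The ah*bh term is shifted out entirely and can be dropped. */
static uint32_t mul32_via_16(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t low   = al * bl;            /* needs the full 32-bit product */
    uint32_t cross = al * bh + ah * bl;  /* only its low 16 bits survive  */

    return low + (cross << 16);
}
```

Three partial products plus the recombining shifts and adds are needed per
element, and all of that would have to be expressed across the narrower lanes.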