[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

rth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Jun 12 18:55:00 GMT 2012


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #12 from Richard Henderson <rth at gcc dot gnu.org> 2012-06-12 18:54:24 UTC ---
(In reply to comment #10)
> But maybe allowing const_vector in (some of) the define_insn_and_split would
> be the way to go ...

Maybe.  It certainly would ease some of the simplifications.
At the moment I don't think we can go from

  mem -> const -> simplify -> const -> newmem

On the other hand, for this particular test case, where all
of the vector_cst elements are the same and have a reasonably
small number of bits set, it would be great to be able to
leverage synth_mult.
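
To illustrate why a constant with few set bits is attractive here, below
is a minimal scalar sketch of multiplying by a constant using only shifts
and adds, one add per set bit. It is not GCC's synth_mult, which is more
elaborate; the function name mul_by_shift_add is hypothetical and exists
only for this example.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: multiply x by the
   constant c using one shift and one add per set bit of c.
   The result is the product modulo 2^32, as for a 32-bit lane. */
uint32_t mul_by_shift_add(uint32_t x, uint32_t c)
{
    uint32_t acc = 0;

    for (unsigned bit = 0; bit < 32; bit++) {
        if (c & (UINT32_C(1) << bit))
            acc += x << bit;    /* add x * 2^bit */
    }
    return acc;
}
```

With c = 10 (binary 1010) this amounts to (x << 1) + (x << 3), so the
cost grows with the number of set bits in the constant.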

The main complexity for sse2_mulv4si3 is due to the fact that
we have to decompose the operation into V8HImode multiplies,
whereas if we decompose the multiply itself, the shifts and
adds stay in V4SImode.
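
As a rough scalar model of what decomposing into 16-bit multiplies means
for a single 32-bit lane, the sketch below builds the low 32 bits of a
product from 16x16-bit partial products. It only shows the arithmetic
identity, under the assumption that the decomposition works on 16-bit
halves as described above; it is not the instruction sequence GCC emits,
and the name mul32_from_16bit_parts is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane: the low 32 bits of
   a * b assembled from products of 16-bit halves. */
uint32_t mul32_from_16bit_parts(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Product of the low halves: fits in 32 bits. */
    uint32_t low = a_lo * b_lo;

    /* Cross terms: only their low 16 bits survive the shift below.
       The a_hi * b_hi term lies entirely above bit 31 and is dropped. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    return low + (cross << 16);
}
```

Three partial products plus the recombination are needed per lane in
this model, whereas a shift-and-add expansion of a constant multiplier
works on whole 32-bit elements throughout.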


