[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

linkw at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Fri May 28 07:29:44 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794

--- Comment #2 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)

Thanks for the comments!

> There's predictive commoning which can do similar transforms and runs after
> vectorization.  It might be it doesn't handle these "simple" cases or that
> loop dependence info is not up to the task there.
> 

pcom does fix this problem, but it's only enabled by default at -O3. Could we
consider running it at -O2? Or enabling it at -O2 under certain conditions,
for example only for a loop where the loop-carried optimization was skipped
and which then isn't vectorized?
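
For illustration, here is a minimal sketch of the kind of loop-carried reuse
under discussion. It is a hypothetical example, not the reduced test case from
this PR, and the function names are made up. The first function reloads in
each iteration a value it already loaded in the previous one; the second is a
hand-written version of what loop-carried PRE or predictive commoning would
be expected to produce.

```c
#include <stddef.h>

/* Hypothetical example: a[i + 1] loaded in iteration i is loaded
   again as a[i] in iteration i + 1.  */
void diff_naive(const double *restrict a, double *restrict out, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        out[i] = a[i + 1] - a[i];
}

/* Hand-written equivalent with the reused value carried across
   iterations in a scalar, so each element of a is loaded once.  */
void diff_carried(const double *restrict a, double *restrict out, size_t n)
{
    if (n < 2)
        return;

    double prev = a[0];
    for (size_t i = 0; i + 1 < n; i++) {
        double next = a[i + 1];   /* the only load in the loop body */
        out[i] = next - prev;
        prev = next;              /* carried into the next iteration */
    }
}
```

One way to check a loop like this would be to compare the code generated with
-O2 against -O2 -fpredictive-commoning; whether this particular sketch shows
the same behaviour as the loops in the PR is an assumption.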

> Another option is to avoid the PRE guard with the (very) cheap cost model
> at the expense of not vectorizing affected loops.
> 

OK, I will benchmark this to see its impact. The particular loops in
554.roms_r can be vectorized with the cheap cost model, and this benchmark
improved with the cheap cost model on both Power8 and Power9 (only slightly,
though). So I will just test the impact with the very cheap cost model.

