[Bug target/92280] [10 regression] gcc.target/i386/pr83008.c FAILs

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Nov 5 10:21:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92280

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Richard Biener from comment #3)
> > That said, VN already computes the partial loads to { 148, _142, _145, _139 }
> > and would insert those CTORs in place of the loads, making the stores and
> > the AVX512 CTOR dead.  But that's obviously only profitable if the stores
> > and the CTOR end up being dead, otherwise we risk doing redundant
> > vector construction where cheap loads from memory would be possible.
> > The alternative way expressing it via sub-vector extraction is similarly
> > on the boundary of profitable plus we're happily simplifying that to a
> > redundant CTOR.
> 
> What about a rtl version pass_fre, after pass_expand it can be more certain
> to eliminate partial reloads.

Not sure what you are after: combine elides the loads as well, but
nothing on RTL then removes the dead store.

There's no classical pass doing "CSE if this stmt becomes dead" which
is what would be needed for optimality.

Something like SRA could analyze the accesses (here to 'tmp') and use
that to either transform the AVX512 CTOR to { v1, v2, v3, v4 } with
v1 = { _124, _143, _1245, _234 }, etc. (which incidentally is how we end
up constructing such a vector) or to split the store.
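
To illustrate the two shapes, here is a minimal sketch using GCC's generic
vector extensions. The function names, the input array and the use of memcpy
for the partial loads are assumptions made for this example and are not taken
from pr83008.c. It only shows that a wide CTOR stored to 'tmp' and reloaded in
four pieces computes the same values as four narrow CTORs built directly.

```c
#include <string.h>

/* GCC generic vector types: 4 x int (128 bit) and 16 x int (512 bit). */
typedef int v4si  __attribute__((vector_size(16)));
typedef int v16si __attribute__((vector_size(64)));

/* Shape before the transform: one wide CTOR is stored to 'tmp' and
   then read back with four partial (128-bit) loads.  */
void via_wide_store(const int *s, v4si out[4])
{
  v16si tmp = { s[0],  s[1],  s[2],  s[3],  s[4],  s[5],  s[6],  s[7],
                s[8],  s[9],  s[10], s[11], s[12], s[13], s[14], s[15] };
  for (int i = 0; i < 4; i++)
    /* Partial load of the i-th quarter of 'tmp'.  */
    memcpy(&out[i], (const char *)&tmp + i * sizeof(v4si), sizeof(v4si));
}

/* Shape after the transform: each partial load is replaced by its own
   narrow CTOR, so the wide CTOR and the store to 'tmp' are dead.  */
void via_split_ctors(const int *s, v4si out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (v4si){ s[4 * i], s[4 * i + 1], s[4 * i + 2], s[4 * i + 3] };
}
```

Whether the second form is actually cheaper depends on the wide CTOR and the
store really becoming dead, which is the profitability question from
comment #3.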

Anyway, I am testing a patch.

