This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

--- Comment #13 from Jan Hubicka <hubicka at ucw dot cz> ---
> So is this option still helping with the latest microcode? Not in this case at
> least.

It is on my TODO list to re-benchmark 256-bit vectorization for Zen.  I do not
think microcode makes a big difference here.  Using 256-bit vectors has the
advantage of exposing more parallelism but also the disadvantage of requiring
more involved setup.  So for loops that vectorize naturally (like matrix
multiplication) it can be a win, while for loops that are difficult to
vectorize it is a loss.  I think the early benchmarks did not look consistent,
and that is why the 128-bit mode was introduced.
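
To make the two kinds of loops concrete, here is a minimal sketch in C.  The
function names matmul and strided_scale are made up for illustration and do
not come from any benchmark discussed in this bug.  The first loop nest has a
unit-stride inner loop with independent iterations, so wider vectors mostly
just do more work per instruction.  The second reads with a non-unit stride,
so each vector has to be assembled from scattered elements, and that setup
work tends to grow with the vector width.

```c
#include <stddef.h>

/* Vectorizes naturally: the inner j loop is unit-stride in both b and c,
   and its iterations are independent of each other.  */
void matmul(size_t n, const float *restrict a, const float *restrict b,
            float *restrict c)
{
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0f;

    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            float aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
}

/* Harder to vectorize profitably: the loads from x are strided, so a
   vector of inputs must be built from separate elements before the
   multiply can be done.  */
void strided_scale(size_t count, size_t stride, float scale,
                   const float *restrict x, float *restrict out)
{
    for (size_t i = 0; i < count; i++)
        out[i] = x[i * stride] * scale;
}
```

One way to compare the two widths, assuming a GCC new enough to provide
-mprefer-vector-width, is to build the file twice with -O3 -march=znver1 and
-mprefer-vector-width=128 or -mprefer-vector-width=256, adding
-fopt-info-vec to see which loops were vectorized.  Whether 256 bits actually
wins for either loop is exactly the thing that needs measuring.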

It is not that different from vectorizing for K8, which had split SSE registers
in a similar fashion, or for kabylake, which splits 512-bit operations.

While rewriting the cost model I tried to keep this in mind and to model the
split operations more accurately, so it may be possible to switch to 256 by
default.

Ideally the vectorizer should decide whether 128 or 256 bits is a win for a
particular loop, but it does not seem to have the infrastructure to do so.
My plan is to split the current flag into two, "prefer 128-bit" and "assume
that registers are internally split", and see if that is enough to get a
consistent win for 256-bit vectorization.

Richi may know better.

Honza
