[Bug target/108583] [13 Regression] wrong code with vector division by uint16 at -O2

rguenther at suse dot de gcc-bugzilla@gcc.gnu.org
Tue Jan 31 14:45:26 GMT 2023


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583

--- Comment #17 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 31 Jan 2023, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
> 
> --- Comment #15 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> > OK, hopefully I understand now.  Sorry for being slow.
> 
> Not at all, Sorry if it came across a bit cranky, it wasn't meant that way!
> 
> > If that's the condition we want to test for, it seems like something
> > we need to check in the vectoriser rather than the hook.  And it's
> > not something we can easily do in the vector form, since we don't
> > track ranges for vectors (AFAIK).
> 
> Ack, that also tracks with what I tried before, we don't indeed track ranges
> for vector ops. The general case can still be handled slightly better (I think)
> but it doesn't become as clear of a win as this one.
> 
> > You probably did so elsewhere some time ago, but what exactly are those
> > four instructions?  (pointers to specifications appreciated)
> 
> For NEON we use:
> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/ADDHN--ADDHN2--Add-returning-High-Narrow-

so that's an add + pack high

> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/UADDW--UADDW2--Unsigned-Add-Wide-

and that unpacks (zero-extends) the high/low part of one operand of an add

I wonder whether, if we open-coded the pack / unpack and used a regular
add, combine could synthesize uaddw and addhn?  The pack and unpack would
be vec_perms on GIMPLE (plus V_C_E).
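
To make the decomposition concrete, here is a minimal scalar sketch of
one lane of each operation for 16-bit wide / 8-bit narrow elements,
next to the open-coded form (unpack, regular add, pack high).  The
function names are hypothetical, chosen for illustration only; they are
not GCC internals or ACLE intrinsics, and the real instructions operate
on whole vectors.

```c
#include <stdint.h>

/* Hypothetical helper: "pack high" keeps the top 8 bits of a 16-bit lane. */
static uint8_t pack_high(uint16_t x)
{
    return (uint8_t)(x >> 8);
}

/* Hypothetical helper: "unpack" zero-extends an 8-bit lane to 16 bits. */
static uint16_t unpack_zext(uint8_t x)
{
    return (uint16_t)x;
}

/* One lane of ADDHN: add modulo 2^16, then return the high half. */
static uint8_t addhn_lane(uint16_t a, uint16_t b)
{
    uint16_t sum = (uint16_t)(a + b);
    return (uint8_t)(sum >> 8);
}

/* One lane of UADDW: zero-extend the narrow operand, add modulo 2^16. */
static uint16_t uaddw_lane(uint16_t wide, uint8_t narrow)
{
    return (uint16_t)(wide + (uint16_t)narrow);
}

/* Open-coded equivalents: a regular add plus an explicit pack / unpack. */
static uint8_t addhn_open_coded(uint16_t a, uint16_t b)
{
    return pack_high((uint16_t)(a + b));
}

static uint16_t uaddw_open_coded(uint16_t wide, uint8_t narrow)
{
    return (uint16_t)(wide + unpack_zext(narrow));
}
```

Per lane the two forms compute the same value; the open question is
whether the separate pack / unpack survives as extra instructions on
targets without the fused adds.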

> In that order, and for SVE we use two
> https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/ADDHNB--Add-narrow-high-part--bottom--

probably similar.

So the difficulty here will be to decide whether that's in the end
better than what the pattern handling code does now, right?  Because
I think most targets will be able to do the above, but on those lacking
the special adds it will be slower because of the extra packing/unpacking?

That said, can we possibly do just that costing (would be a first in
the pattern code I guess) with a target hook?  Or add optabs for
the addh operations so we can query support?

