Bug 109964 - auto-vectorization of shift ignores integral promotions
Summary: auto-vectorization of shift ignores integral promotions
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 13.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2023-05-25 10:49 UTC by Matthias Kretz (Vir)
Modified: 2024-04-15 13:42 UTC (History)
3 users (show)

See Also:
Host:
Target: powerpc64le-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-05-25 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Matthias Kretz (Vir) 2023-05-25 10:49:49 UTC
Test cases:
short: https://godbolt.org/z/oM56Pd19o
char: https://godbolt.org/z/f5Tc6TMEd

The POWER vector shifts read only 3/4 bits from the rhs vector. Thus, shifts that are valid on int (i.e. >= 8 or >= 16) are translated incorrectly when using the vector shift instructions. If the rhs values are not known to be bounded a fix-up is necessary.
Comment 1 Richard Biener 2023-05-25 11:15:42 UTC
using V [[gnu::vector_size(16)]] = short;

V f(V s)
{
  return V{
    short(1) >> s[0],
    short(1) >> s[1],
    short(1) >> s[2],
    short(1) >> s[3],
    short(1) >> s[4],
    short(1) >> s[5],
    short(1) >> s[6],
    short(1) >> s[7]
    };
}

the frontend gives us

return <retval> = {(short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[0]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[1]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[2]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[3]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[4]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[5]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[6]), (short int) (1 >> (int) VIEW_CONVERT_EXPR<short int[8]>(s)[7])};

and the promotion/demotion stays until eventually basic-block vectorization
vectorizes this without doing the promotion.

t.ii:14:5: note:   vect_recog_over_widening_pattern: detected: _3 = 1 >> _2;
t.ii:14:5: note:   demoting int to signed short
t.ii:14:5: note:   created pattern stmt: patt_36 = 1 >> _1;

I don't think there's any way to query the target about whether vector
shifts "behave".

We end up with

V f (V s)
{
  vector(8) signed short vect_patt_36.3;

  <bb 2> [local count: 1073741824]:
  vect_patt_36.3_62 = { 1, 1, 1, 1, 1, 1, 1, 1 } >> s_33(D);
  return vect_patt_36.3_62;
Comment 2 Richard Biener 2023-05-25 11:17:49 UTC
The simplest thing would be to make this a target bug for their vec_shr expander not fulfilling the (undocumented) semantics.
Comment 3 Richard Biener 2023-05-25 11:45:51 UTC
So the bug in the vectorizer is that it does

t.ii:14:5: note:   can narrow to signed:16 without loss of precision: _31 = 1 >> _30;
t.ii:14:5: note:   only the low 16 bits of _30 are significant

while that's correct that's not the proper constraint to check for in
vect_recog_over_widening_pattern I think.  That has code to deal with
overflow for plus/minus/mult but no defense against shifts
(that's also not a vect_truncatable_operation_p, so maybe it should simply
check that)
Comment 4 Richard Sandiford 2023-05-25 13:01:40 UTC
(In reply to Richard Biener from comment #3)
> So the bug in the vectorizer is that it does
> 
> t.ii:14:5: note:   can narrow to signed:16 without loss of precision: _31 =
> 1 >> _30;
> t.ii:14:5: note:   only the low 16 bits of _30 are significant
> 
> while that's correct that's not the proper constraint to check for in
> vect_recog_over_widening_pattern I think.  That has code to deal with
> overflow for plus/minus/mult but no defense against shifts
> (that's also not a vect_truncatable_operation_p, so maybe it should simply
> check that)
Like you say, vect_truncatable_operation_p already checks for operations
for which truncation is distributive.  But the function is supposed
to handle other cases too.  E.g. I think the current code is correct
for division.

But yeah, right shifts are an awkward case, because the defined
range of the second operand is much smaller than the range of the
type, and yet the defined range depends on the range of the type.
Sorry for not thinking about that at the time.
Comment 5 Richard Biener 2023-07-28 12:05:05 UTC
This now hits PR110838.
Comment 6 Richard Biener 2024-04-15 13:28:16 UTC
I think this has been fixed now?
Comment 7 Matthias Kretz (Vir) 2024-04-15 13:40:49 UTC
looks good to me
Comment 8 Richard Biener 2024-04-15 13:42:01 UTC
Fixed.