[aarch64][vect] Support V8QI->V8HI WIDEN_ patterns
Joel Hutton
Joel.Hutton@arm.com
Wed Feb 10 14:42:05 GMT 2021
Thanks for the quick review.
Updated patch attached. I've addressed your comments below.
Tests are still running, OK for trunk assuming tests come out clean?
[aarch64][vect] Support V8QI->V8HI WIDEN_ patterns
In the case where 8 out of every 16 elements are widened using a
widening pattern and the next 8 are skipped the patterns are not
recognized. This is because they are normally used in a pair, such as
VEC_WIDEN_MINUS_HI/LO, to achieve a v16qi->v16hi conversion for example.
This patch adds support for V8QI->V8HI patterns.
gcc/ChangeLog:
PR tree-optimisation/98772
* optabs-tree.c (supportable_half_widening_operation): New function
to check for supportable V8QI->V8HI widening patterns.
* optabs-tree.h (supportable_half_widening_operation): New function.
* tree-vect-stmts.c (vect_create_vectorized_promotion_stmts): New
function to create promotion stmts for V8QI->V8HI widening patterns.
(vectorizable_conversion): Add case for V8QI->V8HI.
gcc/testsuite/ChangeLog:
PR tree-optimisation/98772
* gcc.target/aarch64/pr98772.c: New test.
>> + /* The case where a widening operation is not making use of the full width of
>> + of the input vector, but using the full width of the output vector.
>> + Return the non-wided code, which will be used after the inputs are
>
>non-widened
Done.
>> + converted to the wide type. */
>> + if ((code == WIDEN_MINUS_EXPR
>> + || code == WIDEN_PLUS_EXPR
>> + || code == WIDEN_MULT_EXPR
>> + || code == WIDEN_LSHIFT_EXPR)
>
>Minor formatting nit, but the ||s should be indented one space more.
Done.
>> + && known_eq (TYPE_VECTOR_SUBPARTS (vectype_in),
>> + TYPE_VECTOR_SUBPARTS (vectype_out)))
>> + {
>> + switch (code)
>> + {
>> + case WIDEN_LSHIFT_EXPR:
>> + *code1 = LSHIFT_EXPR;
>> + return true;
>> + break;
>> + case WIDEN_MINUS_EXPR:
>> + *code1 = MINUS_EXPR;
>> + return true;
>> + break;
>> + case WIDEN_PLUS_EXPR:
>> + *code1 = PLUS_EXPR;
>> + return true;
>> + break;
>> + case WIDEN_MULT_EXPR:
>> + *code1 = MULT_EXPR;
>> + return true;
>> + break;
>> + default:
>> + gcc_unreachable ();
>> + }
>> + }
>
>Rather than return true, I think we should do something like:
>
> if (!supportable_convert_operation (NOP_EXPR, vectype_out,
> vectype_in, &dummy_code))
> return false;
>
> optab = optab_for_tree_code (*code1, vectype_out, optab_default);
> return (optab_handler (optab, TYPE_MODE (vectype_out))
> != CODE_FOR_nothing);
>
>to make sure that the target really does support this.
Done. I used 'optab_vector' not 'optab_default', as I thought that was correct for this case and otherwise 'optab_for_tree_code' fails an assertion when 'LSHIFT_EXPR' is used.
> AFAICT the caller always knows when it wants the “if” statement above
> to be used. What it's doing is a bit different from what
> supportable_convert_operation normally does, so it might be better
> to put it into a separate function that tests whether the target
> supports the non-widening form of a widening operation.
Done.
>> +
>> + vec_tmp.create (vec_oprnds0->length () * 2);
>
>It looks like the * 2 isn't needed.
Done.
>> + if (is_gimple_call (new_stmt3))
>> + {
>> + new_tmp = gimple_call_lhs (new_stmt3);
>> + }
>> + else
>> + {
>> + new_tmp = gimple_assign_lhs (new_stmt3);
>> + }
>
>The lhs is always new_tmp3, so it's not necessary to read it back.
Done.
>> +
>> + /* Store the results for the next step. */
>> + vec_tmp.quick_push (new_tmp);
>
>FWIW, you could just assign to vec_oprnds[i] and not have vec_tmp,
>but I don't know whether that's more or less confusing. Either way's
>fine with me.
I chose to keep vec_tmp, but I don't feel strongly about it.
>> + }
>> +
>> + vec_oprnds0->release ();
>> + *vec_oprnds0 = vec_tmp;
>> +}
>> +
>>
>> /* Check if STMT_INFO performs a conversion operation that can be vectorized.
>> If VEC_STMT is also passed, vectorize STMT_INFO: create a vectorized
>> @@ -4697,7 +4763,13 @@ vectorizable_conversion (vec_info *vinfo,
>> nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
>> nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
>> if (known_eq (nunits_out, nunits_in))
>> - modifier = NONE;
>> + if (code == WIDEN_MINUS_EXPR
>> + || code == WIDEN_PLUS_EXPR
>> + || code == WIDEN_LSHIFT_EXPR
>> + || code == WIDEN_MULT_EXPR)
>> + modifier = WIDEN;
>
>Formatting nit: the last line should be indented by 6 spaces rather than 8.
Done.
>> @@ -4743,9 +4815,21 @@ vectorizable_conversion (vec_info *vinfo,
>> return false;
>>
>> case WIDEN:
>> - if (supportable_widening_operation (vinfo, code, stmt_info, vectype_out,
>> - vectype_in, &code1, &code2,
>> - &multi_step_cvt, &interm_types))
>> + if (known_eq (nunits_out, nunits_in)
>> + && (code == WIDEN_MINUS_EXPR
>> + || code == WIDEN_LSHIFT_EXPR
>> + || code == WIDEN_PLUS_EXPR
>> + || code == WIDEN_MULT_EXPR)
>> + && supportable_convert_operation (code, vectype_out, vectype_in,
>> + &code1))
>
>Guess this is personal taste, sorry, since it's clearly right both ways,
>but IMO it'd be better to drop the code test. We can only get here
>with nunits_out==nunits_in if we're converting a widening operation into
>a non-widening operation. If we do end up calling a separate function
>(as per the comment above), then it would abort in a meaningful place
>if something unexpected slips through.
Done.
>> + {
>> + gcc_assert (!(multi_step_cvt && op_type == binary_op));
>> + break;
>> + }
>> + else if (supportable_widening_operation (vinfo, code, stmt_info,
>> + vectype_out, vectype_in, &code1,
>> + &code2, &multi_step_cvt,
>> + &interm_types))
>> {
>> /* Binary widening operation can only be supported directly by the
>> architecture. */
>> @@ -4981,10 +5065,20 @@ vectorizable_conversion (vec_info *vinfo,
>> c1 = codecvt1;
>> c2 = codecvt2;
>> }
>> - vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
>> - &vec_oprnds1, stmt_info,
>> - this_dest, gsi,
>> - c1, c2, op_type);
>> + if ((code == WIDEN_MINUS_EXPR
>> + || code == WIDEN_PLUS_EXPR
>> + || code == WIDEN_LSHIFT_EXPR
>> + || code == WIDEN_MULT_EXPR)
>> + && known_eq (nunits_in, nunits_out))
>
>Same comment here about dropping the code tests.
Done.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-aarch64-vect-Support-V8QI-V8HI-WIDEN_-patterns.patch
Type: text/x-patch
Size: 12651 bytes
Desc: 0001-aarch64-vect-Support-V8QI-V8HI-WIDEN_-patterns.patch
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20210210/6de36f0f/attachment-0001.bin>
More information about the Gcc-patches
mailing list