[PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

Richard Biener richard.guenther@gmail.com
Tue Sep 15 11:47:00 GMT 2015


On Fri, Sep 11, 2015 at 6:29 PM, Ramana Radhakrishnan
<ramana.radhakrishnan@foss.arm.com> wrote:
>
>>>> Saying that all reductions have equivalent performance is unlikely to be
>>>> true for many platforms.  On PowerPC, for example, a PLUS reduction has
>>>> very different cost from a MAX reduction.  If the model isn't
>>>> fine-grained enough, let's please be aggressive about fixing it.  I'm
>>>> fine if it's a separate patch, but in my mind this shouldn't be allowed
>>>> to languish.
>>>
>>> ...I agree that the general vectoriser cost model could probably be
>>> improved, but it seems fairer for that improvement to be done by whoever
>>> adds the patterns that need it.
>>
>> All right.  But in response to Ramana's comment, are all relevant
>> reductions of similar cost on each ARM platform?  Obviously they don't
>> have the same cost on different platforms, but the question is whether a
>> reduc_plus, reduc_max, etc., has identical cost on each individual
>> platform.  If not, ARM may have a concern as well.  I don't know the
>> situation for x86 either.
>
> From cauldron I have a note that we need to look at the vectorizer cost model
> for both the ARM and AArch64 backends and move away from
> the set of magic constants that it currently returns.

Indeed.

> On AArch32, all the reduc_ patterns are emulated with pair-wise operations
> while on AArch64 they aren't. Thus they aren't likely to be the same cost as a
> standard vector arithmetic instruction. What difference this makes in practice
> remains to be seen, however the first step is moving towards the newer vectorizer
> cost model interface.
>
> I'll put this on a list of things for us to look at but I'm not sure who/when
> will get around to looking at this.

Note that the target should be able to "see" the cond reduction via the
add_stmt_cost hook calls.  It should see 'where' set to the epilogue and
'kind' as a hint, and it can record the stmt_info, which should carry
sufficient info for a good guess.  At finish_cost () time the target can
then compute a proper overall cost.
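
To illustrate the record-then-price idea, here is a minimal standalone
sketch in C++.  It is not the actual GCC hook interface: the types
CostLocation, StmtKind, ReductionOp, RecordedStmt, TargetCostData and
FinalCost are hypothetical stand-ins, the 'reduction' field stands in for
whatever the target would dig out of stmt_info, and the per-operation
costs are made-up numbers.

```cpp
#include <vector>

// Hypothetical stand-ins for the vectorizer's 'where' and 'kind' arguments.
enum class CostLocation { Prologue, Body, Epilogue };
enum class StmtKind { VectorStmt, VecToScalar, Other };

// Hypothetical summary of what a target might learn from stmt_info.
enum class ReductionOp { None, Plus, Max, Cond };

struct RecordedStmt {
  int count;
  StmtKind kind;
  ReductionOp reduction;
  CostLocation where;
};

// Per-loop cost data owned by the target (assumed layout).
struct TargetCostData {
  std::vector<RecordedStmt> stmts;
};

struct FinalCost {
  unsigned prologue = 0;
  unsigned body = 0;
  unsigned epilogue = 0;
};

// Record what the vectorizer reports; pricing is deferred to finish_cost.
void add_stmt_cost(TargetCostData &data, int count, StmtKind kind,
                   ReductionOp reduction, CostLocation where) {
  data.stmts.push_back({count, kind, reduction, where});
}

// Made-up per-operation epilogue costs; a real target would pick its own.
unsigned epilogue_reduction_cost(ReductionOp op) {
  switch (op) {
    case ReductionOp::Plus: return 2;
    case ReductionOp::Max:  return 4;
    case ReductionOp::Cond: return 6;
    default:                return 1;
  }
}

// With every statement recorded, compute an overall cost per location.
FinalCost finish_cost(const TargetCostData &data) {
  FinalCost total;
  for (const RecordedStmt &s : data.stmts) {
    unsigned unit = 1;  // default cost for anything not special-cased
    // 'where' == Epilogue plus 'kind' as a hint identifies a reduction
    // epilogue; the recorded operation then selects the price.
    if (s.where == CostLocation::Epilogue &&
        s.kind == StmtKind::VecToScalar &&
        s.reduction != ReductionOp::None)
      unit = epilogue_reduction_cost(s.reduction);
    unsigned cost = unit * static_cast<unsigned>(s.count);
    switch (s.where) {
      case CostLocation::Prologue: total.prologue += cost; break;
      case CostLocation::Body:     total.body += cost;     break;
      case CostLocation::Epilogue: total.epilogue += cost; break;
    }
  }
  return total;
}
```

The point of deferring the pricing is that a PLUS, MAX or cond reduction
epilogue can each get a different cost once the whole set of statements
is known, instead of one constant per 'kind'.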

Yes, the vectorizer "IL" (the stmt-infos) isn't very powerful, and
especially for code that doesn't correspond to existing gimple stmts it
doesn't even exist.  We need to improve in that area to better represent
the desired transform.

Richard.

> regards
> Ramana


