[PATCH] vectorizing conditional expressions (PR tree-optimization/65947)
Tue Sep 15 11:47:00 GMT 2015
On Fri, Sep 11, 2015 at 6:29 PM, Ramana Radhakrishnan
>>>> Saying that all reductions have equivalent performance is unlikely to be
>>>> true for many platforms. On PowerPC, for example, a PLUS reduction has
>>>> very different cost from a MAX reduction. If the model isn't
>>>> fine-grained enough, let's please be aggressive about fixing it. I'm
>>>> fine if it's a separate patch, but in my mind this shouldn't be allowed
>>>> to languish.
>>> ...I agree that the general vectoriser cost model could probably be
>>> improved, but it seems fairer for that improvement to be done by whoever
>>> adds the patterns that need it.
>> All right. But in response to Ramana's comment, are all relevant
>> reductions of similar cost on each ARM platform? Obviously they don't
>> have the same cost on different platforms, but the question is whether a
>> reduc_plus, reduc_max, etc., has identical cost on each individual
>> platform. If not, ARM may have a concern as well. I don't know the
>> situation for x86 either.
> From Cauldron I have a note that we need to look at the vectorizer cost model
> for both the ARM and AArch64 backends and move away from
> the set of magic constants that it currently returns.
> On AArch32, all the reduc_ patterns are emulated with pair-wise operations
> while on AArch64 they aren't. Thus they aren't likely to be the same cost as a
> standard vector arithmetic instruction. What difference this makes in practice
> remains to be seen; however, the first step is moving towards the newer
> vectorizer cost model interface.
> I'll put this on a list of things for us to look at, but I'm not sure who
> will get around to looking at this, or when.
Note that the target should be able to "see" the cond reduction via the
add_stmt_cost hook calls. It should see 'where' as the epilogue and
'kind' as a hint, and it can record the stmt_info, which should have
sufficient info for a good guess. At finish_cost () time the target can
compute a proper cost.

Yes, the vectorizer "IL" (the stmt-infos) isn't very powerful, and
especially for code not corresponding to existing gimple stmts it doesn't
even exist. We need to improve in that area to better represent the
desired transform.
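
To make the add_stmt_cost / finish_cost flow concrete, here is a minimal
sketch in C. It is a toy model, not GCC code: every toy_* name, the
simplified signatures, and the cost numbers are assumptions made up for
illustration. The idea is only that the hook records epilogue reductions
when it sees them and prices them per reduction kind once all statements
are known.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the vectorizer's enums; NOT GCC's real definitions.  */
enum toy_cost_kind { TOY_VECTOR_STMT, TOY_VEC_TO_SCALAR };
enum toy_cost_where { TOY_PROLOGUE, TOY_BODY, TOY_EPILOGUE };
enum toy_reduc_code { TOY_REDUC_NONE, TOY_REDUC_PLUS, TOY_REDUC_MAX };

/* Hypothetical stand-in for a stmt_info: only the reduction code.  */
struct toy_stmt_info { enum toy_reduc_code reduc; };

/* Per-loop cost state owned by the (imaginary) target.  */
struct toy_cost_data {
  unsigned cost[3];         /* Accumulated cost, indexed by toy_cost_where.  */
  unsigned epilogue_plus;   /* PLUS reductions recorded in the epilogue.  */
  unsigned epilogue_max;    /* MAX reductions recorded in the epilogue.  */
};

/* Made-up numbers for an imaginary target where MAX is dearer than PLUS.  */
#define TOY_GENERIC_COST    1u
#define TOY_REDUC_PLUS_COST 2u
#define TOY_REDUC_MAX_COST  5u

void toy_init_cost(struct toy_cost_data *data)
{
  assert(data != NULL);
  data->cost[TOY_PROLOGUE] = 0;
  data->cost[TOY_BODY] = 0;
  data->cost[TOY_EPILOGUE] = 0;
  data->epilogue_plus = 0;
  data->epilogue_max = 0;
}

/* Called once per costed statement.  An epilogue reduction is recorded
   rather than priced, so that toy_finish_cost can price it by kind.
   Returns the cost charged immediately (0 when pricing is deferred).  */
unsigned toy_add_stmt_cost(struct toy_cost_data *data, unsigned count,
                           enum toy_cost_kind kind,
                           const struct toy_stmt_info *info,
                           enum toy_cost_where where)
{
  assert(data != NULL);
  if (where == TOY_EPILOGUE && kind == TOY_VEC_TO_SCALAR
      && info != NULL && info->reduc != TOY_REDUC_NONE)
    {
      if (info->reduc == TOY_REDUC_PLUS)
        data->epilogue_plus += count;
      else
        data->epilogue_max += count;
      return 0;
    }

  unsigned charged = count * TOY_GENERIC_COST;
  data->cost[where] += charged;
  return charged;
}

/* Called after all statements were seen: fold the recorded reductions
   into the epilogue cost using the per-kind numbers.  */
void toy_finish_cost(const struct toy_cost_data *data, unsigned *prologue,
                     unsigned *body, unsigned *epilogue)
{
  assert(data != NULL);
  *prologue = data->cost[TOY_PROLOGUE];
  *body = data->cost[TOY_BODY];
  *epilogue = data->cost[TOY_EPILOGUE]
              + data->epilogue_plus * TOY_REDUC_PLUS_COST
              + data->epilogue_max * TOY_REDUC_MAX_COST;
}
```

Deferring the pricing to the finish step is what lets such a target give
a PLUS reduction and a MAX reduction different costs even if the
vectorizer passes the same 'kind' for both.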