This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: combine permutations in gimple


[It looks like I missed hitting the send button on this response]

>
> Seems to be one instruction shorter at least ;-) Yes, there can be much
> worse regressions than that because of the patch (like 40 instructions
> instead of 4, in the x86 backend).

If this is replacing 4 instructions with 40 in the x86 backend, maybe
someone will notice :)

It is not a win in this particular testcase because the compiler
replaces 2 constant permutes (that's about 4 cycles) with a load from
the constant pool and a generic permute, and in addition pollutes the
icache with guff in the constant pool. If you fold 3-4 permutes into a
single one then it might be a win, but not until that point.
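
For reference, this is what folding two constant permutes into one
amounts to. It is only a minimal sketch in plain C, assuming
single-input permutes on a 4-element vector; the helper names `permute`
and `compose_masks` are made up for illustration and are not GCC
internals.

```c
#include <assert.h>
#include <stddef.h>

/* Apply a single-input permute: out[i] = in[mask[i]]. */
static void permute(const int *in, const unsigned *mask, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(mask[i] < n);          /* mask entries must index into in[] */
        out[i] = in[mask[i]];
    }
}

/* Fold "permute by first, then permute by second" into a single mask.
   The two-step result is t[second[i]] with t[j] = in[first[j]], so the
   combined mask is combined[i] = first[second[i]]. */
static void compose_masks(const unsigned *first, const unsigned *second,
                          unsigned *combined, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(second[i] < n);
        combined[i] = first[second[i]];
    }
}
```

With first = {1, 0, 3, 2} and second = {2, 3, 0, 1} the combined mask is
{3, 2, 1, 0}. Whether that single mask is cheaper than the two originals
depends on whether the target has an instruction for it, which is the
point above.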


> with a-b without first asking the backend whether it might be more
> efficient. One permutation is better than 2.
>  It just happens that the range
> of possible permutations is too large (and the basic instructions are too
> strange) for backends to do a good job on them, and thus keeping toplevel
> input as a hint is important.

Of course, the problem here is the change in semantics of the hook
TARGET_VEC_PERM_CONST_OK. Targets were expanding to generic permutes
with constants in the *absence* of being able to deal with them with
the specialized permutes. fwprop will now leave us at a point where
each target has to grow more knowledge about how best to expand a
generic constant permute into a sequence of permutes, rather than just
using the generic permute.
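
To make the alternative concrete, here is a toy sketch in plain C of a
fold that only fires when the target says the combined mask is cheap.
Everything in it is an assumption for illustration: `toy_target_cheap_mask`
is a hypothetical stand-in for a target query (it is not the signature
of TARGET_VEC_PERM_CONST_OK), and the "target" here only knows the
identity and a full reversal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a target query: this toy target only has
   cheap instructions for the identity and for a full reversal. */
static bool toy_target_cheap_mask(const unsigned *mask, size_t n)
{
    bool identity = true, reversal = true;
    for (size_t i = 0; i < n; i++) {
        if (mask[i] != i) identity = false;
        if (mask[i] != n - 1 - i) reversal = false;
    }
    return identity || reversal;
}

/* Compute the mask for "permute by first, then by second" and report
   whether the toy target handles it cheaply. On a false return the
   caller keeps the two original permutes; combined[] has still been
   written but should be ignored. */
static bool try_fold(const unsigned *first, const unsigned *second,
                     unsigned *combined, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(second[i] < n && first[second[i]] < n);
        combined[i] = first[second[i]];
    }
    return toy_target_cheap_mask(combined, n);
}
```

With first = {1, 0, 3, 2}, a second mask of {2, 3, 0, 1} folds to the
reversal {3, 2, 1, 0} and is accepted, while a second mask of
{0, 2, 1, 3} folds to {1, 3, 0, 2} and is rejected, so the two original
permutes would be kept.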

Generating a sequence of permutes from a single constant permute will
be a harder problem than (say) dealing with immediate expansions, so
you are pushing more complexity into the target. In the short term,
target maintainers should definitely get a heads-up that folks using
vector permute intrinsics could see performance regressions on their
targets.

Thanks,
Ramana

