[PATCH, i386] Add prefixes avoidance tuning for silvermont target

Uros Bizjak ubizjak@gmail.com
Thu Jul 3 11:11:00 GMT 2014


On Thu, Jul 3, 2014 at 12:45 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:

>>> Silvermont processors have a penalty for instructions having 4+ bytes of prefixes (including escape
>>> bytes in opcode).  This situation happens when a REX prefix is used in SSE4 instructions.  This
>>> patch tries to avoid such a situation by preferring xmm0-xmm7 usage over xmm8-xmm15 in those
>>> instructions.  I achieved it by adding a new tuning flag and new alternatives affected by tuning.
>>
>>> SSE4 instructions are not very widely used by GCC but I see some significant gains caused by
>>> this patch (tested on Avoton on -O3).
>>
>>> 2014-07-02  Ilya Enkovich  <ilya.enkovich@intel.com>
>>
>>> * config/i386/constraints.md (Yr): New.
>>> * config/i386/i386.h (reg_class): Add NO_REX_SSE_REGS.
>>> (REG_CLASS_NAMES): Likewise.
>>> (REG_CLASS_CONTENTS): Likewise.
>>> * config/i386/sse.md (*vec_concatv2sf_sse4_1): Add alternatives
>>> which use only NO_REX_SSE_REGS.
>>
>> You don't need to add alternatives, just change existing alternatives
>> from "x" to "Yr". The allocator will handle reduced register set just
>> fine.
>
> Hi,
>
> Thanks for review!
>
> My first patch version did exactly that replacement. Performance results
> were OK but I ran into stability issues due to the peephole2 pass.  Peepholes
> may exchange operands of instructions and ignore register restrictions,
> assuming all SSE registers are homogeneous.  This caused unrecognized
> instructions on some tests.  I preferred to add a new alternative
> instead of fixing the peepholes and possibly other similar problems.

No, please fix the peephole2 patterns instead. It is just a matter of
adding satisfies_constraint_Xx checks to their insn conditions. In effect,
the peephole2 pass is nullifying your optimization. Also, RA is still free
to allocate unwanted registers, even when the alternative is prefixed
with "?".
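
To make the arithmetic concrete, here is a minimal C sketch. It is a
hypothetical model, not GCC code: the function names are invented for
illustration, and it only covers register operands of SSE4.1 instructions
in the 66 0F 38 / 66 0F 3A encodings (for example insertps, 66 0F 3A 21).
It counts the prefix and escape bytes and shows the kind of check a
peephole2 condition would have to make before exchanging operands.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: xmm0-xmm7 are encodable without a REX prefix,
   xmm8-xmm15 require one.  */
bool xmm_needs_rex(int regno)
{
    assert(regno >= 0 && regno <= 15);
    return regno >= 8;
}

/* Prefix + escape bytes for a register-register SSE4.1 instruction
   encoded as 66 [REX] 0F 38/3A opcode.  Memory operands with extended
   base or index registers would also need REX; that case is ignored.  */
int sse4_prefix_bytes(int reg_a, int reg_b)
{
    int bytes = 1   /* 66 mandatory prefix */
              + 2;  /* 0F 38 or 0F 3A escape bytes */
    if (xmm_needs_rex(reg_a) || xmm_needs_rex(reg_b))
        bytes += 1; /* REX prefix */
    return bytes;
}

/* Stand-in for a constraint check in a peephole2 condition: the
   transformation is allowed only if the resulting operands still
   avoid REX, so the tuning is not undone after register allocation.  */
bool swap_keeps_no_rex(int new_op0, int new_op1)
{
    return !xmm_needs_rex(new_op0) && !xmm_needs_rex(new_op1);
}
```

With this model, sse4_prefix_bytes(0, 7) gives 3 bytes, while
sse4_prefix_bytes(0, 8) reaches the 4-byte threshold mentioned above.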

Uros.


