[PATCH] fix hardreg_cprop to honor HARD_REGNO_MODE_OK.

Jeff Law law@redhat.com
Thu Sep 25 19:14:00 GMT 2014


On 09/01/14 04:29, Ilya Tocar wrote:
>>>
>>> AVX512 added 16 new xmm registers (xmm16-xmm31).
>>> Those registers require EVEX encoding.
>>> Only the 512-bit wide versions of instructions have an EVEX encoding
>>> with avx512f, but all versions have it with avx512vl.
>>> Most instructions share one macroized pattern for the 128/256/512
>>> vector lengths. They all use constraint 'v', which corresponds to
>>> class ALL_SSE_REGS (xmm0 - xmm31). To disallow e.g. xmm20 in the
>>> 256-bit case (avx512f) and allow it only in the avx512vl case, we have
>>> HARD_REGNO_MODE_OK check whether regno is EVEX-only and
>>> disallow it if the mode is not 512-bit.
>> Generally this kind of thing has been handled by splitting the register
>> class into two classes.  I strongly suspect there are numerous places where
>> we assume that two regs in the same class are interchangeable.
> I'm not sure that there are many places where we replace hard regs
> without checks. E.g. in regrename we have HARD_REGNO_RENAME_OK.
> As far as I understand, the idea behind HARD_REGNO_RENAME_OK is that we
> should always check when substituting a hard reg. Why is regcprop
> different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
> by some passes?
>
>>
>> I realize that's going to require some work in the x86 machine description,
>> but I think that's going to be a much better approach and save you work in
>> the long run.
>>
>
> This will approximately double sse.md, as we will need to split all
> patterns with 512-bit versions in two (512 and 128/256 cases) and play
> games with enabling/disabling alternatives depending on flags.
> Are you sure that this is better than honoring HARD_REGNO_MODE_OK?
> As far as I understand, honoring HARD_REGNO_MODE_OK shouldn't produce
> worse code.
I don't see how it doubles the size.  You split the class into two 
classes.  Whatever letter your second class has, you use it in 
conjunction with 'v' that you're already using.  Note you do not need
different alternatives; you use them in the same alternative.
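
As a rough illustration of the split-class idea above, here is a toy C
model, not GCC code. Each register class is a bitmask over xmm0-xmm31, and
one alternative that lists both constraint letters accepts the union of the
two classes. All names (toy_*, TOY_*) are hypothetical, and the assumption
that the second letter resolves to an empty class whenever the EVEX-only
registers are unusable is just one possible reading of the suggestion; how
a real machine description would pick the letter per mode is not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: bit N of a regset stands for xmmN. */
typedef uint32_t regset;

#define TOY_NO_REGS          ((regset) 0x00000000u)
#define TOY_SSE_LEGACY_REGS  ((regset) 0x0000FFFFu)  /* xmm0-xmm15  */
#define TOY_SSE_EVEX_REGS    ((regset) 0xFFFF0000u)  /* xmm16-xmm31 */

/* Hypothetical second constraint letter: it names the EVEX-only class
   when those registers are usable for this vector width, and an empty
   class otherwise (assumption, see the lead-in).  */
regset toy_evex_constraint(int mode_bits, bool avx512f, bool avx512vl)
{
  if (avx512vl || (avx512f && mode_bits == 512))
    return TOY_SSE_EVEX_REGS;
  return TOY_NO_REGS;
}

/* One alternative listing both letters accepts the union of both
   classes, so no extra alternative is needed.  */
regset toy_alternative_regs(int mode_bits, bool avx512f, bool avx512vl)
{
  return TOY_SSE_LEGACY_REGS
         | toy_evex_constraint(mode_bits, avx512f, avx512vl);
}

/* True if xmm<regno> satisfies the combined alternative.  */
bool toy_reg_allowed(int regno, int mode_bits, bool avx512f, bool avx512vl)
{
  assert(regno >= 0 && regno < 32);
  regset ok = toy_alternative_regs(mode_bits, avx512f, avx512vl);
  return ((ok >> regno) & 1u) != 0;
}
```

In this model, toy_reg_allowed(20, 256, true, false) is false while
toy_reg_allowed(20, 512, true, false) is true, so every register left in a
class really is interchangeable with the others in that class.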

It's not a question of performance, but of design.  I suspect you're 
really just at the tip of the iceberg with this stuff if you continue to 
go down the path of having registers in the same class, some of which 
are allocatable and some of which are not.

The other approach that I believe has been taken is to mark the
new registers as fixed when compiling for hardware where they're not
available.  But I'm not sure offhand if that would be sufficient to fix
this problem.
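
The "mark as fixed" approach can be pictured with another toy C model, again
not GCC code. GCC keeps a per-register fixed_regs table that a target can
adjust through TARGET_CONDITIONAL_REGISTER_USAGE; the sketch below only
imitates that idea, and every toy_* / TOY_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_XMM              32
#define TOY_FIRST_EVEX_ONLY_XMM  16

/* toy_fixed[n] == true means the allocator must never pick xmm<n>.  */
bool toy_fixed[TOY_NUM_XMM];

/* Imitates a conditional-register-usage hook: without avx512f,
   xmm16-xmm31 do not exist, so mark them fixed.  */
void toy_conditional_register_usage(bool avx512f)
{
  for (int regno = 0; regno < TOY_NUM_XMM; regno++)
    toy_fixed[regno] = !avx512f && regno >= TOY_FIRST_EVEX_ONLY_XMM;
}

/* True if the toy allocator may use xmm<regno>.  */
bool toy_allocatable(int regno)
{
  assert(regno >= 0 && regno < TOY_NUM_XMM);
  return !toy_fixed[regno];
}
```

Note that the toy table has no mode input: a register is either fixed or
not. On its own it cannot express "xmm20 is fine for 512-bit but not for
256-bit", which may be relevant to the doubt expressed above.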


Jeff


