[i386] Scalar DImode instructions on XMM registers

Jeff Law law@redhat.com
Wed May 27 03:31:00 GMT 2015


On 05/25/2015 09:27 AM, Ilya Enkovich wrote:
> 2015-05-22 15:01 GMT+03:00 Ilya Enkovich <enkovich.gnu@gmail.com>:
>> 2015-05-22 11:53 GMT+03:00 Ilya Enkovich <enkovich.gnu@gmail.com>:
>>> 2015-05-21 22:08 GMT+03:00 Vladimir Makarov <vmakarov@redhat.com>:
>>>> So, Ilya, to solve the problem you need to avoid sharing subregs for the
>>>> correct LRA/reload work.
>>>>
>>>>
>>>
>>> Thanks a lot for your help! I'll fix it.
>>>
>>> Ilya
>>
>> I've fixed SUBREG sharing and got a missing spill. I added
>> --enable-checking=rtl to check other possible bugs. Spill/fill code
>> still seems incorrect because different sizes are used.  Shouldn't
>> block me though.
>>
>> .L5:
>>          movl    16(%esp), %eax
>>          addl    $8, %esi
>>          movl    20(%esp), %edx
>>          movl    %eax, (%esp)
>>          movl    %edx, 4(%esp)
>>          call    counter@PLT
>>          movq    -8(%esi), %xmm0
>>          **movdqa  16(%esp), %xmm2**
>>          pand    %xmm0, %xmm2
>>          movdqa  %xmm2, %xmm0
>>          movd    %xmm2, %edx
>>          **movq    %xmm2, 16(%esp)**
>>          psrlq   $32, %xmm0
>>          movd    %xmm0, %eax
>>          orl     %edx, %eax
>>          jne     .L5
>>
>> Thanks,
>> Ilya
>
> I was wrong to assume that reloads with the wrong size wouldn't block
> me. These reloads require the memory to be aligned, which is not always
> the case. Here is what I have in RTL now:
>
> (insn 2 7 3 2 (set (reg/v:DI 93 [ l ])
>          (mem/c:DI (reg/f:SI 16 argp) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
>       (nil))
> ...
> (insn 27 26 52 6 (set (subreg:V2DI (reg:DI 87 [ D.1822 ]) 0)
>          (ior:V2DI (subreg:V2DI (reg:DI 99 [ D.1822 ]) 0)
>              (subreg:V2DI (reg/v:DI 93 [ l ]) 0))) test.c:11 3489 {*iorv2di3}
>       (expr_list:REG_DEAD (reg:DI 99 [ D.1822 ])
>          (expr_list:REG_DEAD (reg/v:DI 93 [ l ])
>              (nil))))
>
> After reload I get:
>
> (insn 2 7 75 2 (set (reg/v:DI 0 ax [orig:93 l ] [93])
>          (mem/c:DI (plus:SI (reg/f:SI 7 sp)
>                  (const_int 24 [0x18])) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
>       (nil))
> (insn 75 2 3 2 (set (mem/c:DI (reg/f:SI 7 sp) [3 %sfp+-16 S8 A64])
>          (reg/v:DI 0 ax [orig:93 l ] [93])) test.c:5 89 {*movdi_internal}
>       (nil))
> ...
> (insn 27 26 52 6 (set (reg:V2DI 21 xmm0 [orig:87 D.1822 ] [87])
>          (ior:V2DI (reg:V2DI 21 xmm0 [orig:99 D.1822 ] [99])
>              (mem/c:V2DI (reg/f:SI 7 sp) [3 %sfp+-16 S16 A64]))) test.c:11 3489 {*iorv2di3}
>
>
> The 'por' instruction requires its memory operand to be aligned, so it
> fails in a bigger testcase. Reload also generates a movdqa from an
> esp-relative address. Could this mean I still have some inconsistencies
> in the RTL I produce? Perhaps I should somehow transform the loads and
> stores?

I'd start by looking at the AP->SP elimination step: what the defined
stack alignment is, and whether or not a dynamic stack realignment is
needed.  If you don't have all that set up properly prior to the
allocators, then they're not going to know which objects to align nor
how to align them.

jeff
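
To make the alignment problem in the quoted RTL concrete, here is a minimal sketch in C. It is not GCC code and not part of the patch under discussion; the function names and the simplified "slot" model are hypothetical, introduced only for illustration. It models the condition a stack slot would have to meet before it could safely be used as a 16-byte memory operand of a legacy SSE instruction such as por or movdqa: the slot must be at least 16 bytes wide, and its address must be a multiple of 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if addr is a multiple of align.
   align must be a nonzero power of two. */
static bool is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical model, for illustration only: a slot at sp + offset of
   slot_size bytes can back a 16-byte (V2DI) memory operand only if it
   is wide enough and its address is 16-byte aligned. */
static bool slot_ok_for_v2di_operand(uintptr_t sp, ptrdiff_t offset,
                                     size_t slot_size)
{
    /* Unsigned wraparound makes a negative offset subtract from sp. */
    uintptr_t addr = sp + (uintptr_t)offset;
    return slot_size >= 16 && is_aligned(addr, 16);
}
```

Under this model, slot_ok_for_v2di_operand(sp, 0, 8) is false for any sp, which mirrors the slot that insn 75 writes as S8 A64 and insn 27 then reads as S16 A64. A 16-byte slot passes only when sp + offset is itself 16-byte aligned, which depends on the stack alignment that is established before allocation.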


