This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: [i386] Scalar DImode instructions on XMM registers

From: Ilya Enkovich <enkovich dot gnu at gmail dot com>
To: Vladimir Makarov <vmakarov at redhat dot com>
Cc: GCC Development <gcc at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>, Richard Henderson <rth at redhat dot com>, Jan Hubicka <hubicka at ucw dot cz>, Jeff Law <law at redhat dot com>
Date: Mon, 25 May 2015 18:27:16 +0300
Subject: Re: [i386] Scalar DImode instructions on XMM registers

2015-05-22 15:01 GMT+03:00 Ilya Enkovich <enkovich.gnu@gmail.com>:
> 2015-05-22 11:53 GMT+03:00 Ilya Enkovich <enkovich.gnu@gmail.com>:
>> 2015-05-21 22:08 GMT+03:00 Vladimir Makarov <vmakarov@redhat.com>:
>>> So, Ilya, to solve the problem you need to avoid sharing subregs for the
>>> correct LRA/reload work.
>>>
>>>
>>
>> Thanks a lot for your help! I'll fix it.
>>
>> Ilya
>
> I've fixed SUBREG sharing and got a missing spill. I added
> --enable-checking=rtl to check other possible bugs. Spill/fill code
> still seems incorrect because different sizes are used.  Shouldn't
> block me though.
>
> .L5:
>         movl    16(%esp), %eax
>         addl    $8, %esi
>         movl    20(%esp), %edx
>         movl    %eax, (%esp)
>         movl    %edx, 4(%esp)
>         call    counter@PLT
>         movq    -8(%esi), %xmm0
>         **movdqa  16(%esp), %xmm2**
>         pand    %xmm0, %xmm2
>         movdqa  %xmm2, %xmm0
>         movd    %xmm2, %edx
>         **movq    %xmm2, 16(%esp)**
>         psrlq   $32, %xmm0
>         movd    %xmm0, %eax
>         orl     %edx, %eax
>         jne     .L5
>
> Thanks,
> Ilya

I was wrong to assume that reloads with the wrong size wouldn't block me. These
reloads require the memory to be aligned, which is not always the case. Here is
what I have in RTL now:

(insn 2 7 3 2 (set (reg/v:DI 93 [ l ])
        (mem/c:DI (reg/f:SI 16 argp) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
     (nil))
...
(insn 27 26 52 6 (set (subreg:V2DI (reg:DI 87 [ D.1822 ]) 0)
        (ior:V2DI (subreg:V2DI (reg:DI 99 [ D.1822 ]) 0)
            (subreg:V2DI (reg/v:DI 93 [ l ]) 0))) test.c:11 3489 {*iorv2di3}
     (expr_list:REG_DEAD (reg:DI 99 [ D.1822 ])
        (expr_list:REG_DEAD (reg/v:DI 93 [ l ])
            (nil))))

After reload I get:

(insn 2 7 75 2 (set (reg/v:DI 0 ax [orig:93 l ] [93])
        (mem/c:DI (plus:SI (reg/f:SI 7 sp)
                (const_int 24 [0x18])) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
     (nil))
(insn 75 2 3 2 (set (mem/c:DI (reg/f:SI 7 sp) [3 %sfp+-16 S8 A64])
        (reg/v:DI 0 ax [orig:93 l ] [93])) test.c:5 89 {*movdi_internal}
     (nil))
...
(insn 27 26 52 6 (set (reg:V2DI 21 xmm0 [orig:87 D.1822 ] [87])
        (ior:V2DI (reg:V2DI 21 xmm0 [orig:99 D.1822 ] [99])
            (mem/c:V2DI (reg/f:SI 7 sp) [3 %sfp+-16 S16 A64]))) test.c:11 3489 {*iorv2di3}


The 'por' instruction requires its memory operand to be 16-byte aligned, and
it fails in a bigger testcase. Reload also generates a movdqa with an
esp-relative address. Could this mean I still have some inconsistencies in the
produced RTL? Perhaps I should transform the loads and stores somehow?
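
To make the mismatch concrete, here is a minimal sketch in plain C that
models the S (size in bytes) and A (alignment in bits) fields from the MEM
attributes above. The struct and function names are hypothetical and are not
GCC internals; the rule encoded is only the assumption that a 16-byte memory
operand of a non-VEX SSE instruction must be 16-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the S<bytes> and A<bits> fields that the RTL dump
   prints in a MEM's attribute list.  Not a GCC data structure.  */
struct mem_attrs {
    unsigned size_bytes;  /* e.g. 16 for S16 */
    unsigned align_bits;  /* e.g. 64 for A64 */
};

/* Assumption: a 16-byte memory operand of a non-VEX SSE instruction such as
   por or movdqa must be 128-bit aligned.  Narrower accesses such as an
   8-byte movq are not subject to that rule.  */
static bool sse_mem_operand_ok(struct mem_attrs m)
{
    if (m.size_bytes < 16)
        return true;
    return m.align_bits >= 128;
}

/* Self-check using the attributes seen after reload.  */
void check_examples(void)
{
    struct mem_attrs spill = { 8, 64 };   /* insn 75: [3 %sfp+-16 S8 A64]  */
    struct mem_attrs fill  = { 16, 64 };  /* insn 27: [3 %sfp+-16 S16 A64] */

    assert(sse_mem_operand_ok(spill));    /* 8-byte store is fine */
    assert(!sse_mem_operand_ok(fill));    /* 16-byte read of a 64-bit aligned slot */
}
```

Under this model the spill in insn 75 is fine, while the memory operand of
insn 27 reads 16 bytes from a slot that is only known to be 64-bit aligned.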

Thanks,
Ilya

Attachment: ira.log
Description: Binary data

Attachment: pr65105.patch
Description: Binary data

extern long long arr[];

long long
test (long long l, int i1, int i2)
{
  switch (i2)
    {
    case 1:
      return l | arr[i1];
    case 8:
      return l | arr[i1] & arr[i2];
    }
  return l;
}
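
For readers following along, the (subreg:V2DI (reg:DI ...) 0) form in the RTL
above can be modelled in portable C. This is only an illustrative sketch: the
struct and function names are hypothetical, and it models the lane-wise
semantics of por/pand rather than anything in the patch itself. It shows why
the scalar result does not depend on whatever happens to be in the upper lane.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit XMM register viewed as V2DI.
   lane[0] holds the scalar DImode value, lane[1] is a don't-care.  */
struct v2di {
    uint64_t lane[2];
};

/* Lane-wise OR, modelling what por computes.  */
struct v2di v2di_or(struct v2di a, struct v2di b)
{
    struct v2di r = { { a.lane[0] | b.lane[0], a.lane[1] | b.lane[1] } };
    return r;
}

/* Lane-wise AND, modelling what pand computes.  */
struct v2di v2di_and(struct v2di a, struct v2di b)
{
    struct v2di r = { { a.lane[0] & b.lane[0], a.lane[1] & b.lane[1] } };
    return r;
}

/* Scalar 64-bit OR carried out through the vector operation.  The junk
   arguments stand for arbitrary upper-lane contents.  */
uint64_t di_or_via_v2di(uint64_t x, uint64_t y,
                        uint64_t junk_x, uint64_t junk_y)
{
    struct v2di a = { { x, junk_x } };
    struct v2di b = { { y, junk_y } };
    return v2di_or(a, b).lane[0];
}
```

Because the lanes are independent, di_or_via_v2di(x, y, ...) equals x | y for
any upper-lane contents. The value is not the concern with the wider access;
the size and alignment of the memory operand are.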
