This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: PR target/40470: unable to find a register to spill in class ‘SSE_FIRST_REG’
H.J. Lu wrote:
> On Wed, Jun 17, 2009 at 11:34 AM, Jeff Law<law@redhat.com> wrote:
>> H.J. Lu wrote:
>>> On Wed, Jun 17, 2009 at 11:18 AM, Vladimir Makarov<vmakarov@redhat.com>
>>> wrote:
>>>
>>>> I agree with Jeff and Richard. There is one more reason to avoid using
>>>> hard registers. Usage of hard registers tends to create more spill
>>>> failures in reload.
>>>>
>>> It is not like you have a choice here. The register for those insns is
>>> fixed.
>>> Sooner or later you have to allocate xmm0 for them.
>>>
>> And how is that different from any other port that has insns which require
>> specific registers for particular insns? This is nothing new or uncommon.
>>
>
> Have you compared the generated code for such insns with and
> without early hard register assignment? My observation is that
> early hard register assignment improves RA on insns with
> fixed hard registers:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40480
It is very possible there is a real problem in this area. I recently(*) was
investigating the code generated by an inline asm which used the x86-specific
"a" constraint to force one of the operands into %eax. I found that the
compiler generated a seemingly pointless spill-and-restore (effectively it
combined a dead store with a nop move!) unless I used a register asm to force
the operand into %eax early. This code:
extern __inline__ long
ilockcmpexch (volatile long *t, long v, long c)
{
  return ({
    __typeof (*t) ret;
    __asm __volatile ("lock cmpxchgl %2, %1"
                      : "=a" (ret), "=m" (*t)
                      : "r" (v), "m" (*t), "0" (c)
                      : "memory");
    ret;
  });
}
generated this assembly:
L186:
        .loc 3 127 0
        movl    __ZN13pthread_mutex7mutexesE+8, %eax    # mutexes.head, D.28599
        movl    %eax, 36(%ebx)  # D.28599, <variable>.next
        .loc 2 60 0
/APP
# 60 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
        lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8      # this,
# 0 "" 2
/NO_APP
        movl    %eax, -12(%ebp) # tmp68, ret
        .loc 2 61 0
        movl    -12(%ebp), %eax # ret, D.28596
        .loc 3 126 0
        cmpl    %eax, 36(%ebx)  # D.28596, <variable>.next
        jne     L186    #,
When I changed the declaration of 'ret' in the above:
- __typeof (*t) ret;
+ register __typeof (*t) ret __asm ("%eax");
... I ended up getting this preferable code:
L186:
        .loc 3 127 0
        movl    __ZN13pthread_mutex7mutexesE+8, %eax
        movl    %eax, 36(%ebx)
        .loc 2 63 0
/APP
# 63 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
        lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8
# 0 "" 2
/NO_APP
        .loc 3 126 0
        cmpl    %eax, 36(%ebx)
        jne     L186
Is it not a missed optimisation in the first case, that RA does not realise
'ret', 'tmp68' and 'D.28596' can all be placed in the same location?
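
For anyone who wants to compare the two variants outside the Cygwin tree, a
stand-alone sketch might look like the following. The function names
cmpxchg_a and cmpxchg_regasm are hypothetical, and int is used in place of
long purely as an assumption so that cmpxchgl also assembles on x86-64; it
needs GCC (or a compatible compiler) targeting x86. Compiling it with -S at
the optimisation level of interest should show whether the extra
store-and-reload around the asm appears for the first variant.

```c
/* Two variants of the same compare-and-exchange wrapper.  Names are
   hypothetical; int replaces the original long (an assumption) so that
   the "l"-suffixed cmpxchg is valid on both i386 and x86-64.  */

/* Variant 1: only the "a" constraint ties the result to %eax.  */
static inline int
cmpxchg_a (volatile int *t, int v, int c)
{
  int ret;
  __asm__ __volatile__ ("lock cmpxchgl %2, %1"
                        : "=a" (ret), "=m" (*t)
                        : "r" (v), "m" (*t), "0" (c)
                        : "memory");
  return ret;
}

/* Variant 2: a local register variable pins 'ret' to %eax before
   register allocation sees the asm.  */
static inline int
cmpxchg_regasm (volatile int *t, int v, int c)
{
  register int ret __asm__ ("eax");
  __asm__ __volatile__ ("lock cmpxchgl %2, %1"
                        : "=a" (ret), "=m" (*t)
                        : "r" (v), "m" (*t), "0" (c)
                        : "memory");
  return ret;
}
```

Both functions return the value that was in *t before the instruction ran:
if it equalled c, *t is replaced by v; otherwise *t is left alone. They
should behave identically, so any difference is confined to the generated
code around the asm.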
cheers,
DaveK
--
(*) - http://www.cygwin.com/ml/cygwin-patches/2009-q2/msg00073.html