This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: PR target/40470: unable to find a register to spill in class ‘SSE_FIRST_REG’


H.J. Lu wrote:
> On Wed, Jun 17, 2009 at 11:34 AM, Jeff Law<law@redhat.com> wrote:
>> H.J. Lu wrote:
>>> On Wed, Jun 17, 2009 at 11:18 AM, Vladimir Makarov<vmakarov@redhat.com> wrote:
>>>
>>>> I agree with Jeff and Richard.  There is one more reason to avoid using
>>>> hard registers.  Usage of hard registers tends to create more spill
>>>> failures in reload.
>>>>
>>> It is not like you have a choice here. The register for those insns is
>>> fixed.  Sooner or later you have to allocate xmm0 for them.
>>>
>>>
>> And how is that different from any other port that has insns which require
>> specific registers?  This is nothing new or uncommon.
>>
> 
> Have you compared the generated code for such insns with and
> without early hard register assignment? My observation is that
> early hard register assignment improves RA on insns with
> fixed hard registers:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40480

  It is very possible there is a real problem in this area.  I recently(*) was
investigating the code generated by an inline asm which used the x86-specific
"a" constraint to force one of the operands into %eax.  I found that the
compiler generated a seemingly pointless spill-and-restore (effectively it
combined a dead store with a nop move!) unless I used a register asm to force
the operand into %eax early.  This code:

extern __inline__ long
ilockcmpexch (volatile long *t, long v, long c)
{
  return ({
		__typeof (*t) ret;
		__asm __volatile ("lock cmpxchgl %2, %1"
			: "=a" (ret), "=m" (*t)
			: "r" (v), "m" (*t), "0" (c)
			: "memory");
		ret;
	});
}

generated this assembly:

L186:
	.loc 3 127 0
	movl	__ZN13pthread_mutex7mutexesE+8, %eax	 # mutexes.head, D.28599
	movl	%eax, 36(%ebx)	 # D.28599, <variable>.next
	.loc 2 60 0
/APP
 # 60 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
	lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8	 # this,
 # 0 "" 2
/NO_APP
	movl	%eax, -12(%ebp)	 # tmp68, ret
	.loc 2 61 0
	movl	-12(%ebp), %eax	 # ret, D.28596
	.loc 3 126 0
	cmpl	%eax, 36(%ebx)	 # D.28596, <variable>.next
	jne	L186	 #,

  When I changed the declaration of 'ret' in the above:

-		__typeof (*t) ret;
+		register __typeof (*t) ret __asm ("%eax");

... I ended up getting this preferable code:

L186:
	.loc 3 127 0
	movl	__ZN13pthread_mutex7mutexesE+8, %eax
	movl	%eax, 36(%ebx)
	.loc 2 63 0
/APP
 # 63 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
	lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8
 # 0 "" 2
/NO_APP
	.loc 3 126 0
	cmpl	%eax, 36(%ebx)
	jne	L186

  Is it not a missed optimisation in the first case, that RA does not realise
'ret', 'tmp68' and 'D.28596' can all be placed in the same location?
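
  For anyone who wants to compare the two variants outside the Cygwin tree,
here is a minimal standalone sketch.  It is not the winbase.h code: the
function names cas_constraint, cas_regvar and cas_selfcheck are made up for
illustration, 'int' replaces 'long' so that the 'l' suffix on cmpxchgl also
matches on x86-64, and the non-x86 branch falling back to the
__sync_val_compare_and_swap builtin is only there so the file still builds
elsewhere.

```c
#include <assert.h>

#if defined(__i386__) || defined(__x86_64__)
/* Variant A: only the "a" constraint ties the result to %eax. */
static int cas_constraint(volatile int *t, int v, int c)
{
    int ret;
    __asm__ __volatile__("lock cmpxchgl %2, %1"
                         : "=a"(ret), "=m"(*t)
                         : "r"(v), "m"(*t), "0"(c)
                         : "memory");
    return ret;
}

/* Variant B: a local register variable pins 'ret' to %eax up front. */
static int cas_regvar(volatile int *t, int v, int c)
{
    register int ret __asm__("eax");
    __asm__ __volatile__("lock cmpxchgl %2, %1"
                         : "=a"(ret), "=m"(*t)
                         : "r"(v), "m"(*t), "0"(c)
                         : "memory");
    return ret;
}
#else
/* Assumption: on other targets both names map to the GCC builtin. */
static int cas_constraint(volatile int *t, int v, int c)
{
    return __sync_val_compare_and_swap(t, c, v);
}

static int cas_regvar(volatile int *t, int v, int c)
{
    return __sync_val_compare_and_swap(t, c, v);
}
#endif

/* Sanity check: both variants must behave as a compare-and-swap
   that returns the value previously stored in *t. */
static int cas_selfcheck(void)
{
    volatile int x = 5;

    assert(cas_constraint(&x, 7, 5) == 5);  /* match: 7 is stored */
    assert(x == 7);
    assert(cas_constraint(&x, 9, 5) == 7);  /* mismatch: x unchanged */
    assert(x == 7);

    assert(cas_regvar(&x, 9, 7) == 7);      /* match: 9 is stored */
    assert(x == 9);
    assert(cas_regvar(&x, 1, 7) == 9);      /* mismatch: x unchanged */
    assert(x == 9);
    return 1;
}
```

  Compiling this with something like "gcc -O2 -S" and comparing the bodies of
the two functions (or of a loop that calls them) should show whether the
extra store and reload of 'ret' still appears; that will likely depend on
the compiler version and options.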

    cheers,
      DaveK
-- 
(*) - http://www.cygwin.com/ml/cygwin-patches/2009-q2/msg00073.html

