This is the mail archive of the mailing list for the GCC project.

Re: PATCH: PR target/40470: unable to find a register to spill in class ‘SSE_FIRST_REG’

H.J. Lu wrote:
> On Wed, Jun 17, 2009 at 11:34 AM, Jeff Law wrote:
>> H.J. Lu wrote:
>>> On Wed, Jun 17, 2009 at 11:18 AM, Vladimir Makarov wrote:
>>>> I agree with Jeff and Richard.  There is one more reason to avoid
>>>> using hard registers.  Usage of hard registers tends to create more
>>>> spill failures in reload.
>>> It is not like you have a choice here. The register for those insns is
>>> fixed.
>>> Sooner or later you have to allocate xmm0 for them.
>> And how is that different from any other port that has insns which require
>> specific registers for particular insns.  This is nothing new or uncommon.
> Have you compared generated codes on such insns with and
> without early hard register assignment? My observations are
> early hard register assignment improves RA on insn with
> fixed hard registers:

  It is very possible there is a real problem in this area.  I recently(*) was
investigating the code generated by an inline asm which used the x86-specific
"a" constraint to force one of the operands into %eax.  I found that the
compiler generated a seemingly pointless spill-and-restore (effectively it
combined a dead store with a nop move!) unless I used a register asm to force
the operand into %eax early.  This code:

extern __inline__ long
ilockcmpexch (volatile long *t, long v, long c)
{
  return ({
		__typeof (*t) ret;
		__asm __volatile ("lock cmpxchgl %2, %1"
			: "=a" (ret), "=m" (*t)
			: "r" (v), "m" (*t), "0" (c)
			: "memory");
		ret;
	});
}

generated this assembly:

	.loc 3 127 0
	movl	__ZN13pthread_mutex7mutexesE+8, %eax	 # mutexes.head, D.28599
	movl	%eax, 36(%ebx)	 # D.28599, <variable>.next
	.loc 2 60 0
 # 60 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
	lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8	 # this,
 # 0 "" 2
	movl	%eax, -12(%ebp)	 # tmp68, ret
	.loc 2 61 0
	movl	-12(%ebp), %eax	 # ret, D.28596
	.loc 3 126 0
	cmpl	%eax, 36(%ebx)	 # D.28596, <variable>.next
	jne	L186	 #,

  When I changed the declaration of 'ret' in the above:

-		__typeof (*t) ret;
+		register __typeof (*t) ret __asm ("%eax");

... I ended up getting this preferable code:

	.loc 3 127 0
	movl	__ZN13pthread_mutex7mutexesE+8, %eax
	movl	%eax, 36(%ebx)
	.loc 2 63 0
 # 63 "/gnu/winsup/src/winsup/cygwin/winbase.h" 1
	lock cmpxchgl %ebx, __ZN13pthread_mutex7mutexesE+8
 # 0 "" 2
	.loc 3 126 0
	cmpl	%eax, 36(%ebx)
	jne	L186

  Is it not a missed optimisation in the first case, that RA does not realise
'ret', 'tmp68' and 'D.28596' can all be placed in the same location?
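
  For anyone who wants to try the comparison outside the Cygwin tree, here is
a minimal standalone sketch of the two variants.  The function names
cmpxchg_constraint and cmpxchg_regvar are hypothetical, and I have assumed
'int' rather than 'long' so that cmpxchgl has the right operand size on x86-64
as well as i386.  The #else branch is only a fallback, using GCC's
__sync_val_compare_and_swap builtin, so the file still builds on non-x86
targets; it is not part of the comparison.

```c
/* Sketch only: names are hypothetical, 'int' is assumed instead of 'long'. */
#if defined(__i386__) || defined(__x86_64__)

/* Variant 1: only the "a" constraint ties the result to %eax. */
static inline int
cmpxchg_constraint (volatile int *t, int v, int c)
{
  int ret;
  __asm__ __volatile__ ("lock cmpxchgl %2, %1"
			: "=a" (ret), "=m" (*t)
			: "r" (v), "m" (*t), "0" (c)
			: "memory");
  return ret;
}

/* Variant 2: a local register variable places 'ret' in %eax up front. */
static inline int
cmpxchg_regvar (volatile int *t, int v, int c)
{
  register int ret __asm__ ("eax");
  __asm__ __volatile__ ("lock cmpxchgl %2, %1"
			: "=a" (ret), "=m" (*t)
			: "r" (v), "m" (*t), "0" (c)
			: "memory");
  return ret;
}

#else

/* Fallback for non-x86 targets: both variants use the GCC builtin,
   which also returns the value that was in *t before the operation. */
static inline int
cmpxchg_constraint (volatile int *t, int v, int c)
{
  return __sync_val_compare_and_swap (t, c, v);
}

static inline int
cmpxchg_regvar (volatile int *t, int v, int c)
{
  return __sync_val_compare_and_swap (t, c, v);
}

#endif
```

  Both functions return the value that was in *t before the exchange, and
store v only if that value equalled c.  Compiling a caller of each with
optimisation on and -S should let the two outputs be compared directly;
whether the extra store and reload shows up will presumably depend on the GCC
version and the surrounding code.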

(*) -
