This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: strength reduction example


> With Joern's code, we get instead:
> 
> .L5:
>         decl -8(%ebp)
>         jz .L4
> .L3:
>         movb -3(%edi),%dl
>         movb %dl,(%eax)
>         testb %dl,%dl
>         je .L4
>         movb (%ebx),%dl
>         leal 1(%eax),%esi
>         movb %dl,1(%eax)
>         testb %dl,%dl
>         je .L4
>         movl -12(%ebp),%ebx
>         movb (%ebx),%dl
>         movb %dl,2(%eax)
>         testb %dl,%dl
>         je .L4
>         movb (%edi),%dl
>         addl $4,%ebx
>         movl %ebx,-12(%ebp)
>         addl $4,%ecx
>         leal 3(%ecx),%edi
>         leal 1(%ecx),%ebx
>         addl $4,%eax
>         movb %dl,-2(%esi)
>         testb %dl,%dl
>         jne .L5
> .L4:

regmove can convert pointer+offset references into pointer arithmetic, 
but can it do the inverse, i.e. convert pointer arithmetic back into 
pointer+offset references?

This may be desirable for other reasons as well. Given two loads
on an architecture with postincrement:

	mov.l	@r0+,r1
	mov.l	@r0,r2
	add	r3,r2

the scheduler seems unable to hoist the second load above the 
first due to r0 dependencies, even if the memory load latency is greater 
than the cost of the address arithmetic. If it were possible to 
convert to a pointer+offset reference, then the scheduling could be
improved to:

	mov.l	@(4,r0),r2
	mov.l	@r0,r1
	add	r3,r2
	(add    #4,r0)
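
To check that the two sequences above really compute the same thing, here
is a minimal sketch of a toy register machine in Python. Everything in it
is an assumption made for illustration: the op names (load_postinc, load,
add, addi), the run() helper, and the register and memory values are all
hypothetical, and it models only the four instruction forms used above,
not the real instruction set.

```python
def run(program, regs, mem):
    """Execute (op, *args) tuples on a copy of regs.

    mem maps a byte address to the 32-bit word stored there.
    The op names are hypothetical; they only mirror the forms used above.
    """
    regs = dict(regs)
    for op, *args in program:
        if op == "load_postinc":        # mov.l @rs+,rd
            rs, rd = args
            regs[rd] = mem[regs[rs]]
            regs[rs] += 4
        elif op == "load":              # mov.l @(disp,rs),rd
            disp, rs, rd = args
            regs[rd] = mem[regs[rs] + disp]
        elif op == "add":               # add rs,rd
            rs, rd = args
            regs[rd] += regs[rs]
        elif op == "addi":              # add #imm,rd
            imm, rd = args
            regs[rd] += imm
        else:
            raise ValueError("unknown op: " + op)
    return regs


# Pointer-arithmetic form (postincrement).
postinc_form = [
    ("load_postinc", "r0", "r1"),
    ("load", 0, "r0", "r2"),
    ("add", "r3", "r2"),
]

# Pointer+offset form, with the second load hoisted above the first.
offset_form = [
    ("load", 4, "r0", "r2"),
    ("load", 0, "r0", "r1"),
    ("add", "r3", "r2"),
    ("addi", 4, "r0"),
]

# Arbitrary starting state, chosen only for the example.
start_regs = {"r0": 0x100, "r1": 0, "r2": 0, "r3": 7}
memory = {0x100: 11, 0x104: 22}

same = run(postinc_form, start_regs, memory) == run(offset_form, start_regs, memory)
print("equivalent:", same)
```

In the offset form neither load depends on an earlier write to r0, which
is what would let the scheduler order them freely.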

Speaking of scheduler problems, it would be nice if the scheduler were
able to hoist more than one instruction in certain circumstances. For 
example, I occasionally see:

	mov	#30,r0
	mov.l	@(r0,r1),r2
	mov	#78,r0
	mov.l	@(r0,r1),r3
	add	r4,r3

The scheduler seems unable to move the second read because it's dependent 
on the "mov #78,r0". If the scheduler could move both instructions, then 
the following code would be generated:

	mov	#78,r0
	mov.l	@(r0,r1),r3
	mov	#30,r0
	mov.l	@(r0,r1),r2
	add	r4,r3

which hides the register load latency much better.
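
As a rough sketch of why the reordering helps, the two sequences can be
run through a toy timing model in Python. The model is an assumption, not
a description of any real pipeline: instructions issue in order, one per
cycle, an instruction waits until its source registers are available, and
a load result is assumed to become available 2 cycles after issue
(LOAD_LATENCY is a made-up figure). The total_cycles() helper is
hypothetical too.

```python
LOAD_LATENCY = 2  # assumed load latency in cycles, for illustration only


def total_cycles(program):
    """Return the cycle at which the last result becomes available.

    Each instruction is (dest, sources, latency). Issue is in order, one
    instruction per cycle, stalling until every source register is ready.
    """
    ready = {}        # register -> cycle at which its latest value is ready
    next_issue = 0
    finish = 0
    for dest, sources, latency in program:
        issue = max([next_issue] + [ready.get(r, 0) for r in sources])
        ready[dest] = issue + latency
        finish = max(finish, ready[dest])
        next_issue = issue + 1
    return finish


original = [
    ("r0", [], 1),                        # mov   #30,r0
    ("r2", ["r0", "r1"], LOAD_LATENCY),   # mov.l @(r0,r1),r2
    ("r0", [], 1),                        # mov   #78,r0
    ("r3", ["r0", "r1"], LOAD_LATENCY),   # mov.l @(r0,r1),r3
    ("r3", ["r4", "r3"], 1),              # add   r4,r3
]

reordered = [
    ("r0", [], 1),                        # mov   #78,r0
    ("r3", ["r0", "r1"], LOAD_LATENCY),   # mov.l @(r0,r1),r3
    ("r0", [], 1),                        # mov   #30,r0
    ("r2", ["r0", "r1"], LOAD_LATENCY),   # mov.l @(r0,r1),r2
    ("r3", ["r4", "r3"], 1),              # add   r4,r3
]

print("original: ", total_cycles(original))
print("reordered:", total_cycles(reordered))
```

Under these assumptions the add in the original order has to wait for the
load of r3 that immediately precedes it, while in the reordered sequence
two other instructions sit between that load and the add.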

Toshi


