This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: a strange infelicity of register allocation

To: law at cygnus dot com
Subject: Re: a strange infelicity of register allocation
From: Zack Weinberg <zack at rabi dot columbia dot edu>
Date: Mon, 25 Jan 1999 16:30:54 -0500
cc: Joern Rennecke <amylaar at cygnus dot co dot uk>, egcs at egcs dot cygnus dot com

On Mon, 25 Jan 1999 12:56:16 -0700, Jeffrey A Law wrote:
>
>  In message <199901251946.OAA06903@blastula.phys.columbia.edu>you write:
>
>  > The generated code is a little silly even when adjusted:
[...]
>The second sequence may actually be slower though.  The use of %eax and %al
>may trigger an interlock.  Someone would need to check.
>
>If we spilled, then we're probably going to lose regardless.  That's why
>I'd prefer to see us work on avoiding the spills to start with by tightening
>up the x86 machine description.

The code is actually much worse than I thought, even when tweaked.

Input:

unsigned char *ip, *op;
for(;;)
{
    unsigned char c;
    c = *ip++;
    switch(c)
    {
	default: *op++ = c; break;
	case '\0': goto eof;
	case '\r': /* ... */ break;
	case '\n': /* ... */ break;
	case '?':  /* ... */ break;
    }
}
eof:
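
For anyone who wants to try this at home, a self-contained version of the
loop might look something like the sketch below.  The function name
copy_filtered and the bodies of the '\r', '\n' and '?' arms are placeholders
I made up (the real arms are elided above), and there is no guarantee that
this small function has enough register pressure to provoke the same spill.

```c
#include <stddef.h>

/* Hypothetical stand-in for the real loop: copies bytes from ip to op
   until a NUL is seen.  The '\r', '\n' and '?' arms are elided in the
   original, so the bodies here are arbitrary placeholders (drop '\r',
   copy the other two) chosen only to keep the switch non-trivial.
   Returns the number of bytes written to op.  */
size_t
copy_filtered (const unsigned char *ip, unsigned char *op)
{
  unsigned char *start = op;

  for (;;)
    {
      unsigned char c;
      c = *ip++;
      switch (c)
        {
        default:   *op++ = c; break;
        case '\0': goto eof;
        case '\r': break;               /* placeholder: drop */
        case '\n': *op++ = c; break;    /* placeholder: copy */
        case '?':  *op++ = c; break;    /* placeholder: copy */
        }
    }
 eof:
  return (size_t) (op - start);
}
```

Compiling with -O2 -S and looking at the code around the switch is the
interesting part; the return value is only there so the loop is not dead.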

Current snapshot, -O2:

switch:
	movb (%esi),%al
	movb %al,-4125(%ebp)
	incl %esi
	xorl %eax,%eax
	movb -4125(%ebp),%al
	cmpl $10,%eax
	je case_one
	jg case_two_or_three
	testl %eax,%eax
	je case_four
	jmp default
	.p2align 4,,7
case_two_or_three:
	cmpl $13,%eax
	je case_two
	cmpl $63,%eax
	je case_three
default:
	movb -4125(%ebp),%dl
	movb %dl,(%ebx)
	incl %ebx
	jmp switch
	.p2align 4,,7
...

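-4125(%ebp) will be in cache, of course, but this is still pretty
disgusting: the byte is loaded into %al, stored to the stack, reloaded
after %eax is cleared, and then reloaded again in the default case.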

There is questionable code elsewhere, such as:

	decl -4112(%ebp)
	movl -4112(%ebp),%ecx

If this were written the other way around, you'd avoid an address
generation and a read-modify-write to memory; note that this particular
location is unlikely to be in L1.  The code would also be smaller.
This looks like the problem Marc Espie keeps complaining about.

.L331:
	cmpb $63,-4125(%ebp)
	jne .L349
	movb 1(%esi),%cl
	movb %cl,-4125(%ebp)
	testb %cl,%cl
	jne .L333
	cmpl $0,-4120(%ebp)
	jne .L333
	decl -4112(%ebp)
	movl -4112(%ebp),%eax
	movb $63,(%eax)
	decl %eax
	movl %eax,-4112(%ebp)
	movb $63,(%eax)
	jmp .L311
	.p2align 4,,7
.L333:
	movzbl -4125(%ebp),%edi
	cmpb $0,trigraph_table(%edi)
	jne .L334

Here the store to -4125(%ebp) would be dead if it weren't used in the
movzbl at .L333.  I guess this is what Joern meant about reloads not
being inherited down taken branches.

C:
	buf = xrealloc (buf, op - buf);
	fp->buf = buf;
	return op - buf;

becomes

	movl %ebx,%eax
	subl -4104(%ebp),%eax
	pushl %eax
	movl -4104(%ebp),%edx
	pushl %edx
	call xrealloc
	movl %eax,-4104(%ebp)
	movl 12(%ebp),%eax
	movl -4104(%ebp),%ecx
	movl %ecx,(%eax)
	subl %ecx,%ebx
	movl %ebx,%eax
	jmp epilogue

This could have been done as:

	movl -4104(%ebp), %eax
	subl %eax, %ebx
	pushl %ebx
	pushl %eax
	call xrealloc
	movl 12(%ebp), %ecx
	movl %eax, (%ecx)
	movl %ebx, %eax
	jmp epilogue

I think the key issue here is that the allocator doesn't seem to know
that it can keep values live in call-saved registers across a function
call.  It's somewhat silly to be micro-optimizing around a call to
xrealloc, but this sort of code is all over the place.

zw

Follow-Ups:
- Re: a strange infelicity of register allocation
  - From: Jeffrey A Law
- Re: a strange infelicity of register allocation
  - From: Joern Rennecke

References:
- Re: a strange infelicity of register allocation
  - From: Jeffrey A Law
