sh-elf: address inheritance hoisted above loop

tm tm@mail.kloo.net
Mon Dec 9 13:28:00 GMT 2002


I've been looking at quantize.i from stress-1.17 lately, and I've found
some disturbing code.

In the function Assignment(), there are two nested loops:

    for (i=0; i < (int) image->packets; i++)
    {
      node_info=cube_info->root;
      for (index= 8 -1; (int) index > 0; index--)
      {
        id=((((unsigned int) ( p->red ))  >> index) & 0x01) << 2 |
           ((((unsigned int) ( p->green ))  >> index) & 0x01) << 1 |
           ((((unsigned int) ( p->blue ))  >> index) & 0x01);
        if ((node_info->census & (1 << id)) == 0)
          break;
        node_info=node_info->child[id];
      }
      cube_info->color.red=p->red;
      cube_info->color.green=p->green;
      cube_info->color.blue=p->blue;
      cube_info->distance=3.0*(255 +1)*(255 +1);
      ClosestColor(cube_info,node_info->parent);
      index=cube_info->color_number;
      if (image->class == PseudoClass)
        p->index=index;
      else
        {
          p->red=image->colormap[index].red;
          p->green=image->colormap[index].green;
          p->blue=image->colormap[index].blue;
        }
      p++;
      if ((((~(( image->packets )- i -1) & (( image->packets ) - i
          - 2)) + 1) == (( image->packets ) - i - 1)) )
        ProgressMonitor("  Assigning image colors...  " ,i,image->packets);
    }

Using the 3.4-BIB compiler with options -O2 -m4 -ml, the following code is
generated:

.L62:
	mov.l	@(20,r1),r1
	mov	#0,r2
	cmp/ge	r1,r2
	bt/s	.L11
	mov.l	r2,@r14
	mov	r11,r0		<- here
	mov	r11,r3		<- here
	add	#13,r0		<- here
	mov	r11,r1		<- here
	mov.l	r0,@(24,r14)
	add	#12,r3		<- here
	mov.w	.L45,r0
	add	#14,r1		<- here
	mov.l	r3,@(20,r14)
	mov	r11,r2		<- here
	mov	r11,r3		<- here
	add	r13,r0
	add	#32,r2		<- here
	mov.l	r1,@(28,r14)
	add	#68,r3		<- here
	mov.l	r2,@(12,r14)
	mov.l	r3,@(16,r14)
	mov.l	r0,@(4,r14)
	mov.l	r8,@(8,r14)
.L23:
	...
.L19:
	...
	mov.l	@(20,r14),r1
	mov.b	r12,@r1
	...
	mov.l	@(24,r14),r2
	mov.b	r1,@r2
	...
	mov.l	@(28,r14),r3
	mov.b	r1,@r3
	...
	mov.l	@(12,r14),r1
	...
	mov.l	@(16,r14),r3
	mov.w	@r3,r1
	...
	bf/s	.L19		<- inner loop branch
	mov.l	@(48,r3),r7
	...
	bf/s	.L23		<- outer loop branch
	mov.l	r0,@r14

The loop optimizer appears to hoist the related address computations out of
the loop, which causes the register allocator to spill them to the stack
and generate five extra memory reads inside the loop. These reads really
aren't necessary, because the addresses are just small constant offsets
from r11.
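
To make the complaint concrete, here is a minimal C sketch of the two code
shapes. It is not taken from quantize.i: struct info, its field layout, and
both function names are hypothetical, chosen only so that three byte fields
sit at offsets 12, 13 and 14 from a base pointer, like the byte stores in
the listing above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the object r11 points to. The layout is an
   assumption: 12 bytes of padding put the three byte fields at offsets
   12, 13 and 14 from the base. */
struct info {
    char pad[12];
    unsigned char red, green, blue;
};

/* Shape of the generated code: every field address is computed once,
   before the loop, and kept in its own pointer. When registers run out,
   these pointers can end up in stack slots and be reloaded each time. */
void store_hoisted(struct info *base, const unsigned char *src, int n)
{
    unsigned char *pr = &base->red;    /* base + 12 */
    unsigned char *pg = &base->green;  /* base + 13 */
    unsigned char *pb = &base->blue;   /* base + 14 */
    int i;

    for (i = 0; i < n; i++) {
        *pr = src[3 * i];
        *pg = src[3 * i + 1];
        *pb = src[3 * i + 2];
    }
}

/* Shape one would hope for: only the base pointer stays live, and each
   address is re-derived from it at the point of use. */
void store_rematerialized(struct info *base, const unsigned char *src, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        base->red   = src[3 * i];
        base->green = src[3 * i + 1];
        base->blue  = src[3 * i + 2];
    }
}
```

Both functions store the same bytes; only the way the addresses are formed
differs. Whether a particular compiler keeps the second form or turns it
into the first depends on the optimizer, so the sketch shows the
source-level idea rather than a guaranteed workaround.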

Toshi



