Bug 32871 - [avr] Bad optimisation - gcc is pushing too many registers
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.2.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2007-07-23 19:22 UTC by Michael H.
Modified: 2010-01-29 17:01 UTC

See Also:
Host: Linux - Slax
Target: avr
Build: Linux - Slax
Known to work:
Known to fail: 4.2.1 4.2.2 4.3.0
Last reconfirmed: 2008-02-19 02:45:07


Attachments
Patch to fix bug. (1.29 KB, patch)
2008-03-02 23:32 UTC, Andy Hutchinson
Partial solution using DF defs. (1.44 KB, patch)
2008-04-28 00:58 UTC, Andy Hutchinson

Description Michael H. 2007-07-23 19:22:48 UTC
Let's look at this:

long foo(long a, long b, long c, uint8_t d){
  if(d){
    return a+b;
  }else{
    return a-c;
  }
}

The listing reports this:
long foo(long a, long b, long c, uint8_t d){
  4e:   cf 92          push   r12 ;All these registers are pushed
  50:   ef 92          push   r14 ;although that is unnecessary
  52:   ff 92          push   r15 ;
  54:   0f 93          push   r16
  56:   1f 93          push   r17
  if(d){
  58:   cc 20          and   r12, r12
  5a:   29 f0          breq   .+10        ; 0x66 <foo+0x18>
    return a+b;
  5c:   62 0f          add   r22, r18
  5e:   73 1f          adc   r23, r19
  60:   84 1f          adc   r24, r20
  62:   95 1f          adc   r25, r21
  64:   04 c0          rjmp   .+8         ; 0x6e <foo+0x20>
  }else{
    return a-c;
  66:   6e 19          sub   r22, r14
  68:   7f 09          sbc   r23, r15
  6a:   80 0b          sbc   r24, r16
  6c:   91 0b          sbc   r25, r17
  6e:   1f 91          pop   r17 ;And they are restored
  70:   0f 91          pop   r16 ;although they were never changed.
  72:   ff 90          pop   r15
  74:   ef 90          pop   r14
  76:   cf 90          pop   r12
  78:   08 95          ret

Throughout execution the low registers (r3-r17) are always zero; they are never changed anywhere in the whole file, not even in the function itself. So pushing and popping them is useless: it only costs time, code space and RAM.

Please excuse my poor bug-reporting style; this is my first report. For further explanation I can recommend the German site where this problem is being discussed.
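
To put a rough number on that cost, here is a minimal sketch of the arithmetic. It assumes the classic AVR core timings (PUSH and POP are one 16-bit instruction word each and take 2 cycles each); the struct and function names are made up for illustration.

```c
#include <assert.h>

/* Assumed classic-AVR costs: PUSH and POP each occupy one 16-bit word
   (2 bytes of flash) and take 2 cycles; each push uses 1 byte of stack. */
struct save_cost {
    int flash_bytes;
    int cycles;
    int stack_bytes;
};

/* Cost of saving and restoring nregs registers in prologue/epilogue. */
static struct save_cost cost_of_saving(int nregs)
{
    assert(nregs >= 0);
    struct save_cost c;
    c.flash_bytes = nregs * (2 + 2);  /* one push + one pop, 2 bytes each */
    c.cycles      = nregs * (2 + 2);  /* one push + one pop, 2 cycles each */
    c.stack_bytes = nregs;            /* one byte of RAM per saved register */
    return c;
}
```

Under those assumptions, cost_of_saving(5) gives 20 bytes of flash, 20 cycles per call and 5 bytes of stack for the five registers pushed in the listing above.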

http://www.roboternetz.de/phpBB2/viewtopic.php?p=300953

I hope you can fix this.

Michael
Comment 1 Michael H. 2007-07-29 16:12:19 UTC
Console:
=========================================================================
root@slax:/mnt/sda1_removable/avr/gcc_schlecht# make

-------- begin --------
* Individual makefile for AvrLiveCD
* Avr-Gcc version:
avr-gcc (GCC) 4.1.2
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

* ---------------
avr-size -d main.elf -t
avr-size: 'main.elf': No such file
      0       0       0       0       0 (TOTALS)

Compiling: main.c
avr-gcc -c -mmcu=attiny26 -I. -g -DF_CPU=1000000UL -I -Os -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -L /usr/local/bin/lib/gcc/avr/4.1.1 -Wa,-adhlns=main.lst -I/usr/local/bin/avr/include/ -std=gnu99 -MD -MP -MF .dep/main.o.d main.c -o main.o
main.c:32:2: warning: no newline at end of file

Linking: main.elf
avr-gcc -mmcu=attiny26 -I. -g -DF_CPU=1000000UL -I -Os -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -L /usr/local/bin/lib/gcc/avr/4.1.1 -Wa,-adhlns=main.o -I/usr/local/bin/avr/include/ -std=gnu99 -MD -MP -MF .dep/main.elf.d main.o --output main.elf -Wl,-Map=main.map,--cref

Creating load file for Flash: main.hex
avr-objcopy -O ihex -R .eeprom main.elf main.hex

Creating load file for EEPROM: main.eep
avr-objcopy -j .eeprom --set-section-flags .eeprom=alloc,load \
        --change-section-lma .eeprom=0 -O ihex main.elf main.eep
avr-objcopy: there are no sections to be copied!
avr-objcopy: --change-section-lma .eeprom=0x00000000 never used
make: [main.eep] Error 1 (ignored)

Creating Extended Listing: main.lss
avr-objdump -h -S main.elf > main.lss

Creating Symbol Table: main.sym
avr-nm -n main.elf > main.sym
avr-size -d main.elf -t
   text    data     bss     dec     hex filename
    228       0       0     228      e4 main.elf
    228       0       0     228      e4 (TOTALS)
-------- end --------

root@slax:/mnt/sda1_removable/avr/gcc_schlecht# make main.i
avr-gcc -E -mmcu=attiny26 -I. -g -DF_CPU=1000000UL -I -Os -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -L /usr/local/bin/lib/gcc/avr/4.1.1 -Wa,-adhlns=main.lst -I/usr/local/bin/avr/include/ -std=gnu99 main.c -o main.i
main.c:32:2: warning: no newline at end of file
root@slax:/mnt/sda1_removable/avr/gcc_schlecht# make main.s
avr-gcc -S -mmcu=attiny26 -I. -g -DF_CPU=1000000UL -I -Os -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -L /usr/local/bin/lib/gcc/avr/4.1.1 -Wa,-adhlns=main.lst -I/usr/local/bin/avr/include/ -std=gnu99 -MD -MP -MF .dep/main.s.d main.c -o main.s
main.c:32:2: warning: no newline at end of file
root@slax:/mnt/sda1_removable/avr/gcc_schlecht#
=======================================================

main.c
=======================================================
//General Avrincludes
#include <avr/io.h>


long foo(long a, long b, long c, uint8_t d){
  if(d){
    return a+b;
  }else{
    return a-c;
  }
}

long foo_rec(long a){
  if(a==4){
    return foo_rec(a-1)+2;
  }
  return 1;
}

long foo_rec2(long a, long b){
   if(!b){
     return foo_rec2(a+2,b+4);
   }else{
     return a+b+4;
   }
}

int main(void){


  return 0;
}
========================================================

The preprocessed file:
========================================================
# 1 "main.c"
# 1 "/mnt/sda1_removable/avr/gcc_schlecht//"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "main.c"

# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 1 3
# 87 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/sfr_defs.h" 1 3
# 126 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/sfr_defs.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/inttypes.h" 1 3
# 37 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/inttypes.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 1 3
# 121 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 3
typedef int int8_t __attribute__((__mode__(__QI__)));
typedef unsigned int uint8_t __attribute__((__mode__(__QI__)));
typedef int int16_t __attribute__ ((__mode__ (__HI__)));
typedef unsigned int uint16_t __attribute__ ((__mode__ (__HI__)));
typedef int int32_t __attribute__ ((__mode__ (__SI__)));
typedef unsigned int uint32_t __attribute__ ((__mode__ (__SI__)));

typedef int int64_t __attribute__((__mode__(__DI__)));
typedef unsigned int uint64_t __attribute__((__mode__(__DI__)));
# 142 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 3
typedef int16_t intptr_t;




typedef uint16_t uintptr_t;
# 159 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 3
typedef int8_t int_least8_t;




typedef uint8_t uint_least8_t;




typedef int16_t int_least16_t;




typedef uint16_t uint_least16_t;




typedef int32_t int_least32_t;




typedef uint32_t uint_least32_t;







typedef int64_t int_least64_t;






typedef uint64_t uint_least64_t;
# 213 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 3
typedef int8_t int_fast8_t;




typedef uint8_t uint_fast8_t;




typedef int16_t int_fast16_t;




typedef uint16_t uint_fast16_t;




typedef int32_t int_fast32_t;




typedef uint32_t uint_fast32_t;







typedef int64_t int_fast64_t;






typedef uint64_t uint_fast64_t;
# 273 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h" 3
typedef int64_t intmax_t;




typedef uint64_t uintmax_t;
# 38 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/inttypes.h" 2 3
# 77 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/inttypes.h" 3
typedef int32_t int_farptr_t;



typedef uint32_t uint_farptr_t;
# 127 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/sfr_defs.h" 2 3
# 88 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 2 3
# 312 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/iotn26.h" 1 3
# 313 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 2 3
# 360 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/portpins.h" 1 3
# 361 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 2 3
# 370 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 3
# 1 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/version.h" 1 3
# 371 "/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h" 2 3
# 3 "main.c" 2


long foo(long a, long b, long c, uint8_t d){
  if(d){
    return a+b;
  }else{
    return a-c;
  }
}

long foo_rec(long a){
  if(a==4){
    return foo_rec(a-1)+2;
  }
  return 1;
}

long foo_rec2(long a, long b){
   if(!b){
     return foo_rec2(a+2,b+4);
   }else{
     return a+b+4;
   }
}

int main(void){


  return 0;
}
==============================================================
The assembler file:
==============================================================
	.file	"main.c"
	.arch attiny26
__SREG__ = 0x3f
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__tmp_reg__ = 0
__zero_reg__ = 1
	.global __do_copy_data
	.global __do_clear_bss
	.stabs	"/mnt/sda1_removable/avr/gcc_schlecht/",100,0,2,.Ltext0
	.stabs	"main.c",100,0,2,.Ltext0
	.text
.Ltext0:
	.stabs	"gcc2_compiled.",60,0,0,0
	.stabs	"int:t(0,1)=r(0,1);-32768;32767;",128,0,0,0
	.stabs	"char:t(0,2)=@s8;r(0,2);0;255;",128,0,0,0
	.stabs	"long int:t(0,3)=@s32;r(0,3);020000000000;017777777777;",128,0,0,0
	.stabs	"unsigned int:t(0,4)=r(0,4);0;0177777;",128,0,0,0
	.stabs	"long unsigned int:t(0,5)=@s32;r(0,5);0;037777777777;",128,0,0,0
	.stabs	"long long int:t(0,6)=@s64;r(0,6);01000000000000000000000;0777777777777777777777;",128,0,0,0
	.stabs	"long long unsigned int:t(0,7)=@s64;r(0,7);0;01777777777777777777777;",128,0,0,0
	.stabs	"short int:t(0,8)=r(0,8);-32768;32767;",128,0,0,0
	.stabs	"short unsigned int:t(0,9)=r(0,9);0;0177777;",128,0,0,0
	.stabs	"signed char:t(0,10)=@s8;r(0,10);-128;127;",128,0,0,0
	.stabs	"unsigned char:t(0,11)=@s8;r(0,11);0;255;",128,0,0,0
	.stabs	"float:t(0,12)=r(0,1);4;0;",128,0,0,0
	.stabs	"double:t(0,13)=r(0,1);4;0;",128,0,0,0
	.stabs	"long double:t(0,14)=r(0,1);4;0;",128,0,0,0
	.stabs	"void:t(0,15)=(0,15)",128,0,0,0
	.stabs	"/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/io.h",130,0,0,0
	.stabs	"/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/avr/sfr_defs.h",130,0,0,0
	.stabs	"/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/inttypes.h",130,0,0,0
	.stabs	"/usr/local/lib/gcc/avr/4.1.2/../../../../avr/include/stdint.h",130,0,0,0
	.stabs	"int8_t:t(4,1)=(0,10)",128,0,121,0
	.stabs	"uint8_t:t(4,2)=(0,11)",128,0,122,0
	.stabs	"int16_t:t(4,3)=(0,1)",128,0,123,0
	.stabs	"uint16_t:t(4,4)=(0,4)",128,0,124,0
	.stabs	"int32_t:t(4,5)=(0,3)",128,0,125,0
	.stabs	"uint32_t:t(4,6)=(0,5)",128,0,126,0
	.stabs	"int64_t:t(4,7)=(0,6)",128,0,128,0
	.stabs	"uint64_t:t(4,8)=(0,7)",128,0,129,0
	.stabs	"intptr_t:t(4,9)=(4,3)",128,0,142,0
	.stabs	"uintptr_t:t(4,10)=(4,4)",128,0,147,0
	.stabs	"int_least8_t:t(4,11)=(4,1)",128,0,159,0
	.stabs	"uint_least8_t:t(4,12)=(4,2)",128,0,164,0
	.stabs	"int_least16_t:t(4,13)=(4,3)",128,0,169,0
	.stabs	"uint_least16_t:t(4,14)=(4,4)",128,0,174,0
	.stabs	"int_least32_t:t(4,15)=(4,5)",128,0,179,0
	.stabs	"uint_least32_t:t(4,16)=(4,6)",128,0,184,0
	.stabs	"int_least64_t:t(4,17)=(4,7)",128,0,192,0
	.stabs	"uint_least64_t:t(4,18)=(4,8)",128,0,199,0
	.stabs	"int_fast8_t:t(4,19)=(4,1)",128,0,213,0
	.stabs	"uint_fast8_t:t(4,20)=(4,2)",128,0,218,0
	.stabs	"int_fast16_t:t(4,21)=(4,3)",128,0,223,0
	.stabs	"uint_fast16_t:t(4,22)=(4,4)",128,0,228,0
	.stabs	"int_fast32_t:t(4,23)=(4,5)",128,0,233,0
	.stabs	"uint_fast32_t:t(4,24)=(4,6)",128,0,238,0
	.stabs	"int_fast64_t:t(4,25)=(4,7)",128,0,246,0
	.stabs	"uint_fast64_t:t(4,26)=(4,8)",128,0,253,0
	.stabs	"intmax_t:t(4,27)=(4,7)",128,0,273,0
	.stabs	"uintmax_t:t(4,28)=(4,8)",128,0,278,0
	.stabn	162,0,0,0
	.stabs	"int_farptr_t:t(3,1)=(4,5)",128,0,77,0
	.stabs	"uint_farptr_t:t(3,2)=(4,6)",128,0,81,0
	.stabn	162,0,0,0
	.stabn	162,0,0,0
	.stabn	162,0,0,0
	.stabs	"foo:F(0,3)",36,0,5,foo
	.stabs	"a:P(0,3)",64,0,5,22
	.stabs	"b:P(0,3)",64,0,5,18
	.stabs	"c:P(0,3)",64,0,5,14
	.stabs	"d:P(4,2)",64,0,5,12
.global	foo
	.type	foo, @function
foo:
	.stabd	46,0,0
	.stabn	68,0,5,.LM0-foo
.LM0:
/* prologue: frame size=0 */
	push r12
	push r14
	push r15
	push r16
	push r17
/* prologue end (size=5) */
	.stabn	68,0,6,.LM1-foo
.LM1:
	tst r12
	breq .L2
	.stabn	68,0,7,.LM2-foo
.LM2:
	add r22,r18
	adc r23,r19
	adc r24,r20
	adc r25,r21
	rjmp .L4
.L2:
	.stabn	68,0,9,.LM3-foo
.LM3:
	sub r22,r14
	sbc r23,r15
	sbc r24,r16
	sbc r25,r17
.L4:
/* epilogue: frame size=0 */
	pop r17
	pop r16
	pop r15
	pop r14
	pop r12
	ret
/* epilogue end (size=6) */
/* function foo size 22 (11) */
	.size	foo, .-foo
.Lscope0:
	.stabs	"",36,0,0,.Lscope0-foo
	.stabd	78,0,0
	.stabs	"foo_rec:F(0,3)",36,0,13,foo_rec
	.stabs	"a:P(0,3)",64,0,13,22
.global	foo_rec
	.type	foo_rec, @function
foo_rec:
	.stabd	46,0,0
	.stabn	68,0,13,.LM4-foo_rec
.LM4:
/* prologue: frame size=0 */
/* prologue end (size=0) */
	.stabn	68,0,14,.LM5-foo_rec
.LM5:
	cpi r22,lo8(4)
	cpc r23,__zero_reg__
	cpc r24,__zero_reg__
	cpc r25,__zero_reg__
	breq .L8
	.stabn	68,0,14,.LM6-foo_rec
.LM6:
	ldi r22,lo8(0)
	ldi r23,hi8(0)
	ldi r24,hlo8(0)
	ldi r25,hhi8(0)
	rjmp .L10
.L8:
	ldi r22,lo8(2)
	ldi r23,hi8(2)
	ldi r24,hlo8(2)
	ldi r25,hhi8(2)
.L10:
	subi r22,lo8(-(1))
	sbci r23,hi8(-(1))
	sbci r24,hlo8(-(1))
	sbci r25,hhi8(-(1))
/* epilogue: frame size=0 */
	ret
/* epilogue end (size=1) */
/* function foo_rec size 19 (18) */
	.size	foo_rec, .-foo_rec
.Lscope1:
	.stabs	"",36,0,0,.Lscope1-foo_rec
	.stabd	78,0,0
	.stabs	"foo_rec2:F(0,3)",36,0,20,foo_rec2
	.stabs	"a:P(0,3)",64,0,20,22
	.stabs	"b:P(0,3)",64,0,20,18
.global	foo_rec2
	.type	foo_rec2, @function
foo_rec2:
	.stabd	46,0,0
	.stabn	68,0,20,.LM7-foo_rec2
.LM7:
/* prologue: frame size=0 */
/* prologue end (size=0) */
	.stabn	68,0,21,.LM8-foo_rec2
.LM8:
	cp r18,__zero_reg__
	cpc r19,__zero_reg__
	cpc r20,__zero_reg__
	cpc r21,__zero_reg__
	brne .L14
	.stabn	68,0,22,.LM9-foo_rec2
.LM9:
	subi r22,lo8(-(2))
	sbci r23,hi8(-(2))
	sbci r24,hlo8(-(2))
	sbci r25,hhi8(-(2))
	ldi r18,lo8(4)
	ldi r19,hi8(4)
	ldi r20,hlo8(4)
	ldi r21,hhi8(4)
.L14:
	subi r22,lo8(-(4))
	sbci r23,hi8(-(4))
	sbci r24,hlo8(-(4))
	sbci r25,hhi8(-(4))
	add r18,r22
	adc r19,r23
	adc r20,r24
	adc r21,r25
	.stabn	68,0,26,.LM10-foo_rec2
.LM10:
	mov r25,r21
	mov r24,r20
	mov r23,r19
	mov r22,r18
/* epilogue: frame size=0 */
	ret
/* epilogue end (size=1) */
/* function foo_rec2 size 26 (25) */
	.size	foo_rec2, .-foo_rec2
.Lscope2:
	.stabs	"",36,0,0,.Lscope2-foo_rec2
	.stabd	78,0,0
	.stabs	"main:F(0,1)",36,0,28,main
.global	main
	.type	main, @function
main:
	.stabd	46,0,0
	.stabn	68,0,28,.LM11-main
.LM11:
/* prologue: frame size=0 */
	ldi r28,lo8(__stack - 0)
	ldi r29,hi8(__stack - 0)
	out __SP_H__,r29
	out __SP_L__,r28
/* prologue end (size=4) */
	.stabn	68,0,32,.LM12-main
.LM12:
	ldi r24,lo8(0)
	ldi r25,hi8(0)
/* epilogue: frame size=0 */
	rjmp exit
/* epilogue end (size=1) */
/* function main size 7 (2) */
	.size	main, .-main
.Lscope3:
	.stabs	"",36,0,0,.Lscope3-main
	.stabd	78,0,0
	.stabs	"",100,0,0,.Letext0
.Letext0:
/* File "main.c": code   74 = 0x004a (  56), prologues   9, epilogues   9 */
=================================================

I hope these are all the necessary files.

Michael
Comment 2 Eric Weddington 2008-02-19 02:45:07 UTC
Confirmed. 4.2.2 produces unnecessary pushes and pops. 4.3.0 generates worse code than 4.2.x and adds unnecessary moves. Adding the const or pure function attributes does not seem to help in 4.3.0.
Comment 3 Andy Hutchinson 2008-03-02 17:22:16 UTC
The problem is caused by a bug in gcc DF, or at least by incorrect documentation regarding prolog/epilog register saves/restores.

As specified in the internals manual, the AVR prolog/epilog uses df_regs_ever_live_p(reg) to determine which registers should be saved on the stack (if the register is not a call_used_register).

However, if a function has arguments that are stored in non-call_used_registers (R8-R17), this test gives an incorrect result and these registers will always be saved/restored by the prolog/epilog. (Argument registers never need to be saved/restored.)
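
To see why foo's arguments land in R12-R17 at all, here is a rough sketch of the avr-gcc register-passing convention as I understand it: arguments are assigned downwards from just below r26, each argument's size is rounded up to an even number of bytes, and nothing is assigned below r8. The names next_arg_reg and ARG_REG_TOP are hypothetical; the real logic lives in the target's FUNCTION_ARG machinery.

```c
#include <assert.h>

/* Assumed convention: the first argument ends at r25, so allocation starts
   from a cursor of 26 and moves down; r8 is the lowest argument register. */
enum { ARG_REG_TOP = 26, ARG_REG_LOWEST = 8 };

/* Returns the lowest register holding the next argument, or -1 if it does
   not fit in registers. Simplification: the real convention also sends every
   later argument to the stack once one argument has not fit. */
static int next_arg_reg(int *cursor, int size_bytes)
{
    assert(cursor != 0 && size_bytes > 0);
    int rounded = (size_bytes + 1) & ~1;  /* round up to an even size */
    if (*cursor - rounded < ARG_REG_LOWEST)
        return -1;
    *cursor -= rounded;
    return *cursor;
}
```

Fed with foo's signature (three 4-byte longs and one uint8_t), this gives a at r22, b at r18, c at r14 and d at r12, which matches the register numbers in the stabs entries of the assembler file. c and d therefore sit in registers that are not call_used.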


This problem only applies to targets that pass arguments in non call_used_registers.

Unfortunately, no part of gcc, including DF, appears to have the proper information in a directly usable form.

In the absence of a change to gcc, the target can determine which registers are REALLY used as arguments and exclude these from the saves/restores. This requires going through all the function arguments again using the target argument macros.
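
As a toy model of that idea (not the actual avr.c code), the sketch below compares the current save rule, "ever live and not call-used", with the proposed rule that additionally skips argument registers. The call-used set is my reading of the AVR CALL_USED_REGISTERS table (r0, r1, r18-r27, r30, r31), and all function names are invented for illustration. Note that this only models the register count; it says nothing about whether skipping the saves is safe in every case.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_REGS = 32 };

/* Assumed AVR call-used set: r0, r1, r18-r27, r30, r31. */
static bool is_call_used(int r)
{
    assert(r >= 0 && r < NUM_REGS);
    return r <= 1 || (r >= 18 && r <= 27) || r >= 30;
}

/* Counts the registers the prologue would push. With exclude_args set,
   registers that carry incoming arguments are skipped as well. */
static int count_saves(const bool ever_live[NUM_REGS],
                       const bool is_arg[NUM_REGS],
                       bool exclude_args)
{
    int n = 0;
    for (int r = 0; r < NUM_REGS; r++) {
        if (!ever_live[r] || is_call_used(r))
            continue;
        if (exclude_args && is_arg[r])
            continue;
        n++;
    }
    return n;
}

/* Marks the registers that carry foo's arguments in the listing:
   d in r12, c in r14-r17, b in r18-r21, a in r22-r25. */
static void mark_foo_args(bool regs[NUM_REGS])
{
    for (int r = 0; r < NUM_REGS; r++)
        regs[r] = (r == 12) || (r >= 14 && r <= 25);
}
```

For foo, the only live registers are its argument registers, so the current rule counts 5 saves (r12, r14-r17) and the proposed rule counts 0.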

Will post a patch when it has finished testing. Here is the key routine:

/* Sets in *SET the hard registers used to pass the arguments of the
   current function.  */

static void
avr_args (HARD_REG_SET *set)
{
  int reg;
  int i;
  rtx arg;
  CUMULATIVE_ARGS cum;

  tree decl = DECL_ARGUMENTS (current_function_decl);
  INIT_CUMULATIVE_ARGS (cum, TREE_TYPE (current_function_decl), NULL_RTX,
                        decl, -1);

  for (; decl; decl = TREE_CHAIN (decl))
    {
      if (TREE_CODE (decl) == PARM_DECL
          && DECL_NAME (decl) && !DECL_ARTIFICIAL (decl))
        {
          enum machine_mode mode = DECL_MODE (decl);

          /* Get the argument RTX.  This target does not use the
             "named" attribute, so always pass 1.  */
          arg = FUNCTION_ARG (cum, mode, DECL_ARG_TYPE (decl), 1);
          FUNCTION_ARG_ADVANCE (cum, mode, DECL_ARG_TYPE (decl), 1);

          /* FUNCTION_ARG yields NULL_RTX for arguments passed on the
             stack, so check before testing for a register.  */
          if (arg && REG_P (arg))
            {
              reg = REGNO (arg);
              for (i = 0; i < HARD_REGNO_NREGS (reg, mode); i++)
                {
                  if (set)
                    SET_HARD_REG_BIT (*set, reg + i);
                }
            }
        }
    }
}
Comment 4 Andy Hutchinson 2008-03-02 23:32:24 UTC
Created attachment 15254 [details]
Patch to fix bug.
Comment 5 Eric Weddington 2008-04-23 22:55:13 UTC
Patch causes wrong code regression. See WinAVR bug #1945375 on SourceForge:
<http://sourceforge.net/tracker/index.php?func=detail&aid=1945375&group_id=68108&atid=520074>
Comment 6 Andy Hutchinson 2008-04-28 00:58:11 UTC
Created attachment 15540 [details]
Partial solution using DF defs.
Comment 7 Andy Hutchinson 2008-04-28 00:59:15 UTC
Attached is an INCOMPLETE attempt to fix this issue.

Register saves appear to be OK, but the same function is also needed to compute the argument pointer elimination offset. It appears that the DF chain info is not maintained when global.c uses it, so the offset used to access arguments on the stack does not reflect the final value required, and the code will fail.
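
The failure mode can be illustrated with a toy model. The formula below (frame size + return address size + 1 + number of saved registers) is only an assumption loosely modelled on how the offset depends on the save count; it is not copied from avr.c, and the function names are hypothetical.

```c
#include <assert.h>

/* Toy model: the argument-pointer offset grows with everything the
   prologue places on the stack. The "+ 1" is part of the assumed formula. */
static int toy_arg_offset(int frame_size, int pc_size, int saved_regs)
{
    assert(frame_size >= 0 && pc_size > 0 && saved_regs >= 0);
    return frame_size + pc_size + 1 + saved_regs;
}

/* Error in bytes when the offset was computed from a stale register count
   (regs_assumed) but the prologue really pushes regs_pushed registers.
   Uses a frame size of 0 and a 2-byte return address. */
static int toy_offset_error(int regs_assumed, int regs_pushed)
{
    return toy_arg_offset(0, 2, regs_pushed)
         - toy_arg_offset(0, 2, regs_assumed);
}
```

In this model, if the offset is computed while the chains still suggest 0 saved registers and the final prologue pushes 5, every stack argument access is off by 5 bytes.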
Comment 8 Andy Hutchinson 2008-05-05 01:15:13 UTC
The following information from Kenny Zadeck shows why the solution does not work. This limitation cannot be avoided at present without causing compilation time/memory regressions on other targets, so we will have to live with the overly cautious saving of registers.


> The target computes an offset (INITIAL_ELIMINATION_OFFSET). This is called several times during register allocation (no doubt because something changes). The offset is a function of the number of registers saved, so I used DF_REG_DEF_CHAIN to work out precisely which registers are saved. But this information is out of date, and so the offset is wrong.
> However, the info provided by df_regs_ever_live_p is updated.
>
> So I think the live info is updated in global.c but the chains are not. Is there a sane way around this, or should I put this one on the "too difficult" list?
>
> best regards
>
There are three things to consider:

1) Incremental updating is turned off during global. This was perhaps a mistake, but what I did not want to get into was rescanning each insn that uses/defines a register whenever that register gets assigned. By turning off rescanning, each insn is only rescanned once, after all of its operands have had their registers assigned.

2) Turning this off is most likely the cause of your grief. It is possible that you could move the call that turns off scanning to a later point, after your bit of foolishness happens but before the actual registers are assigned, but the truth is that I do not really understand the information flow within global/reload, so I did not consider something like this.

3) Incremental scanning could be turned back on, but the cost is quite high because most insns have many operands and because reload can change the assignment of registers after global has finished.
Kenny
Comment 9 Eric Weddington 2010-01-29 17:01:59 UTC
Closing as WONTFIX.