Bug#: 33050
Status: NEW
Assigned To: Not yet assigned to anyone <unassigned@gcc.gnu.org>
Reporter: Wouter van Gulik <wvangulik@xs4all.nl>
Attachment | Description | Type | Created | Size
test.c | Example | text/plain | 2007-08-11 18:14 | 139 bytes
Description:   Last confirmed: 2007-08-24 20:43 Opened: 2007-08-11 18:13
Using this version/config:

~~~~~~~~~~~~~~~~~~~~~~~~~~
Using built-in specs.
Target: avr
Configured with: ../gcc-4.1.2/configure --prefix=/c/WinAVR --target=avr --enable-languages=c,c++ --with-dwarf2 --enable-win32-registry=WinAVR-20070525 --disable-nls --with-gmp=/usr/local --with-mpfr=/usr/local --enable-doc --disable-libssp
Thread model: single
gcc version 4.1.2 (WinAVR 20070525)

~~~~~~~~~~~~~~~~~~~~~~~~~~
Using this command line to compile:

avr-gcc -S -Os test.c -mmcu=atmega16

~~~~~~~~~~~~~~~~~~~~~~~~~~~

The test case:

extern unsigned char foo(unsigned char in);
unsigned char test2(unsigned char input) {

  input += foo(0xA); //use input
  foo(0xA);          //make sure input must be saved over the call
  return input;
}


The assembler output:
/* prologue: frame size=0 */
        push r16        <<Useless
        push r17
/* prologue end (size=2) */
        mov r17,r24
        ldi r24,lo8(10)
        call foo
        mov r16,r24     <<Why?? add r17,r24 is much better 
        ldi r24,lo8(10)
        call foo        
        add r17,r16     <<Could be gone if above statement used
        mov r24,r17    
        clr r25
/* epilogue: frame size=0 */
        pop r17
        pop r16         <<Useless
        ret

The addition is delayed until after the second call, which forces the compiler
to keep a second value live across that call in an extra call-saved register.

So delaying the add introduces:
an extra push/pop pair
an extra mov instruction
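
For reference, the two schedules are equivalent at the source level, which is why doing the add early is legal. Below is a minimal sketch in C. The body of foo is a hypothetical stub (the real foo is external and unknown), and the names test2_late, test2_early and schedules_match are made up for illustration. It only models the ordering and makes no claim about what any GCC version emits.

```c
#include <assert.h>

/* Hypothetical stand-in for the external foo(); it counts calls so that
   both variants can be checked for the same side effects. */
static unsigned int foo_calls;
static unsigned char foo(unsigned char in) {
    foo_calls++;
    return (unsigned char)(in * 3u + 1u);
}

/* Mirrors the generated code: the first result stays live across the
   second call and is added afterwards (two values live across a call). */
static unsigned char test2_late(unsigned char input) {
    unsigned char first = foo(0xA);
    foo(0xA);
    return (unsigned char)(input + first);
}

/* Mirrors the suggested code: add immediately, so only the running sum
   is live across the second call. */
static unsigned char test2_early(unsigned char input) {
    input = (unsigned char)(input + foo(0xA));
    foo(0xA);
    return input;
}

/* Returns 1 if both schedules agree for every 8-bit input. */
int schedules_match(void) {
    for (unsigned int i = 0; i < 256; i++) {
        foo_calls = 0;
        unsigned char late = test2_late((unsigned char)i);
        unsigned char early = test2_early((unsigned char)i);
        assert(foo_calls == 4);   /* two calls per variant */
        if (late != early) return 0;
    }
    return 1;
}
```

Calling schedules_match() from a small main() is enough to exercise both variants. The early variant needs only one value to survive the second call, which is the saving the report asks for.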

------- Comment #1 From Wouter van Gulik 2007-08-11 18:14 -------
Created an attachment (id=14054) [edit]
Example

C source showing non-optimal code

------- Comment #2 From Eric Weddington 2007-08-22 17:09 -------
4.3.0 20070817 snapshot generates this for the testcase:

test2:
        push r16
        push r17
/* prologue: function */
/* frame size = 0 */
        mov r16,r24
        ldi r24,lo8(10)
        call foo
        mov r17,r24
        ldi r24,lo8(10)
        call foo
        mov r24,r16
        add r24,r17
/* epilogue start */
        pop r17
        pop r16
        ret

------- Comment #3 From Wouter van Gulik 2007-08-24 19:36 -------
(In reply to comment #2)
> 4.3.0 20070817 snapshot generates this for the testcase:
> 

<snip>

Well at least the extra clr r25 is gone...


I just tried some simpler code:

extern unsigned char foo();
unsigned char test(unsigned char input) {
  return input += foo();
}

The result is:
/* prologue: frame size=0 */
        push r17
/* prologue end (size=1) */
        mov r17,r24
        call foo
        add r17,r24            <<Could do "add r24,r17"
        mov r24,r17            <<This could then be gone
        clr r25                <<This is maybe gone in 4.3.0??
/* epilogue: frame size=0 */
        pop r17
        ret
/* epilogue end (size=2) */

Here the add is also done non-optimally: the sum is built in r17 and then copied
to r24. So maybe solving this would also prevent the extra register save in the
first test case?
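
The suggested "add r24,r17" relies on 8-bit addition being commutative, so the sum can be built directly in the return value instead of in the saved copy. Below is a rough sketch of that in C. The stub foo, the variable foo_result and the function names are all hypothetical and exist only to make the example checkable. The local variables loosely stand in for r17 (saved) and r24 (ret).

```c
#include <assert.h>

/* Hypothetical stub: foo() just returns whatever foo_result holds. */
static unsigned char foo_result;
static unsigned char foo(void) { return foo_result; }

/* Sum accumulated in the saved copy of input, then copied to the result
   (corresponds to "add r17,r24" followed by "mov r24,r17"). */
static unsigned char test_into_saved(unsigned char input) {
    unsigned char saved = input;
    unsigned char ret = foo();
    saved = (unsigned char)(saved + ret);
    ret = saved;
    return ret;
}

/* Sum accumulated directly in the return value ("add r24,r17"). */
static unsigned char test_into_result(unsigned char input) {
    unsigned char saved = input;
    unsigned char ret = foo();
    ret = (unsigned char)(ret + saved);
    return ret;
}

/* Returns 1 if both operand orders agree for all 8-bit value pairs. */
int operand_orders_match(void) {
    for (unsigned int a = 0; a < 256; a++) {
        for (unsigned int b = 0; b < 256; b++) {
            foo_result = (unsigned char)b;
            if (test_into_saved((unsigned char)a) !=
                test_into_result((unsigned char)a))
                return 0;
        }
    }
    return 1;
}
```

Both functions wrap modulo 256 in the same way, so the extra copy into the result buffers nothing and can be dropped.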

------- Comment #4 From Eric Weddington 2007-08-24 20:41 -------
(In reply to comment #3)

4.3.0 20070817 snapshot produces this for the second test case:

test:
        push r17
/* prologue: function */
/* frame size = 0 */
        mov r17,r24
        call foo
        add r24,r17
/* epilogue start */
        pop r17
        ret


So the second test case is optimized correctly when we get to 4.3.0.
