Summary: | [avr] optimisation of 8-bit logic sometimes fails | ||
---|---|---|---|
Product: | gcc | Reporter: | David Brown <david> |
Component: | rtl-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | enhancement | CC: | eric.weddington, gcc-bugs, gjl |
Priority: | P3 | Keywords: | missed-optimization |
Version: | 4.2.2 | ||
Target Milestone: | 4.7.0 | ||
Host: | | Target: | avr |
Build: | | Known to work: | |
Known to fail: | 4.6.1 | Last reconfirmed: | 2011-07-09 11:55:16 |
Description
David Brown
2008-01-15 08:56:38 UTC
There are many other cases where 8-bit optimisation is missing - C's integer promotion gets in the way. This is particularly common when dealing with a compile-time constant - there is no way in C to say that "0x3f" is 8-bit rather than a 16-bit int.

Another example of code with this problem is:

    void foo(void) {
        static unsigned char count;

        if (++count & 0x3f) {
            PORTC &= ~0x01;
        } else {
            PORTC |= 0x01;
        }
    }

Both the "&" and the comparison with zero are done as 16-bit.

One work-around is to use this macro:

    #define const_byte(x) ({ static const __attribute__((__progmem__)) \
                             unsigned char v = x; v; })

Then we can write:

    uint8_t bar3(uint8_t x, uint8_t y) {
        return data[y ^ (x & const_byte(0x0f))];
    }

    bar3:
    /* prologue: function */
    /* frame size = 0 */
    008c 8F70    andi r24,lo8(15)        ; tmp45,
    008e 8627    eor r24,r22             ; tmp45, y
    0090 E0E0    ldi r30,lo8(data)       ; tmp48,
    0092 F0E0    ldi r31,hi8(data)       ; tmp48,
    0094 E80F    add r30,r24             ; tmp48, tmp45
    0096 F11D    adc r31,__zero_reg__    ; tmp48
    0098 8081    ld r24,Z                ; , data
    /* epilogue start */
    009a 0895    ret

As far as I can see, this generated code is optimal.

The macro works because it forces the value to be 8-bit, rather than a 16-bit compile-time constant. However, the compiler is still smart enough to see that since it's a "const" with known value, its value can be used directly. As a side effect, the static "variable" must be created somewhere - by using __progmem__, we create it in flash rather than wasting RAM. Even that waste could be spared by garbage-collection linking, or by using a dedicated segment rather than .progmem.data.

I still see this in 4.6.1.

(In reply to comment #1)
> There are many other cases where 8-bit optimisation is missing - C's integer
> promotion gets in the way. This is particularly common when dealing with a
> compile-time constant - there is no way in C to say that "0x3f" is 8-bit rather
> than a 16-bit int.
>
> Another example of code with this problem is:
>
> void foo(void) {
>     static unsigned char count;
>
>     if (++count & 0x3f) {
>         PORTC &= ~0x01;
>     } else {
>         PORTC |= 0x01;
>     }
> }
>
> Both the "&" and the comparison with zero are done as 16-bit.

Please don't use stuff like PORTC that is not available to the general GCC developer. They won't be able to reproduce this artifact! This is not specific to the avr part and is related to PR43088.

Write your test case like this:

    typedef unsigned char uint8_t;

    #define PORTC (*((uint8_t volatile*) 0x28))

    void foo(void) {
        static unsigned char count;

        if (++count & 0x3f) {
            PORTC &= ~0x01;
        } else {
            PORTC |= 0x01;
        }
    }

    void foo3e(void) {
        static unsigned char count;

        if (++count & 0x3e) {
            PORTC &= ~0x01;
        } else {
            PORTC |= 0x01;
        }
    }

Compiled the following code from comment #0 and comment #1 with avr-gcc 4.7 (SVN 179594):

    #define uint8_t unsigned char
    #define PORTC (*((uint8_t volatile*) 0x28))

    extern uint8_t data[64];

    uint8_t bar(uint8_t x, uint8_t y) {
        return data[y ^ (x & 0x0f)];
    }

    uint8_t bar2(uint8_t x, uint8_t y) {
        return data[(y ^ x) & 0x0f];
    }

    void foo(void) {
        static unsigned char count;

        if (++count & 0x3f) {
            PORTC &= ~0x01;
        } else {
            PORTC |= 0x01;
        }
    }

Compiling with -Os -dp yields the following result:

    bar:
        andi r24,lo8(15)
        eor r24,r22
        mov r30,r24
        ldi r31,lo8(0)
        subi r30,lo8(-(data))
        sbci r31,hi8(-(data))
        ld r24,Z
        ret

    bar2:
        eor r22,r24
        andi r22,lo8(15)
        mov r30,r22
        ldi r31,lo8(0)
        subi r30,lo8(-(data))
        sbci r31,hi8(-(data))
        ld r24,Z
        ret

    foo:
        lds r24,count.1232
        subi r24,lo8(-(1))
        sts count.1232,r24
        andi r24,lo8(63)
        breq .L4
        cbi 40-0x20,0
        ret
    .L4:
        sbi 40-0x20,0
        ret

Thus, closing this PR as FIXED because the code is optimal and nothing remains to be improved.
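For readers who want to check the integer-promotion point from the report on an ordinary host compiler, here is a minimal sketch. It is not part of the original report. The helper names (promoted_size, test_promoted, test_narrow, check_all_bytes) are hypothetical and chosen only for illustration. The sketch shows that "count & 0x3f" has type int in C, and that doing the same test in 8 bits gives the same answer for every possible byte, which is why the compiler may legitimately narrow the operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the expression "uint8_t & 0x3f": the uint8_t operand is
 * promoted to int, so this is sizeof(int), not 1. */
size_t promoted_size(void)
{
    return sizeof((uint8_t)0 & 0x3f);
}

/* The test as C defines it: mask and compare in promoted int. */
int test_promoted(uint8_t count)
{
    return (count & 0x3f) != 0;
}

/* The same test with the masked value forced back to 8 bits,
 * which is what an 8-bit "andi" followed by "breq" computes. */
int test_narrow(uint8_t count)
{
    uint8_t masked = (uint8_t)(count & 0x3f);
    return masked != 0;
}

/* Exhaustively confirm that both forms agree for all 256 byte values. */
void check_all_bytes(void)
{
    for (unsigned v = 0; v < 256; v++) {
        assert(test_promoted((uint8_t)v) == test_narrow((uint8_t)v));
    }
}
```

Calling check_all_bytes() from a test harness exercises every input. On avr-gcc, sizeof(int) is 2, so the promoted form is the 16-bit operation the report complains about; on a typical desktop host it is 4, but the equivalence argument is the same.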