Optimizing 32 bits integer manipulation on 8 bit AVR target
Sylvain Leroux
sylvain@chicoree.fr
Thu Aug 30 18:05:00 GMT 2012
As a complement to my previous message: it appears that the following
C source leads to much better code:
---------8<-----------------------------------
void f(uint32_t i) {
    union __attribute__((__packed__)) {
        uint32_t i;
        struct S { uint8_t a,b,c,d; } s;
    } u;
    u.i = i | ((uint32_t)(0xFF) << 16);
    DDRA = (uint32_t)(u.s.d);
    DDRA = (uint32_t)(u.s.c);
    DDRA = (uint32_t)(u.s.b);
    DDRA = (uint32_t)(u.s.a);
}
---------8<-----------------------------------
avr-objdump:
---------8<-----------------------------------
union __attribute__((__packed__)) {
uint32_t i;
struct S { uint8_t a,b,c,d; } s;
} u;
u.i = i | ((uint32_t)(0xFF) << 16);
58: af 6f ori r26, 0xFF ; 255
DDRA = (uint32_t)(u.s.d);
5a: ba bb out 0x1a, r27 ; 26
DDRA = (uint32_t)(u.s.c);
5c: aa bb out 0x1a, r26 ; 26
DDRA = (uint32_t)(u.s.b);
5e: 9a bb out 0x1a, r25 ; 26
DDRA = (uint32_t)(u.s.a);
60: 8a bb out 0x1a, r24 ; 26
---------8<-----------------------------------
*But* the C code is no longer portable, since I'm using
"__attribute__((__packed__))". Moreover, it relies on an assumption
about endianness.
That's why I was hoping for a command line option allowing gcc to
perform the same optimization.
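
To make the endianness point concrete, here is a minimal host-side sketch
(not AVR-specific, and not taken from the code above). The helper names
byte_by_shift, byte_by_union and is_little_endian are hypothetical, chosen
only for illustration. The shift-based accessor returns the same byte on any
target, while the union-based one returns whatever byte happens to sit at
that memory offset:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come
   from the original post. */

/* Byte n (0 = least significant) of v, extracted with a shift.
   The result does not depend on the target's byte order. */
uint8_t byte_by_shift(uint32_t v, unsigned n)
{
    return (uint8_t)(v >> (8u * n));
}

/* Byte n of v as stored in memory, read through a union.
   Which significance that byte has depends on endianness. */
uint8_t byte_by_union(uint32_t v, unsigned n)
{
    union {
        uint32_t i;
        uint8_t  b[4];
    } u;

    u.i = v;
    return u.b[n];
}

/* Returns 1 if the lowest-addressed byte holds the least significant
   bits (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    return byte_by_union(1u, 0) == 1u;
}
```

With v = 0x01020304, byte_by_shift(v, 0) is 0x04 everywhere, whereas
byte_by_union(v, 0) is 0x04 only on a little-endian target. That dependency
is what the union version of f() bakes in.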
- Sylvain
On 08/30/2012 02:54 PM, Sylvain Leroux wrote:
> Hi,
>
> It seems to me that avr-gcc/avr-g++ is producing sub-optimal code for
> the 'f' function in the following source code:
>
> ---------8<-----------------------------------
> #include <avr/io.h>
>
> void f(uint32_t i) {
> i |= ((uint32_t)(0xFF) << 16);
>
> /* DDRA is an 8 bit register */
> DDRA = (uint32_t)(i);
> DDRA = (uint32_t)(i>>8);
> DDRA = (uint32_t)(i>>16);
> DDRA = (uint32_t)(i>>24);
> }
>
> int main() {
> volatile uint32_t n = 0x01020304;
>
> f(n);
> }
> ---------8<-----------------------------------
> Having compiled with the following options:
> avr-gcc c.c -mmcu=attiny2313
> -Os -ffunction-sections -fdata-sections
> -g -Wl,--gc-sections -Wl,--print-gc-sections
> -fipa-cp -fcprop-registers -fweb
>
> ... here is the relevant fragment as displayed by avr-objdump. I marked
> with a star (*) all the instructions that appear to be useless:
> ---------8<-----------------------------------
> void f(uint32_t i) {
> i |= ((uint32_t)(0xFF) << 16);
> 34: 8f 6f ori r24, 0xFF ; 255
>
> DDRA = (uint32_t)(i);
> 36: 6a bb out 0x1a, r22 ; 26
> DDRA = (uint32_t)(i>>8);
> 38: 27 2f mov r18, r23
> * 3a: 38 2f mov r19, r24
> * 3c: 49 2f mov r20, r25
> * 3e: 55 27 eor r21, r21
> 40: 2a bb out 0x1a, r18 ; 26
> DDRA = (uint32_t)(i>>16);
> 42: 9c 01 movw r18, r24
> * 44: 44 27 eor r20, r20
> * 46: 55 27 eor r21, r21
> 48: 2a bb out 0x1a, r18 ; 26
> DDRA = (uint32_t)(i>>24);
> 4a: 69 2f mov r22, r25
> * 4c: 77 27 eor r23, r23
> * 4e: 88 27 eor r24, r24
> * 50: 99 27 eor r25, r25
> 52: 6a bb out 0x1a, r22 ; 26
> }
> 54: 08 95 ret
> ---------8<-----------------------------------
>
> Both gcc and g++ produce the same code. And I get the same results both
> with 4.3.5 and 4.7.1
>
> Here is my question:
> Are there any options that would help gcc avoid producing those extra
> instructions in such a case?
>
>
> Regards,
> - Sylvain
>
>
>
--
-- Sylvain Leroux
-- sylvain@chicoree.fr
-- http://www.chicoree.fr