This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Question on cse_not_expected in explow.c:memory_address_addr_space()
- From: Georg-Johann Lay <avr at gjlay dot de>
- To: gcc at gcc dot gnu dot org
- Date: Wed, 28 Sep 2011 14:14:15 +0200
- Subject: Question on cse_not_expected in explow.c:memory_address_addr_space()
Hi, while looking into PR50448 I came across the following C code:
typedef struct
{
  unsigned char a,b,c,d;
} SPI_t;

#define SPIE (*(SPI_t volatile*) 0x0AC0)

void foo (void)
{
  SPIE.d = 0xAA;
  while (!(SPIE.c & 0x80));
  SPIE.d = 0xBB;
  while (!(SPIE.c & 0x80));
}
At .optimized, the .c and .d struct accesses are direct:
...
D.1985_3 ={v} MEM[(volatile struct SPI_t *)2752B].c;
D.1986_4 = (signed char) D.1985_3;
...
MEM[(volatile struct SPI_t *)2752B].d ={v} 187;
...
Then in explow.c:memory_address_addr_space() there is:
  /* By passing constant addresses through registers
     we get a chance to cse them. */
  if (! cse_not_expected && CONSTANT_P (x) && CONSTANT_ADDRESS_P (x))
    x = force_reg (address_mode, x);
As a result, the code in .expand is:
...
(insn 5 4 6 3 (set (reg/f:HI 46)
        (const_int 2752 [0xac0])) pr50448.c:10 -1
     (nil))
(insn 6 5 7 3 (set (reg:QI 47)
        (const_int -86 [0xffffffaa])) pr50448.c:10 -1
     (nil))
(insn 7 6 11 3 (set (mem/s/v:QI (plus:HI (reg/f:HI 46)
                (const_int 3 [0x3])) [0 MEM[(volatile struct SPI_t *)2752B].d+0 S1 A8])
        (reg:QI 47)) pr50448.c:10 -1
     (nil))
...
This leads to unpleasant code. The machine can access all RAM locations by
direct addressing. However, the resulting code is:
foo:
    ldi r24,lo8(-86)     ; 6 *movqi/2 [length = 1]
    ldi r30,lo8(-64)     ; 34 *movhi/5 [length = 2]
    ldi r31,lo8(10)
    std Z+3,r24          ; 7 *movqi/3 [length = 1]
.L2:
    lds r24,2754         ; 10 *movqi/4 [length = 2]
    sbrs r24,7           ; 43 *sbrx_branchhi [length = 2]
    rjmp .L2
    ldi r24,lo8(-69)     ; 16 *movqi/2 [length = 1]
    ldi r30,lo8(-64)     ; 33 *movhi/5 [length = 2]
    ldi r31,lo8(10)
    std Z+3,r24          ; 17 *movqi/3 [length = 1]
.L3:
    lds r24,2754         ; 20 *movqi/4 [length = 2]
    sbrs r24,7           ; 42 *sbrx_branchhi [length = 2]
    rjmp .L3
    ret                  ; 39 return [length = 1]
Insn 34 loads 2752 (0xAC0) into r30/r31 (Z), and insn 7 then does an indirect
access through it (*(Z+3), i.e. *2755). The same happens again with insn 33
(load 2752) and insn 17 (access).
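
The address arithmetic in that listing can be checked with a small host-side C
sketch. It assumes SPI_t has no padding between its four unsigned char members,
and the helper names lo8_signed, hi8 and field_address are hypothetical, chosen
only to mirror the assembler's lo8() notation; they are not part of GCC or
avr-libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
  unsigned char a, b, c, d;
} SPI_t;

/* Base address of SPIE, taken from the #define in the test case. */
#define SPIE_BASE 0x0AC0u

/* Low byte of a 16-bit value, reported as the signed number that shows up
   in the ldi operands (hypothetical helper). */
static int lo8_signed (uint16_t value)
{
  int byte = value & 0xFF;
  return byte >= 0x80 ? byte - 0x100 : byte;
}

/* High byte of a 16-bit value (hypothetical helper). */
static int hi8 (uint16_t value)
{
  return (value >> 8) & 0xFF;
}

/* Absolute address of the SPIE member at the given byte offset
   (hypothetical helper). */
static unsigned field_address (size_t offset)
{
  assert (offset < sizeof (SPI_t));
  return SPIE_BASE + (unsigned) offset;
}
```

With these, field_address (offsetof (SPI_t, c)) is the 2754 used by the lds
instructions, field_address (offsetof (SPI_t, d)) is the 2755 reached through
Z+3, and lo8_signed (SPIE_BASE) and hi8 (SPIE_BASE) give the -64 and 10 that
are loaded into r30/r31.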
Is there a way to avoid this? I tried -f[no-]rerun-cse-after-loop without
effect; the same goes for -Os/-O2 and for patching rtx_costs. cse_not_expected
is overridden in some places in the middle-end.
What's the preferred way to avoid such "optimization" in the back-end? At least
for CONST_INT addresses?
AVR has just 2 pointer registers that can do such an access (and only one
besides the frame pointer), so it's not very likely that a register is free.
CSEing too much might lead to spills or to data living in the frame, causing
bulky loads/stores from/to the stack.
Direct access is faster than loading the constant and then doing an indirect
access (which is 3 times slower). And if not all uses of the base address
0xAC0 = 2752 get factored out into Z+2 and Z+3 accesses, respectively, the
code is bulkier, too.
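
The code-size side of this trade-off can be made concrete with a small C sketch
that counts 16-bit program words. It assumes the usual AVR instruction sizes
(lds/sts take 2 words, ldd/std and ldi take 1 word each, which matches the
[length = N] annotations in the listing) and that the base address stays live
in Z across all accesses that share it. The function names are hypothetical,
and the model says nothing about cycle counts or register pressure.

```c
#include <assert.h>

enum
{
  WORDS_LDI_PAIR = 2, /* ldi r30 + ldi r31: load a 16-bit base into Z */
  WORDS_INDIRECT = 1, /* std Z+q,Rr or ldd Rd,Z+q */
  WORDS_DIRECT   = 2  /* sts k,Rr or lds Rd,k */
};

/* Words needed when the base is loaded once and shared by n accesses. */
static int words_shared_base (int n)
{
  assert (n >= 0);
  return n == 0 ? 0 : WORDS_LDI_PAIR + n * WORDS_INDIRECT;
}

/* Words needed when the base is reloaded before every access,
   as happens for the two stores in foo. */
static int words_reloaded_base (int n)
{
  assert (n >= 0);
  return n * (WORDS_LDI_PAIR + WORDS_INDIRECT);
}

/* Words needed when every access uses direct addressing. */
static int words_direct (int n)
{
  assert (n >= 0);
  return n * WORDS_DIRECT;
}
```

For the two stores in foo, the generated sequence corresponds to
words_reloaded_base (2) = 6 words against words_direct (2) = 4. Even with a
perfectly shared base, words_shared_base (n) only drops below words_direct (n)
from n = 3 accesses on.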
Johann