This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Question on cse_not_expected in explow.c:memory_address_addr_space()

From: Georg-Johann Lay <avr at gjlay dot de>
To: gcc at gcc dot gnu dot org
Date: Wed, 28 Sep 2011 14:14:15 +0200
Subject: Question on cse_not_expected in explow.c:memory_address_addr_space()

Hi, looking into PR50448 there is the following C code:

typedef struct
{
    unsigned char a,b,c,d;
} SPI_t;

#define SPIE (*(SPI_t volatile*) 0x0AC0)

void foo (void)
{
    SPIE.d = 0xAA;
    while (!(SPIE.c & 0x80));

    SPIE.d = 0xBB;
    while (!(SPIE.c & 0x80));
}

At .optimized, the .c and .d struct accesses are direct:

  ...
  D.1985_3 ={v} MEM[(volatile struct SPI_t *)2752B].c;
  D.1986_4 = (signed char) D.1985_3;
  ...

  MEM[(volatile struct SPI_t *)2752B].d ={v} 187;
  ...

Then in explow.c:memory_address_addr_space() there is:

  /* By passing constant addresses through registers
     we get a chance to cse them.  */
  if (! cse_not_expected && CONSTANT_P (x) && CONSTANT_ADDRESS_P (x))
    x = force_reg (address_mode, x);

So that in .expand the code is

...
(insn 5 4 6 3 (set (reg/f:HI 46)
        (const_int 2752 [0xac0])) pr50448.c:10 -1
     (nil))

(insn 6 5 7 3 (set (reg:QI 47)
        (const_int -86 [0xffffffaa])) pr50448.c:10 -1
     (nil))

(insn 7 6 11 3 (set (mem/s/v:QI (plus:HI (reg/f:HI 46)
                (const_int 3 [0x3])) [0 MEM[(volatile struct SPI_t *)2752B].d+0 S1 A8])
        (reg:QI 47)) pr50448.c:10 -1
     (nil))
...

This leads to unpleasant code. The machine can access all RAM locations by
direct addressing, yet the resulting code is:

foo:
	ldi r24,lo8(-86)	 ;  6	*movqi/2	[length = 1]
	ldi r30,lo8(-64)	 ;  34	*movhi/5	[length = 2]
	ldi r31,lo8(10)
	std Z+3,r24	 ;  7	*movqi/3	[length = 1]
.L2:
	lds r24,2754	 ;  10	*movqi/4	[length = 2]
	sbrs r24,7	 ;  43	*sbrx_branchhi	[length = 2]
	rjmp .L2
	ldi r24,lo8(-69)	 ;  16	*movqi/2	[length = 1]
	ldi r30,lo8(-64)	 ;  33	*movhi/5	[length = 2]
	ldi r31,lo8(10)
	std Z+3,r24	 ;  17	*movqi/3	[length = 1]
.L3:
	lds r24,2754	 ;  20	*movqi/4	[length = 2]
	sbrs r24,7	 ;  42	*sbrx_branchhi	[length = 2]
	rjmp .L3
	ret	 ;  39	return	[length = 1]

Insn 34 loads 2752 (0xAC0) into r30/r31 (Z), and insn 7 then does an indirect
access through it (*(Z+3), i.e. *2755).  The same happens again with insn 33
(load 2752) and insn 17 (access).
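
The address arithmetic behind the listing can be checked on a host compiler.
The following is only a sketch: the helper names (lo8, z_pointer,
indirect_store_address, direct_load_address) and the SPIE_BASE macro are
made up for illustration and are not part of GCC or avr-libc. It assumes
lo8() keeps the low 8 bits of its operand and that Z is r31:r30.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    unsigned char a, b, c, d;
} SPI_t;

/* Base address of SPIE from the test case (hypothetical macro name).  */
#define SPIE_BASE 0x0AC0u

/* Model of the assembler's lo8(): keep the low 8 bits of the operand.  */
static uint8_t lo8 (int value)
{
    return (uint8_t) (value & 0xFF);
}

/* Z is the register pair r31:r30 (high byte : low byte).  */
static uint16_t z_pointer (uint8_t r30, uint8_t r31)
{
    return (uint16_t) (((unsigned) r31 << 8) | r30);
}

/* Address written by "std Z+3,r24" after "ldi r30,lo8(-64)" and
   "ldi r31,lo8(10)".  */
static uint16_t indirect_store_address (void)
{
    uint16_t z = z_pointer (lo8 (-64), lo8 (10));
    return (uint16_t) (z + offsetof (SPI_t, d));
}

/* Address read by the direct "lds r24,2754".  */
static uint16_t direct_load_address (void)
{
    return (uint16_t) (SPIE_BASE + offsetof (SPI_t, c));
}
```

Under these assumptions Z holds 2752 (0x0AC0), the indirect store hits
2755 (field d) and the direct load reads 2754 (field c), which matches the
listing.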

Is there a way to avoid this? I tried -f[no-]rerun-cse-after-loop, but it had
no effect; neither did -Os/-O2 or patching rtx_costs. cse_not_expected is
overridden in some places in the middle-end.

What's the preferred way to avoid such "optimization" in the back-end? At least
for CONST_INT addresses?

AVR has just 2 pointer registers that can do such an access (and only one
besides the frame pointer), so it's not very likely that a register is free.
CSEing too much might lead to spills or to data living in the frame, causing
bulky loads/stores from/to the stack.

Direct access is faster than loading the constant and then doing an indirect
access (which is 3 times slower). And unless all accesses relative to the base
address 0xAC0 = 2752 get factored out into Z+2 and Z+3 accesses, the code is
bulkier, too.
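
The size argument can be put into numbers with a small model. This is a
sketch, not a statement about GCC's cost model: the word counts are read off
the [length = N] annotations in the listing above, the function and constant
names are made up for illustration, and it assumes that a direct store (sts)
takes 2 words like the direct load (lds).

```c
/* Instruction lengths in words, from the listing:
   load Z (*movhi/5) = 2, std Z+q (*movqi/3) = 1, lds (*movqi/4) = 2.
   ASSUMPTION: a direct store (sts) is 2 words, like lds.  */
enum
{
    LOAD_Z_WORDS = 2,
    INDIRECT_ACCESS_WORDS = 1,
    DIRECT_ACCESS_WORDS = 2
};

/* Code size if every access uses direct addressing.  */
static unsigned words_direct (unsigned n_accesses)
{
    return DIRECT_ACCESS_WORDS * n_accesses;
}

/* Code size if the accesses go through Z and the base address has to be
   loaded into Z n_base_loads times.  */
static unsigned words_via_z (unsigned n_accesses, unsigned n_base_loads)
{
    return LOAD_Z_WORDS * n_base_loads
           + INDIRECT_ACCESS_WORDS * n_accesses;
}
```

For the two stores in foo, where Z is reloaded before each store,
words_via_z (2, 2) gives 6 words against words_direct (2) = 4. Going through
Z only pays off in size if one load of the base serves several accesses,
e.g. words_via_z (4, 1) = 6 against words_direct (4) = 8, and that requires
Z to stay live across both loops.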

Johann

Follow-Ups:
- Re: Question on cse_not_expected in explow.c:memory_address_addr_space()
  - From: Paolo Bonzini
