Bug 50448 - [4.5/4.6/4.7 Regression] Missed optimization accessing struct component with integer address
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.6.1
Importance: P2 normal
Target Milestone: 4.7.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2011-09-18 13:22 UTC by Georg-Johann Lay
Modified: 2011-12-31 11:31 UTC

See Also:
Host:
Target: avr
Build:
Known to work: 3.4.6
Known to fail: 4.3.3, 4.5.2, 4.6.1
Last reconfirmed: 2011-09-18 00:00:00


Description Georg-Johann Lay 2011-09-18 13:22:59 UTC
typedef struct
{
    unsigned char a,b,c,d;
} SPI_t;

#define SPIE (*(SPI_t volatile*) 0x0AC0)

void foo (void)
{
    SPIE.d = 0xAA;
    while (!(SPIE.c & 0x80));

    SPIE.d = 0xBB;
    while (!(SPIE.c & 0x80));
}

avr-gcc-4.6.1 -Os -S -fdump-tree-optimized -fdump-rtl-expand 
compiles that code to

foo:
	ldi r24,lo8(-86)
	ldi r30,lo8(2752)
	ldi r31,hi8(2752)
	std Z+3,r24
.L2:
	lds r24,2754
	sbrs r24,7
	rjmp .L2
	ldi r24,lo8(-69)
	ldi r30,lo8(2752)
	ldi r31,hi8(2752)
	std Z+3,r24
.L3:
	lds r24,2754
	sbrs r24,7
	rjmp .L3
	ret

Instead of loading the address 2752 twice, it would suffice to load it once, or to access 2755 directly (as is already done for the reads from 2754) and avoid loading the constant altogether.
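
For reference, the absolute addresses mentioned above are just the base address plus the member offsets. The following is a minimal sketch, assuming C11 (for _Static_assert) and a struct layout without padding; SPIE_BASE and SPIE_ADDR are illustrative names that do not appear in the test case.

```c
#include <stddef.h>

typedef struct
{
    unsigned char a, b, c, d;
} SPI_t;

/* Base address from the test case: 0x0AC0 == 2752 (illustrative name). */
enum { SPIE_BASE = 0x0AC0 };

/* Absolute address of one member: base plus its offset in SPI_t.
   Assumes the four unsigned char members are laid out without padding. */
#define SPIE_ADDR(member) (SPIE_BASE + offsetof (SPI_t, member))

/* Compile-time checks of the numbers used in the report. */
_Static_assert (SPIE_BASE == 2752, "base address");
_Static_assert (SPIE_ADDR (c) == 2754, "address read by lds");
_Static_assert (SPIE_ADDR (d) == 2755, "address that could be stored to directly");
```

Since both the base and the offsets are compile-time constants, the store to SPIE.d needs no address register at all, just as the load from SPIE.c does not.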

The load first appears in .expand; the .optimized dump looks fine:

foo ()
{
  signed char D.1932;
  volatile unsigned char D.1931;
  signed char D.1930;
  volatile unsigned char D.1929;

<bb 2>:
  MEM[(volatile struct SPI_t *)2752B].d ={v} 170;

<bb 3>:
  D.1929_3 ={v} MEM[(volatile struct SPI_t *)2752B].c;
  D.1930_4 = (signed char) D.1929_3;
  if (D.1930_4 >= 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 4>:
  MEM[(volatile struct SPI_t *)2752B].d ={v} 187;

<bb 5>:
  D.1931_7 ={v} MEM[(volatile struct SPI_t *)2752B].c;
  D.1932_8 = (signed char) D.1931_7;
  if (D.1932_8 >= 0)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 6>:
  return;
}
Comment 1 Georg-Johann Lay 2011-09-29 15:55:05 UTC
As explained in http://gcc.gnu.org/ml/gcc/2011-09/msg00353.html, this looks like a middle-end flaw during tree -> RTL lowering in explow.c:memory_address_addr_space(), which the target cannot do anything about.

Changed component from TARGET to MIDDLE-END.
Comment 2 Richard Biener 2011-10-27 10:24:56 UTC
Well, works "fine" on x86_64:

foo:
.LFB0:
        .cfi_startproc
        movb    $-86, 2755
.L2:
        movb    2754, %al
        testb   %al, %al
        jns     .L2
        movl    $2752, %eax
        movb    $-69, 3(%rax)
.L3:
        movb    2754, %al
        testb   %al, %al
        jns     .L3
        ret
Comment 3 Georg-Johann Lay 2011-10-28 12:59:28 UTC
The issue is still present for avr (4.7 trunk r180399).

Paolo has proposed a patch, reproduced below, that fixes the issue.

Could one of you integrate that patch? I have no access to the compile farm and cannot test all the languages/targets/hosts that might be affected.

Index: cprop.c
===================================================================
--- cprop.c	(revision 177688)
+++ cprop.c	(working copy)
@@ -764,6 +764,18 @@ try_replace_reg (rtx from, rtx to, rtx i
 	note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
     }
 
+  if (set && MEM_P (SET_DEST (set)) && reg_mentioned_p (from, SET_DEST (set)))
+    {
+      /* If above failed and this is a single set, try to simplify the source of
+	 the set given our substitution.  We could perhaps try this for multiple
+	 SETs, but it probably won't buy us anything.  */
+      rtx addr = simplify_replace_rtx (SET_DEST (set), from, to);
+
+      if (!rtx_equal_p (addr, SET_DEST (set))
+	  && validate_change (insn, &SET_DEST (set), addr, 0))
+	success = 1;
+    }
+
   /* REG_EQUAL may get simplified into register.
      We don't allow that. Remove that note. This code ought
      not to happen, because previous code ought to synthesize
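
The idea behind the hunk above can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use GCC's rtx data structures, and toy_addr, toy_addr_is_valid, toy_replace_in_dest and toy_self_check are hypothetical names. It mimics the three steps of the patch: check that the register is mentioned in the destination address, substitute the known constant and fold, and keep the result only if the new address is acceptable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy destination address: either (base register + offset) or an
   absolute constant address.  Not GCC's representation. */
typedef struct
{
    bool has_base_reg;
    int base_reg;   /* meaningful only if has_base_reg is true */
    long offset;    /* displacement, or the absolute address */
} toy_addr;

/* Hypothetical stand-in for the target's address validation:
   absolute addresses must fit in 16 bits, displacements in 0..63. */
bool toy_addr_is_valid (toy_addr a)
{
    if (a.has_base_reg)
        return a.offset >= 0 && a.offset <= 63;
    return a.offset >= 0 && a.offset <= 0xFFFF;
}

/* If DEST uses FROM_REG as its base and FROM_REG is known to hold
   TO_CONST, fold the constant into the address.  Returns true only
   if DEST was changed. */
bool toy_replace_in_dest (toy_addr *dest, int from_reg, long to_const)
{
    if (!dest->has_base_reg || dest->base_reg != from_reg)
        return false;                /* register not mentioned */

    toy_addr folded = { false, 0, to_const + dest->offset };
    if (!toy_addr_is_valid (folded))
        return false;                /* keep the original address */

    *dest = folded;
    return true;
}

/* Models the "std Z+3" store from comment #0 with Z = 2752. */
void toy_self_check (void)
{
    toy_addr dest = { true, 30, 3 };
    bool changed = toy_replace_in_dest (&dest, 30, 2752);
    assert (changed);
    assert (!dest.has_base_reg && dest.offset == 2755);
}
```

In this model the store through register 30 with displacement 3 becomes a store to absolute address 2755, which corresponds to replacing the ldi/ldi/std sequence with a direct store.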
Comment 4 Paolo Bonzini 2011-10-28 14:26:57 UTC
Can't you just test on x86_64-linux?
Comment 5 Georg-Johann Lay 2011-11-03 11:01:55 UTC
(In reply to comment #0)

> foo:
>     ldi r24,lo8(-86)
>     ldi r30,lo8(2752)
>     ldi r31,hi8(2752)
>     std Z+3,r24
> .L2:
>     lds r24,2754
>     sbrs r24,7
>     rjmp .L2
>     ldi r24,lo8(-69)
>     ldi r30,lo8(2752)
>     ldi r31,hi8(2752)
>     std Z+3,r24
>     [...]

This is the code generated with Paolo's patch applied:

foo:
	ldi r24,lo8(-86)
	sts 2755,r24
.L2:
	lds r24,2754
	sbrs r24,7
	rjmp .L2
	ldi r24,lo8(-69)
	sts 2755,r24
	[...]
Comment 6 Georg-Johann Lay 2011-11-05 13:08:57 UTC
Author: gjl
Date: Sat Nov  5 13:08:54 2011
New Revision: 181011

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181011
Log:
	PR rtl-optimization/50448
	* cprop.c (try_replace_reg): Also try to replace uses of FROM that
	appear in SET_DEST.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cprop.c
Comment 7 Georg-Johann Lay 2011-11-05 20:37:25 UTC
Fixed in 4.7.0