Bug 9567 - Using struct fields produces worse code than stand-alone vars.
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 3.3
Importance: P3 enhancement
Target Milestone: 4.0.0
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Reported: 2003-02-04 12:16 UTC by Sergei Organov
Modified: 2004-05-13 11:11 UTC

See Also:
Host: i686-pc-linux-gnu
Target: powerpc-rtems
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2004-01-20 17:06:04


Description Sergei Organov 2003-02-04 12:16:01 UTC
In the code below, functions f1() and f2() are equivalent, but the
assembly code produced for f1() is worse than that for f2().
The assembly below demonstrates the result for PowerPC; however, a
similar result can be seen for ix86.

The C/C++ code (note that the code is minimized to demonstrate the problem,
so please ignore the use of uninitialized variables):

struct A {
  char const* src;
  char* dest;
};

void f1() {
  A a;
  for(int i = 0; i < 10; ++i)
    *++a.dest = *++a.src;
}

void f2() {
  char const* src;
  char* dest;
  for(int i = 0; i < 10; ++i)
    *++dest = *++src;
}

The resulting assembly for PowerPC (note the loop body is
4 vs 2 instructions):

$ ~/try-3.2/tools/bin/ppc-rtems-gcc -c -O4 -save-temps -mregnames struct1.cc -o struct1.o
$ cat struct1.s
	.file	"struct1.cc"
	.section	".text"
	.align 2
	.globl _Z2f1v
	.type	_Z2f1v, @function
	li %r3,10
	mtctr %r3
	li %r7,0
	li %r8,0
	addi %r7,%r7,1
	lbz %r4,0(%r7)
	addi %r8,%r8,1
	stb %r4,0(%r8)
	bdnz .L9
	.size	_Z2f1v, .-_Z2f1v
	.align 2
	.globl _Z2f2v
	.type	_Z2f2v, @function
	li %r3,10
	mtctr %r3
	lbzu %r3,1(%r9)
	stbu %r3,1(%r11)
	bdnz .L18
	.size	_Z2f2v, .-_Z2f2v
	.ident	"GCC: (GNU) 3.3 20030203 (prerelease)"

gcc version 3.3 20030203 (prerelease)

Linux 2.4.20 i686

Compile the provided C/C++ code with '-O4 -save-temps' and look at the
resulting assembly.
Comment 1 Andrew Pinski 2003-05-26 00:17:57 UTC
Still happens on the mainline (20030525):
        li r2,10
        li r11,0
        mtctr r2
        li r12,0
        addi r11,r11,1
        addi r12,r12,1
        lbz r2,0(r11)
        stb r2,0(r12)
        bdnz L8
        .align 2
        .globl __Z2f2v
        li r3,10
        mtctr r3
        lbzu r3,1(r2)
        stbu r3,1(r9)
        bdnz L19
Comment 2 Falk Hueffner 2003-05-26 06:58:14 UTC
Please provide a test case that doesn't have undefined behaviour. gcc
has to do alias analysis to determine whether dest might point to src.
It can't do that properly if you use uninitialized variables.
Comment 3 Dara Hazeghi 2003-06-20 21:41:10 UTC
Just a reminder that this bug is awaiting feedback. Can you provide the type of testcase Falk
requested? Thanks.
Comment 4 Andrew Pinski 2003-07-27 20:56:25 UTC
I can provide one:
struct A {
  char const* src;
  char* dest;
};

void f1(char const** src, char** dest) {
  A a;
  a.src = *src;
  a.dest = *dest;
  for(int i = 0; i < 10; ++i)
    *++a.dest = *++a.src;
  *src = a.src;
  *dest = a.dest;
}

void f2(char const** src1, char** dest1) {
  char const* src = *src1;
  char* dest = *dest1;
  for(int i = 0; i < 10; ++i)
    *++dest = *++src;
  *src1 = src;
  *dest1 = dest;
}
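
A minimal harness sketch for the test case above, in case it helps confirm that the two functions really are equivalent before comparing their assembly. The helper name both_agree() and the buffer contents are assumptions made for illustration; they are not part of the original report.

```cpp
#include <cassert>
#include <cstring>

struct A {
  char const* src;
  char* dest;
};

// Struct-field variant from the test case.
void f1(char const** src, char** dest) {
  A a;
  a.src = *src;
  a.dest = *dest;
  for (int i = 0; i < 10; ++i)
    *++a.dest = *++a.src;
  *src = a.src;
  *dest = a.dest;
}

// Stand-alone-variable variant from the test case.
void f2(char const** src1, char** dest1) {
  char const* src = *src1;
  char* dest = *dest1;
  for (int i = 0; i < 10; ++i)
    *++dest = *++src;
  *src1 = src;
  *dest1 = dest;
}

// Hypothetical helper: runs both variants on separate, fully
// initialized buffers and reports whether the results match.
bool both_agree() {
  char const input[12] = "0123456789A";  // 11 characters plus NUL
  char out1[12] = {0};
  char out2[12] = {0};

  char const* s1 = input;
  char* d1 = out1;
  f1(&s1, &d1);

  char const* s2 = input;
  char* d2 = out2;
  f2(&s2, &d2);

  // Each loop pre-increments ten times, so both pointers advance by 10.
  assert(s1 - input == 10 && s2 - input == 10);
  assert(d1 - out1 == 10 && d2 - out2 == 10);

  return std::memcmp(out1, out2, sizeof out1) == 0;
}
```

Calling both_agree() from a small main() and compiling the same file with '-O2 -save-temps' should leave both loops in the generated assembly for side-by-side comparison, since neither function relies on uninitialized data.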
Comment 5 Andrew Pinski 2003-11-10 08:14:21 UTC
This is another case where GCC likes to put objects on the stack too soon.
Comment 6 Andrew Pinski 2003-12-01 04:18:54 UTC
It is even worse now on the tree-ssa branch (extra mr instructions, which are really caused by the register allocator).
Comment 7 Andrew Pinski 2004-03-03 06:09:02 UTC
Suspending, as this is now fixed on the tree-ssa branch.
Comment 8 Andrew Pinski 2004-05-13 11:11:08 UTC
Fixed by the merge of the tree-ssa branch into the mainline.