Bug 21529 - [4.0/4.1 Regression] code size regression (+40%) with -Os from GCC-3.4.3 to 4.1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.1.0
Importance: P2 normal
Target Milestone: 4.0.2
Assignee: Richard Henderson
URL:
Keywords: missed-optimization, ra
Duplicates: 21680
Depends on:
Blocks:
 
Reported: 2005-05-12 09:41 UTC by etienne_lorrain
Modified: 2005-08-05 20:39 UTC (History)

See Also:
Host: i486-pc-linux-gnu
Target: i486-pc-linux-gnu
Build: i486-pc-linux-gnu
Known to work: 3.4.3
Known to fail: 4.0.0 4.1.0
Last reconfirmed: 2005-08-04 22:54:19


Attachments
Another testcase showing ENORMOUS regression (1.28 KB, text/plain)
2005-05-20 15:48 UTC, Giovanni Bajo

Description etienne_lorrain 2005-05-12 09:41:59 UTC
Compiling this code with -Os produces output more than 40% larger with GCC 4.1
than with GCC 3.4.3.
See also thread: http://gcc.gnu.org/ml/gcc/2005-05/msg00532.html
>>>>
struct disk_interface_str {
    unsigned    nb_IDE_found;
    struct IDE_found_str {
        unsigned short  ideIOadr;
        unsigned short  ideIOctrladr;
        unsigned char   irq;
        unsigned char   bios_order;
        unsigned short  reserved;
        } *IDE_found;
    } DI;

void reorder_IDE_for_linux (void)
  {
  static const unsigned short idearray[] = {
        0x1F0, 0x170,
        0x1E8, 0x168,
        0x1E0, 0x160,
        };
  unsigned short cpt, order;

  for (order = 0; order < sizeof(idearray)/sizeof(idearray[0]); order++) {
      for (cpt = order + 1; cpt < DI.nb_IDE_found; cpt++)
          if (DI.IDE_found[cpt].ideIOadr == idearray[order])
              break;
      if (cpt < DI.nb_IDE_found) {
          struct IDE_found_str save = DI.IDE_found[cpt];
          unsigned short i;

          for (i = order; i < cpt; i++) {
              struct IDE_found_str tmp = DI.IDE_found[i];
              DI.IDE_found[i] = save;
              save = tmp;
              }
          DI.IDE_found[cpt] = save;
          }
      }
  }
<<<<
Comment 1 Steven Bosscher 2005-05-12 10:39:02 UTC
Yada yada yada, you know the drill.  SRA, out-of-ssa, and register 
allocation all working against each other: 
 
<L6>:; 
  D.1605 = DI.IDE_found + (struct IDE_found_str *) ((long unsigned int) i * 8); 
  tmp$reserved = D.1605->reserved; 
  tmp$bios_order = D.1605->bios_order; 
  tmp$irq = D.1605->irq; 
  tmp$ideIOctrladr = D.1605->ideIOctrladr; 
  tmp$ideIOadr = D.1605->ideIOadr; 
  D.1605->reserved = save$reserved; 
  D.1605->bios_order = save$bios_order; 
  D.1605->irq = save$irq; 
  D.1605->ideIOctrladr = save$ideIOctrladr; 
  D.1605->ideIOadr = save$ideIOadr; 
  i = i + 1; 
  save$reserved = tmp$reserved; 
  save$bios_order = tmp$bios_order; 
  save$irq = tmp$irq; 
  save$ideIOctrladr = tmp$ideIOctrladr; 
  save$ideIOadr = tmp$ideIOadr; 
 
Wouldn't a block move be more efficient here than moving things one-by-one? 
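
To make the question concrete, here is a minimal C sketch of the two copy
shapes for the testcase's struct. The function names swap_by_field and
swap_by_block are made up for illustration. The first mirrors the scalarized
dump above; the second leaves each copy as a whole-struct assignment, which the
compiler is free to lower to a block move. Whether the block form actually
comes out smaller depends on the target and the compiler version.

```c
#include <assert.h>

/* Same member layout as struct IDE_found_str in the testcase. */
struct IDE_found_str {
    unsigned short ideIOadr;
    unsigned short ideIOctrladr;
    unsigned char  irq;
    unsigned char  bios_order;
    unsigned short reserved;
};

/* Element-wise exchange: one load and one store per member for each
   of the three copies, as in the scalarized dump. */
void swap_by_field(struct IDE_found_str *slot, struct IDE_found_str *save)
{
    struct IDE_found_str tmp;

    assert(slot && save);

    tmp.ideIOadr     = slot->ideIOadr;
    tmp.ideIOctrladr = slot->ideIOctrladr;
    tmp.irq          = slot->irq;
    tmp.bios_order   = slot->bios_order;
    tmp.reserved     = slot->reserved;

    slot->ideIOadr     = save->ideIOadr;
    slot->ideIOctrladr = save->ideIOctrladr;
    slot->irq          = save->irq;
    slot->bios_order   = save->bios_order;
    slot->reserved     = save->reserved;

    save->ideIOadr     = tmp.ideIOadr;
    save->ideIOctrladr = tmp.ideIOctrladr;
    save->irq          = tmp.irq;
    save->bios_order   = tmp.bios_order;
    save->reserved     = tmp.reserved;
}

/* Block exchange: three whole-struct assignments. */
void swap_by_block(struct IDE_found_str *slot, struct IDE_found_str *save)
{
    struct IDE_found_str tmp;

    assert(slot && save);

    tmp   = *slot;
    *slot = *save;
    *save = tmp;
}
```

Both functions leave memory in the same state; only the shape of the copy the
compiler sees differs.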
 
Comment 2 Giovanni Bajo 2005-05-12 11:48:20 UTC
Maybe SRA could be tuned differently for -Os. RTH, do you think it is feasible, 
or is it only a register allocator problem and should not be handled at the 
tree level at all?
Comment 3 Giovanni Bajo 2005-05-20 15:48:57 UTC
Created attachment 8937
Another testcase showing ENORMOUS regression

Another testcase showing a very big regression at -Os related to SRA:

$ ./xgcc -c -Os -B. btst.c && size btst.o
   text    data     bss     dec     hex filename
   5339       0       0    5339    14db btst.o
$ ./xgcc -c -Os -fno-tree-sra -B. btst.c && size btst.o
   text    data     bss     dec     hex filename
    224       0       0     224      e0 btst.o

(GCC 3.3 generates a text section of 261 bytes).
Comment 4 Giovanni Bajo 2005-05-20 15:49:30 UTC
*** Bug 21680 has been marked as a duplicate of this bug. ***
Comment 5 Giovanni Bajo 2005-05-20 15:50:34 UTC
Notice that both testcases come from the same program (Gujin).
Comment 6 GCC Commits 2005-08-05 02:42:19 UTC
Subject: Bug 21529

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-08-05 02:42:07

Modified files:
	gcc            : ChangeLog params.def params.h tree-sra.c 

Log message:
	PR 21529
	* params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New.
	* params.h (SRA_MAX_STRUCTURE_COUNT): New.
	* tree-sra.c (decide_block_copy): Use it.  Disable element copy
	if we'd have to instantiate too many members.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9660&r2=2.9661
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.def.diff?cvsroot=gcc&r1=1.65&r2=1.66
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.h.diff?cvsroot=gcc&r1=1.31&r2=1.32
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-sra.c.diff?cvsroot=gcc&r1=2.69&r2=2.70
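
The log message above describes a member-count limit on element copies. As a
rough illustration only, the following C sketch shows what a check of that kind
could look like. It is not the code from tree-sra.c: the names copy_limits and
use_element_copy, the additional size limit, and any threshold values passed in
are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limits, standing in for a tunable parameter such as
   SRA_MAX_STRUCTURE_COUNT.  The values are supplied by the caller. */
struct copy_limits {
    unsigned max_size;   /* largest struct, in bytes, copied by element */
    unsigned max_count;  /* largest member count copied by element */
};

/* Return true if a struct should be copied member by member, false if a
   block copy should be kept.  Element copy is refused as soon as either
   the size or the number of members to instantiate exceeds its limit. */
bool use_element_copy(unsigned struct_size, unsigned member_count,
                      const struct copy_limits *lim)
{
    assert(lim != 0);
    return struct_size <= lim->max_size && member_count <= lim->max_count;
}
```

With made-up limits of 16 bytes and 4 members, the 8-byte, five-member
struct IDE_found_str from the description would be rejected for element copy
on the member count alone, so its copies would stay block copies.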

Comment 7 GCC Commits 2005-08-05 20:39:16 UTC
Subject: Bug 21529

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-4_0-branch
Changes by:	rth@gcc.gnu.org	2005-08-05 20:39:05

Modified files:
	gcc            : ChangeLog params.def params.h tree-sra.c 

Log message:
	PR 21529
	* params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New.
	* params.h (SRA_MAX_STRUCTURE_COUNT): New.
	* tree-sra.c (decide_block_copy): Use it.  Disable element copy
	if we'd have to instantiate too many members.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.353&r2=2.7592.2.354
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.def.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.54.6.2&r2=1.54.6.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.h.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.28&r2=1.28.8.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-sra.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.53.2.2&r2=2.53.2.3

Comment 8 Richard Henderson 2005-08-05 20:39:46 UTC
Fixed.