Compiling this code with -Os produces output more than 40% bigger with GCC-4.1 than with GCC-3.4.3.
See also thread: http://gcc.gnu.org/ml/gcc/2005-05/msg00532.html

>>>>
struct disk_interface_str {
    unsigned nb_IDE_found;
    struct IDE_found_str {
        unsigned short ideIOadr;
        unsigned short ideIOctrladr;
        unsigned char irq;
        unsigned char bios_order;
        unsigned short reserved;
    } *IDE_found;
} DI;

void reorder_IDE_for_linux (void)
{
    static const unsigned short idearray[] = {
        0x1F0, 0x170, 0x1E8, 0x168, 0x1E0, 0x160,
    };
    unsigned short cpt, order;

    for (order = 0; order < sizeof(idearray)/sizeof(idearray[0]); order++) {
        for (cpt = order + 1; cpt < DI.nb_IDE_found; cpt++)
            if (DI.IDE_found[cpt].ideIOadr == idearray[order])
                break;
        if (cpt < DI.nb_IDE_found) {
            struct IDE_found_str save = DI.IDE_found[cpt];
            unsigned short i;

            for (i = order; i < cpt; i++) {
                struct IDE_found_str tmp = DI.IDE_found[i];
                DI.IDE_found[i] = save;
                save = tmp;
            }
            DI.IDE_found[cpt] = save;
        }
    }
}
<<<<
Yada yada yada, you know the drill. SRA, out-of-ssa, and register allocation all working against each other:

<L6>:;
  D.1605 = DI.IDE_found + (struct IDE_found_str *) ((long unsigned int) i * 8);
  tmp$reserved = D.1605->reserved;
  tmp$bios_order = D.1605->bios_order;
  tmp$irq = D.1605->irq;
  tmp$ideIOctrladr = D.1605->ideIOctrladr;
  tmp$ideIOadr = D.1605->ideIOadr;
  D.1605->reserved = save$reserved;
  D.1605->bios_order = save$bios_order;
  D.1605->irq = save$irq;
  D.1605->ideIOctrladr = save$ideIOctrladr;
  D.1605->ideIOadr = save$ideIOadr;
  i = i + 1;
  save$reserved = tmp$reserved;
  save$bios_order = tmp$bios_order;
  save$irq = tmp$irq;
  save$ideIOctrladr = tmp$ideIOctrladr;
  save$ideIOadr = tmp$ideIOadr;

Wouldn't a block move be more efficient here than moving things one-by-one?
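To make the question concrete, here is a minimal sketch of the two ways one 8-byte struct IDE_found_str can be copied. The helper names copy_fieldwise and copy_block are hypothetical and exist only for illustration. The first mirrors the per-member loads and stores in the dump, the second copies the struct as a single block. This is hand-written C, not compiler output.

```c
#include <string.h>

/* Same layout as the struct in the testcase. */
struct IDE_found_str {
    unsigned short ideIOadr;
    unsigned short ideIOctrladr;
    unsigned char irq;
    unsigned char bios_order;
    unsigned short reserved;
};

/* Hypothetical helper: copy one member at a time, roughly the shape
   of the dump (one load and one store per member). */
void copy_fieldwise(struct IDE_found_str *dst, const struct IDE_found_str *src)
{
    dst->ideIOadr = src->ideIOadr;
    dst->ideIOctrladr = src->ideIOctrladr;
    dst->irq = src->irq;
    dst->bios_order = src->bios_order;
    dst->reserved = src->reserved;
}

/* Hypothetical helper: copy the whole struct as one block. */
void copy_block(struct IDE_found_str *dst, const struct IDE_found_str *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Both helpers leave dst with the same contents. Whether the block form is actually smaller depends on the target and on how the compiler expands the copy, which this sketch does not demonstrate.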
Maybe SRA could be tuned differently for -Os. RTH, do you think that is feasible, or is it only a register allocator problem that should not be handled at the tree level at all?
Created attachment 8937
Another testcase showing ENORMOUS regression

Another testcase showing a very big regression at -Os related to SRA:

$ ./xgcc -c -Os -B. btst.c && size btst.o
   text    data     bss     dec     hex filename
   5339       0       0    5339    14db btst.o
$ ./xgcc -c -Os -fno-tree-sra -B. btst.c && size btst.o
   text    data     bss     dec     hex filename
    224       0       0     224      e0 btst.o

(GCC 3.3 generates a text section of 261 bytes.)
*** Bug 21680 has been marked as a duplicate of this bug. ***
Notice that both testcases come from the same program (Gujin).
Subject: Bug 21529

CVSROOT: /cvs/gcc
Module name: gcc
Changes by: rth@gcc.gnu.org 2005-08-05 02:42:07

Modified files:
    gcc : ChangeLog params.def params.h tree-sra.c

Log message:
    PR 21529
    * params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New.
    * params.h (SRA_MAX_STRUCTURE_COUNT): New.
    * tree-sra.c (decide_block_copy): Use it. Disable element copy
    if we'd have to instantiate too many members.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9660&r2=2.9661
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.def.diff?cvsroot=gcc&r1=1.65&r2=1.66
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.h.diff?cvsroot=gcc&r1=1.31&r2=1.32
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-sra.c.diff?cvsroot=gcc&r1=2.69&r2=2.70
Subject: Bug 21529

CVSROOT: /cvs/gcc
Module name: gcc
Branch: gcc-4_0-branch
Changes by: rth@gcc.gnu.org 2005-08-05 20:39:05

Modified files:
    gcc : ChangeLog params.def params.h tree-sra.c

Log message:
    PR 21529
    * params.def (PARAM_SRA_MAX_STRUCTURE_COUNT): New.
    * params.h (SRA_MAX_STRUCTURE_COUNT): New.
    * tree-sra.c (decide_block_copy): Use it. Disable element copy
    if we'd have to instantiate too many members.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.353&r2=2.7592.2.354
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.def.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.54.6.2&r2=1.54.6.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/params.h.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.28&r2=1.28.8.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-sra.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.53.2.2&r2=2.53.2.3
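For readers who do not want to open the diffs, the idea in the log message can be sketched roughly as below. This is not the code from tree-sra.c. The names choose_copy, copy_kind, members_to_instantiate and max_count are hypothetical, the limit is passed in because this thread does not state the default of SRA_MAX_STRUCTURE_COUNT, and the strict comparison is an assumption.

```c
/* Hypothetical result type: how a struct copy should be expanded. */
enum copy_kind {
    COPY_ELEMENTWISE,   /* one scalar copy per member */
    COPY_BLOCK          /* keep the aggregate and copy it as a block */
};

/* Sketch of the heuristic described in the log message: fall back to a
   block copy when element copy would instantiate too many members.
   max_count stands in for SRA_MAX_STRUCTURE_COUNT (assumed semantics). */
enum copy_kind choose_copy(unsigned members_to_instantiate, unsigned max_count)
{
    if (members_to_instantiate > max_count)
        return COPY_BLOCK;
    return COPY_ELEMENTWISE;
}
```

With a limit of 4, for example, the five-member struct IDE_found_str from the first testcase would get COPY_BLOCK under this sketch, while a two-member struct would still be copied element by element.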
Fixed.