12419 – [3.3 Regression] Performance regression: poor optimization of const memory

Bug 12419 - [3.3 Regression] Performance regression: poor optimization of const memory

Summary: [3.3 Regression] Performance regression: poor optimization of const memory

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization (show other bugs)
Version:	3.4.0

Importance:	P2 critical
Target Milestone:	3.4.0
Assignee:	Jakub Jelinek

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2003-09-26 10:55 UTC by Andrew Haley
Modified:	2004-09-28 13:55 UTC (History)
CC List:	3 users (show)

See Also:
Host:
Target:
Build:
Known to work:	3.2 tree-ssa 3.2.3 3.4.0 4.0.0
Known to fail:	3.3.1
Last reconfirmed:	2004-03-02 02:22:53

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andrew Haley 2003-09-26 10:55:02 UTC

This program:

extern const int a[];
extern void f();

int foo ()
{
  int n = a[0]; 
  f();
  n += a[0];
  return n;
}

generates this code:

foo:
        pushl   %ebx
        subl    $8, %esp
        movl    a, %ebx
        call    f
        movl    a, %eax
        addl    $8, %esp
        addl    %eax, %ebx
        movl    %ebx, %eax
        popl    %ebx
        ret

Note that the memory 'a' is read twice.  3.2 did this:

foo:
        pushl   %ebx
        subl    $8, %esp
        movl    a, %ebx
        sall    $1, %ebx
        call    f
        addl    $8, %esp
        movl    %ebx, %eax
        popl    %ebx
        ret

Comment 1 Wolfgang Bangerth 2003-09-26 18:03:06 UTC

Discussion about this is here:
  http://gcc.gnu.org/ml/gcc/2003-09/msg01125.html

Comment 2 Andrew Pinski 2003-09-27 06:36:21 UTC

I can confirm this on the mainline (20030925) and in 3.3.2 (20030815).  It does not 
happen in 3.3 (20030304).

Comment 3 janis187 2003-09-29 22:42:11 UTC

The change reported in PR 12419 is somewhat erratic, although a
regression hunt on the 3.3 branch identified this patch:

--- gcc/gcc/ChangeLog ---

2003-04-07  Glen Nakamura  <glen@imodulo.com>

          PR opt/8634
          * explow.c (maybe_set_unchanging): Don't flag non-static const
          aggregate type initializers with RTX_UNCHANGING_P.

which includes the following new comment text:

       We cannot do this for non-static aggregates, because of the double
       writes that can be generated by store_constructor, depending on the
       contents of the initializer.  Yes, this does eliminate a good fraction
       of the number of uses of RTX_UNCHANGING_P for a language like Ada.
       It also eliminates a good quantity of bugs.  Let this be incentive to
       eliminate RTX_UNCHANGING_P entirely in favour of a more reliable
       solution, perhaps based on alias sets.

The regression hunt on the 3.3-branch took place on i686-pc-linux-gnu
using the submitter's test case compiled with -O2.

A similar hunt on mainline identified the merge of the 3.4 improvements
branch in December; very odd.  Even more odd is that I get different
results with recent mainline compilers depending on whether I use gcc or
cc1 directly.  Sounds like something is uninitialized.

Comment 4 Andrew Pinski 2003-09-29 22:51:09 UTC

The reason why you get different results between using gcc and cc1 now is because the way the 
default target (i686 vs. i386) is handled now.

Comment 5 Andrew Haley 2003-09-30 18:23:02 UTC

Thanks for the analysis.  Yes, this makes sense.  But it is a nasty
pessimization, and we need to get it fixed somehow.  I'm interested because it
hits method dispatch badly.

Comment 6 Paolo Bonzini 2003-10-06 08:01:56 UTC

This patch

http://gcc.gnu.org/ml/gcc-patches/2003-02/msg01655.html

was reviewed by rth who suggested an alternative approach:

http://gcc.gnu.org/ml/gcc-patches/2003-04/msg00482.html

> As for Zdenek's proposed prepass patch... I think it would be
> better to not iterate over the entire instruction stream twice.
> We could, for instance, change the return value of purge_addressof_1
> to be three-state, with the new return location indicating that
> we should re-process this insn at the end.


Paolo

Comment 7 Steven Bosscher 2003-12-18 17:29:55 UTC

The bug is still present on mainline... 
(cc1 -O2 -fomit-frame-pointer-march=i686 -mtune=i686 ): 
 
foo: 
        pushl   %ebx 
        subl    $8, %esp 
        movl    a, %ebx 
        call    f 
        movl    a, %eax 
        addl    $8, %esp 
        addl    %eax, %ebx 
        movl    %ebx, %eax 
        popl    %ebx 
        ret 
 
 
Just for fun, here's what tree-ssa makes of it 
 
foo: 
        pushl   %ebx 
        subl    $8, %esp 
        movl    a, %ebx 
        call    f 
        addl    $8, %esp 
        leal    (%ebx,%ebx), %eax 
        popl    %ebx 
        ret

Comment 8 Andrew Pinski 2004-01-04 09:11:17 UTC

pessimizes-code is not a blocker for branching so reducing severity and also it looks like 
it is also fixed on the tree-ssa.

Comment 9 Giovanni Bajo 2004-01-12 15:09:22 UTC

pessimizes-code is *very* important especially when it's a regression. Our 
releases are not supposed to generate slower code. Bumping priority up again.

Comment 10 Eric Botcazou 2004-01-13 11:46:36 UTC

Related to PR opt/13424.

Comment 11 Andrew Haley 2004-01-16 17:33:17 UTC

Eric, why do you say this is related to opt/13424?

AFAICS the patch there relates only to constructors.

Comment 12 Eric Botcazou 2004-01-17 08:51:36 UTC

Because PR opt/13424 also relates to RTX_UNCHANGING_P and (fake) double writes.

Comment 13 Andrew Haley 2004-03-02 10:03:27 UTC

The severity has been set, increased, and then decreased again.  Is
this just a matter of different people's opinions, or is there some
better reason?

Comment 14 Giovanni Bajo 2004-03-02 12:16:12 UTC

I think it's just a mistake on Andrew's side, since we already discussed that 
this bug is very important. Bumped priority again.

Comment 15 Jakub Jelinek 2004-04-01 17:02:05 UTC

Patch in http://gcc.gnu.org/ml/gcc-patches/2004-03/msg02251.html, waiting
until PR optimization/13424 is resolved.

Comment 16 Jakub Jelinek 2004-04-05 15:35:59 UTC

Should be fixed on gcc-3_4-branch.

Comment 17 Andrew Haley 2004-09-06 14:16:18 UTC

Hi, Jakub.  What's your state on this?

Comment 18 Gabriel Dos Reis 2004-09-28 13:23:15 UTC

(In reply to comment #16)
> Should be fixed on gcc-3_4-branch.

(In reply to comment #16)
> Should be fixed on gcc-3_4-branch.

Jakub --
  Do you plan to do anything for this on gcc-3_3-branch?

-- Gaby

Comment 19 Jakub Jelinek 2004-09-28 13:31:27 UTC

I think 3.3 should probably stay as is.
RTX_UNCHANGING_P turned out to be a can of bugs, there were many changes in that
area in GCC 3.4 (and gcc-3_3-rhl-branch that has basically same code as 3.4 for
that) and there is no single safe patch that can be applied to 3.3.
Either we backport all RTX_UNCHANGING_P changes in 3.4 and exchange one set of
bugs with another one, or keep 3.3 as is.
HEAD got rid of RTX_UNCHANGING_P altogether, so there these kinds of bugs aren't
lurking anymore.

Comment 20 Gabriel Dos Reis 2004-09-28 13:55:27 UTC

Won't fix in 3.3.x

Comment 21 gdr@cs.tamu.edu 2004-09-28 13:56:03 UTC

Subject: Re:  [3.3 Regression] Performance regression: poor optimization of const memory

"jakub at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| I think 3.3 should probably stay as is.

Good!  Let's close this bug.

-- gaby