This program:

    extern const int a[];
    extern void f();

    int foo ()
    {
      int n = a[0];
      f();
      n += a[0];
      return n;
    }

generates this code:

    foo:
            pushl   %ebx
            subl    $8, %esp
            movl    a, %ebx
            call    f
            movl    a, %eax
            addl    $8, %esp
            addl    %eax, %ebx
            movl    %ebx, %eax
            popl    %ebx
            ret

Note that the memory 'a' is read twice. 3.2 did this:

    foo:
            pushl   %ebx
            subl    $8, %esp
            movl    a, %ebx
            sall    $1, %ebx
            call    f
            addl    $8, %esp
            movl    %ebx, %eax
            popl    %ebx
            ret
Discussion about this is here: http://gcc.gnu.org/ml/gcc/2003-09/msg01125.html
I can confirm this on the mainline (20030925) and in 3.3.2 (20030815). It does not happen in 3.3 (20030304).
The change reported in PR 12419 is somewhat erratic, although a regression hunt on the 3.3 branch identified this patch:

--- gcc/gcc/ChangeLog ---
2003-04-07  Glen Nakamura  <glen@imodulo.com>

	PR opt/8634
	* explow.c (maybe_set_unchanging): Don't flag non-static const
	aggregate type initializers with RTX_UNCHANGING_P.

which includes the following new comment text:

    We cannot do this for non-static aggregates, because of the double
    writes that can be generated by store_constructor, depending on
    the contents of the initializer.

Yes, this does eliminate a good fraction of the number of uses of RTX_UNCHANGING_P for a language like Ada. It also eliminates a good quantity of bugs. Let this be incentive to eliminate RTX_UNCHANGING_P entirely in favour of a more reliable solution, perhaps based on alias sets.

The regression hunt on the 3.3 branch took place on i686-pc-linux-gnu using the submitter's test case compiled with -O2. A similar hunt on mainline identified the merge of the 3.4 improvements branch in December; very odd. Even more odd is that I get different results with recent mainline compilers depending on whether I use gcc or cc1 directly. Sounds like something is uninitialized.
The reason you now get different results between gcc and cc1 is the way the default target (i686 vs. i386) is now handled.
Thanks for the analysis. Yes, this makes sense. But it is a nasty pessimization, and we need to get it fixed somehow. I'm interested because it hits method dispatch badly.
This patch http://gcc.gnu.org/ml/gcc-patches/2003-02/msg01655.html was reviewed by rth, who suggested an alternative approach in http://gcc.gnu.org/ml/gcc-patches/2003-04/msg00482.html:

> As for Zdenek's proposed prepass patch... I think it would be
> better to not iterate over the entire instruction stream twice.
> We could, for instance, change the return value of purge_addressof_1
> to be three-state, with the new return location indicating that
> we should re-process this insn at the end.

Paolo
The bug is still present on mainline (cc1 -O2 -fomit-frame-pointer -march=i686 -mtune=i686):

    foo:
            pushl   %ebx
            subl    $8, %esp
            movl    a, %ebx
            call    f
            movl    a, %eax
            addl    $8, %esp
            addl    %eax, %ebx
            movl    %ebx, %eax
            popl    %ebx
            ret

Just for fun, here's what tree-ssa makes of it:

    foo:
            pushl   %ebx
            subl    $8, %esp
            movl    a, %ebx
            call    f
            addl    $8, %esp
            leal    (%ebx,%ebx), %eax
            popl    %ebx
            ret
pessimizes-code is not a blocker for branching, so reducing severity; it also looks like this is fixed on the tree-ssa branch.
pessimizes-code is *very* important, especially when it's a regression. Our releases are not supposed to generate slower code. Bumping priority up again.
Related to PR opt/13424.
Eric, why do you say this is related to opt/13424? AFAICS the patch there relates only to constructors.
Because PR opt/13424 also relates to RTX_UNCHANGING_P and (fake) double writes.
The severity has been set, increased, and then decreased again. Is this just a matter of different people's opinions, or is there some better reason?
I think it's just a mistake on Andrew's side, since we already discussed that this bug is very important. Bumped priority again.
Patch in http://gcc.gnu.org/ml/gcc-patches/2004-03/msg02251.html, waiting until PR optimization/13424 is resolved.
Should be fixed on gcc-3_4-branch.
Hi, Jakub. What's your state on this?
(In reply to comment #16)
> Should be fixed on gcc-3_4-branch.

Jakub -- Do you plan to do anything for this on gcc-3_3-branch?

-- Gaby
I think 3.3 should probably stay as is. RTX_UNCHANGING_P turned out to be a can of bugs: there were many changes in that area in GCC 3.4 (and in gcc-3_3-rhl-branch, which has basically the same code as 3.4 there), and there is no single safe patch that can be applied to 3.3. Either we backport all the RTX_UNCHANGING_P changes from 3.4 and exchange one set of bugs for another, or we keep 3.3 as is. HEAD got rid of RTX_UNCHANGING_P altogether, so these kinds of bugs aren't lurking there anymore.
Won't fix in 3.3.x
Subject: Re: [3.3 Regression] Performance regression: poor optimization of const memory

"jakub at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| I think 3.3 should probably stay as is.

Good! Let's close this bug.

-- gaby