GCC Bugzilla has been upgraded from version 4.4.9 to 5.0rc3. If you see any problem, please report it to bug 64968.
Bug 20367 - alias analysis doesn't take into account that variables that haven't their address taken can't alias arbitrary MEMs
Summary: alias analysis doesn't take into account that variables that haven't their ad...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 3.2
: P2 normal
Target Milestone: 4.6.0
Assignee: Not yet assigned to anyone
URL:
Keywords: alias
Depends on: 19905
Blocks: 29842
  Show dependency treegraph
 
Reported: 2005-03-07 19:47 UTC by Jorn Wolfgang Rennecke
Modified: 2012-01-10 15:31 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-11-02 01:38:20


Attachments
patch against 3.4 (2.49 KB, patch)
2005-03-07 20:01 UTC, Jorn Wolfgang Rennecke
Details | Diff
scheduling testcase (119 bytes, text/plain)
2005-03-08 20:34 UTC, Jorn Wolfgang Rennecke
Details
old output (321 bytes, text/plain)
2005-03-08 20:37 UTC, Jorn Wolfgang Rennecke
Details
output of patched compiler (322 bytes, text/plain)
2005-03-08 20:39 UTC, Jorn Wolfgang Rennecke
Details
patch against 4.1 (2.46 KB, patch)
2005-03-08 20:42 UTC, Jorn Wolfgang Rennecke
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jorn Wolfgang Rennecke 2005-03-07 19:47:07 UTC
This has been discussed here:

http://gcc.gnu.org/ml/gcc-patches/2004-01/msg03141.html
Comment 1 Jorn Wolfgang Rennecke 2005-03-07 20:01:27 UTC
Created attachment 8353 [details]
patch against 3.4

The previous patch had the problem that the non-existence of a MEM_EXPR was
taken
to mean that no static variable is involved.  This updated patch uses a
MEM_EXPR with the new special node unspecified_indirect_ref_node to indicate
that no
static variable is involved.  Thus, when two MEMs are merged and thus loose
their MEM_EXPR, the optimization should still be safe.
Comment 2 Jorn Wolfgang Rennecke 2005-03-07 20:27:04 UTC
Here is an example out of the google cache:
http://66.102.9.104/search?q=cache:BGnyHfyQlSYJ:gcc.gnu.org/ml/gcc-patches/2004-02/msg01638.html+unspecified_indirect_ref_node&hl=en

RFA: alias analysis for non-addressed variables vs. INDIRECT_REF

    * From: Joern Rennecke <joern dot rennecke at superh dot com>
    * To: gcc-patches at gcc dot gnu dot org
    * Date: Wed, 18 Feb 2004 18:24:58 +0000 (GMT)
    * Subject: RFA: alias analysis for non-addressed variables vs. INDIRECT_REF

As discussed before, we are currently missing opportunities to
hoist out accesses of non-addressed variables.  I have now
modified my patch so that a special node unspecified_indirect_ref_node
is used in MEM_EXPRs to mark memory accesses that are generated
from an otherwise unspecified INDIRECT_REF.

Consider this test case:

int
f(int s, int *a)
{
  static int i;
  for (i = 0; i < 800; i++)
    s += a[i];
  return s;
}

For the inner loop, this is the code that I get for sh-elf at -O2 -m4
with mainline from 2004-02-17 without my patch:

.L5:
        mov.l   @r7,r1
        mov     r1,r0
        shll2   r0
        mov.l   @(r0,r5),r2
        add     #1,r1
        cmp/gt  r6,r1
        mov.l   r1,@r7
        bf/s    .L5
        add     r2,r3

And this with the patch; the loads and stores if the loop induction
variable 'i' (@r7) have been hoisted out, and the a[i] giv has been
strength reduced from @(r0,r5) to @r5+.

.L5:
        mov.l   @r5+,r1
        add     #1,r2
        cmp/gt  r3,r2
        bf/s    .L5
        add     r1,r0

Comment 3 Giovanni Bajo 2005-03-08 00:27:06 UTC
I reckon this is already fixed by tree-ssa, or we'll be fixed by the incoming 
TARGET_MEM_REF work. Zdenek?
Comment 4 Andrew Pinski 2005-03-08 00:31:51 UTC
(In reply to comment #3)
> I reckon this is already fixed by tree-ssa, or we'll be fixed by the incoming 
> TARGET_MEM_REF work. Zdenek?

It is fixed via neither of the above but is fixed on the tree-profiling branch via the IPA statics stuff which 
Kenney Zadeck is working on.

This is also a related to PR 19905, it will most likely be solved by the same patch also.
Comment 5 Andrew Pinski 2005-03-08 01:21:02 UTC
The testcase given above is already optimizated on the mainline via some of the aliasing code on the 
tree level but still needs to be more, witness 19905 or even the following testcase which is basically 
19905 still:
void g();
int
f(int s, int *a)
{
  static int i;
  for (i = 0; i < 800; i++)
  {
    g();
    s += a[i];
  }
  return s;
}

But all of this needs to be on the tree level to be really effective.
Comment 6 Diego Novillo 2005-03-08 01:36:44 UTC
Subject: Re:  alias analysis doesn't take into
 account that variables that haven't their address taken can't alias arbitrary
 MEMs

pinskia at gcc dot gnu dot org wrote:

> void g();
> int
> f(int s, int *a)
> {
>   static int i;
>   for (i = 0; i < 800; i++)
>   {
>     g();
>     s += a[i];
>   }
>   return s;
> }
> 
> But all of this needs to be on the tree level to be really effective.
> 
This particular case is trivial to fix inside the tree optimizers. 
Variable 'i' is a local variable with static storage that is not upward 
exposed (i.e., it has no default definition).

Once you recognize that, you can safely flip the TREE_STATIC bit on the 
variable and expose it as a gimple register.


Diego.
Comment 7 Jorn Wolfgang Rennecke 2005-03-08 05:08:49 UTC
(In reply to comment #5)
> The testcase given above is already optimizated on the mainline via some of
the aliasing code on the 
> tree level but still needs to be more, witness 19905 or even the following
testcase which is basically 
> 19905 still:
> void g();
> int
> f(int s, int *a)
> {
>   static int i;
>   for (i = 0; i < 800; i++)
>   {
>     g();
>     s += a[i];
>   }
>   return s;
> }
> 
> But all of this needs to be on the tree level to be really effective.

You can't optimize this because g is not pure.  g might call f.
But I had some real-life code that did allow the optimization inside a loop,
but prevented completely changing the static variable because of non-pure
function callsabove and below the loop.  IIRC the static variable was not
initialized right before the loop, but earlier in the function, so if you
really had some recursion, fun things would happen.  As a matter of fact,
an early version of my patch failed to take recursion into account, and we
found a few months later that rm got miscompiled because it uses indirect
recusion and static variables in a rather interesting way.
Comment 8 Giovanni Bajo 2005-03-08 12:30:44 UTC
So, to recap: testcase in comment #5 should not be optimized (at least, it is 
not related to this bug). Testcase in comment #2 is already optimized correctly 
in the tree-profiling-branch, which is due to be merged for 4.1.

If so, we can suspend this bug until the branch is merged. The proposed patch 
cannot be merged to release branches because this is not a regression, AFAICT. 

I also guess that on the mainline we want to fix this at the tree level.
Comment 9 Andrew Pinski 2005-03-08 14:07:36 UTC
(In reply to comment #8)
> So, to recap: testcase in comment #5 should not be optimized (at least, it is 
> not related to this bug). Testcase in comment #2 is already optimized correctly 
> in the tree-profiling-branch, which is due to be merged for 4.1.

Nope the testcase in comment #2 is already optimizated on the mainline (and in 4.0.0) so this could be 
closed as fixed.  Now the one in Comment #5 is a dup of bug 19905 so this could be closed as fixed for 
4.0.0 and not worried about.
Comment 10 joern.rennecke@st.com 2005-03-08 14:21:56 UTC
Subject: Re:  alias analysis doesn't take into account that variables that haven't their address taken can't alias arbitrary MEMs

giovannibajo at libero dot it wrote:

>------- Additional Comments From giovannibajo at libero dot it  2005-03-08 12:30 -------
>So, to recap: testcase in comment #5 should not be optimized (at least, it is 
>not related to this bug).
>
Actually, it is related inasmuch as it demonstrates a pitfall you have 
to avoid.

> Testcase in comment #2 is already optimized correctly 
>in the tree-profiling-branch, which is due to be merged for 4.1.
>
>If so, we can suspend this bug until the branch is merged. The proposed patch 
>cannot be merged to release branches because this is not a regression, AFAICT. 
>
>I also guess that on the mainline we want to fix this at the tree level.
>
It still makes sense to also handle this at the rtl level, so that the 
scheduler knows that
the MEMs don't alias.  Unless you propose to convert sched1 and sched2 to do
machine-independent scheduling on trees ;-)

Comment 11 Jorn Wolfgang Rennecke 2005-03-08 15:10:30 UTC
You have not addressed the scheduling issues.
Comment 12 Giovanni Bajo 2005-03-08 18:18:23 UTC
(In reply to comment #10)


> >So, to recap: testcase in comment #5 should not be optimized (at least,
> >it is 
> >not related to this bug).

> Actually, it is related inasmuch as it demonstrates a pitfall you have 
> to avoid

Right. We then should prepare a testcase from this code that scans the ivopts 
dump to check that the IV is not strength reduced or something like that.

> It still makes sense to also handle this at the rtl level, so that the 
> scheduler knows that
> the MEMs don't alias.  Unless you propose to convert sched1 and sched2 to do
> machine-independent scheduling on trees ;-)

To tell you the truth, I would like the expanders to somehow preserve tree 
aliasing information while creating RTL. But this is gonna be post 4.1 anyway 
since nobody is working on it, so yes, you are right, for now we want to handle 
this at RTL level too.

Can you show us the SH code generated by mainline for the testcase in comment 
#2? Or otherwise provide another testcase where the scheduling conflict is 
visible?
Comment 13 Jorn Wolfgang Rennecke 2005-03-08 20:34:39 UTC
Created attachment 8363 [details]
scheduling testcase
Comment 14 Jorn Wolfgang Rennecke 2005-03-08 20:37:21 UTC
Created attachment 8364 [details]
old output

This is the output of the unpatched gcc 4.1.0 20050307 for sh-elf -O2
-m4-single -fomit-frame-pointer
Comment 15 Jorn Wolfgang Rennecke 2005-03-08 20:39:05 UTC
Created attachment 8365 [details]
output of patched compiler

This is the output of the patched gcc 4.1.0 20050307 for sh-elf -O2 -m4-single
-fomit-frame-pointer .	As you can see, the output of the high-latency
div instruction is fed into ftrv much later.
Comment 16 Andrew Pinski 2005-03-08 20:41:15 UTC
(In reply to comment #13)
> Created an attachment (id=8363)
> scheduling testcase

And that should be fixed via the structure aliasing improvements that Daniel is working on.
Comment 17 Jorn Wolfgang Rennecke 2005-03-08 20:42:46 UTC
Created attachment 8366 [details]
patch against 4.1

patch wouldn't automatically apply all of the patch because some context
changed, so I re-diffed it.
Comment 18 Jorn Wolfgang Rennecke 2005-03-08 20:51:25 UTC
(In reply to comment #16)

> And that should be fixed via the structure aliasing improvements that Daniel
is working on.

Will this also work when a[0] .. a[2] are replaced with a0 .. a2 ? 

Comment 19 Jorn Wolfgang Rennecke 2005-08-02 18:59:47 UTC
(In reply to comment #4)
> > It is fixed via neither of the above but is fixed on the tree-profiling
branch via the IPA statics stuff which 
> Kenney Zadeck is working on.
> 
> This is also a related to PR 19905, it will most likely be solved by the same
patch also.


The promote statics patch has been removed from mainline, so this issue is
open again.
Comment 20 Andrew Pinski 2005-08-02 19:04:05 UTC
(In reply to comment #19)
> The promote statics patch has been removed from mainline, so this issue is
> open again.

The improved aliasing was not reverted though which is what gets most of the improvements and not 
the promote statics pass.
Comment 21 Jorn Wolfgang Rennecke 2005-08-02 19:19:41 UTC
(In reply to comment #20)
 
> The improved aliasing was not reverted though which is what gets most of the
improvements and not 
> the promote statics pass.

The scheduling testcase still suffers from insufficient alias information.

Comment 22 Daniel Berlin 2005-08-02 22:19:50 UTC
Subject: Re:  alias analysis doesn't take into
	account that variables that haven't their address taken can't alias
	arbitrary MEMs

On Tue, 2005-08-02 at 19:19 +0000, amylaar at gcc dot gnu dot org wrote:
> ------- Additional Comments From amylaar at gcc dot gnu dot org  2005-08-02 19:19 -------
> (In reply to comment #20)
>  
> > The improved aliasing was not reverted though which is what gets most of the
> improvements and not 
> > the promote statics pass.
> 
> The scheduling testcase still suffers from insufficient alias information.

And it will until all of the info about what variables, etc, we have is
kept through scheduling.


Comment 23 Steven Bosscher 2009-04-05 12:38:35 UTC
Could it be that this is now fixed by the a-i-b merge?
Comment 24 Richard Biener 2009-04-05 20:17:38 UTC
Which testcase?  This bug is mildly confusing.  Note that i is a global
variable, it may be reached via recursive invocations of the function and
we do not have analysis that disproves this.
Comment 25 Jorn Wolfgang Rennecke 2009-04-06 00:37:25 UTC
(In reply to comment #24)
> Which testcase?

The second attachment, 20367-sched.c

>  This bug is mildly confusing.  Note that i is a global
> variable,

If you mean i in f of comment #2, that's not gloabal, but static.

> it may be reached via recursive invocations of the function and
> we do not have analysis that disproves this.

I suppose you mean that f may be reached via recursion, thus
clobbering i, as already mentioned in comment #7.

Unfortunately, the original testcase went into the big bit bucket in the sky,
and when you try to make up a minimal example on the spot, you are bound to
overlook the very same aspects of the problem you have overlooked in the
work on the problm so far.

But there is still code in EEMBC that genuinely benefits from load hoisting /
store sinking and scheduling which is enabled by recognizing that a static
variable that hasn't got its address taken can't alias another MEM - in some
cases, the static variable can be completely replaced by a local variable,
in other cases, it can't (without whole-program analysis) because an
indirect recursive call cannot be ruled out.
Comment 26 Richard Biener 2010-05-09 17:09:55 UTC
alias-export is now merged so the RTL level should have the same alias information as the tree level.  And thus scheduling should be fixed.(?)
Comment 27 Richard Biener 2012-01-10 15:31:09 UTC
Should be fixed in at least 4.6.