Summary: | Function call foils VRP/jump-threading of redundant predicate on struct member | ||
---|---|---|---|
Product: | gcc | Reporter: | Ryan Johnson <scovich> |
Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | NEW --- | ||
Severity: | enhancement | CC: | dimhen, rguenth, steven |
Priority: | P3 | Keywords: | alias, missed-optimization |
Version: | 4.6.1 | ||
Target Milestone: | --- | ||
Host: | x86_64-unknown-linux-gnu | Target: | |
Build: | Known to work: | ||
Known to fail: | 4.5.2, 4.7.0 | Last reconfirmed: | 2021-08-10 00:00:00 |
Bug Depends on: | |||
Bug Blocks: | 19794, 23384 |
Description
Ryan Johnson
2011-09-10 03:29:49 UTC
Confirmed. Here is the .143t.optimized dump for trunk r178747: void test() () { struct foo f; bool D.2119; bool retval.0; <bb 2>: # .MEM_13 = VDEF <.MEM_8(D)> f.b = 0; # .MEM_10 = VDEF <.MEM_13> retval.0_2 = bar (); if (retval.0_2 != 0) goto <bb 3>; else goto <bb 4>; <bb 3>: # .MEM_11 = VDEF <.MEM_10> f.b = 1; goto <bb 5>; <bb 4>: # VUSE <.MEM_10> D.2119_5 = f.b; if (D.2119_5 != 0) goto <bb 5>; else goto <bb 6>; <bb 5>: Invalid sum of incoming frequencies 5000, should be 3898 # .MEM_16 = PHI <.MEM_10(4), .MEM_11(3)> # .MEM_12 = VDEF <.MEM_16> foo::baz (&f); <bb 6>: Invalid sum of incoming frequencies 8898, should be 10000 # .MEM_7 = PHI <.MEM_10(4), .MEM_12(5)> # VUSE <.MEM_7> return; } Note how the call to bar() clobbers .MEM_13 which is f.b. Alias related => Richi in CC. So, is this a C++ front-end issue? tree-optimization? Well, it's a tree optimization issue. It's simple - the local aggregate f escapes the function via the member function call to baz: <bb 5>: foo::baz (&f); and as our points-to analysis is not flow-sensitive for memory/calls this causes f to be clobbered by the call to bar: <bb 2>: f.b = 0; # USE = nonlocal null { f } # CLB = nonlocal null { f } retval.0_2 = bar (); as neither the bodies of baz nor bar are visible there is nothing we can do here (short of re-doing points-to analysis flow-sensitive for memory). f.b is partially redundant, so you see later jump-threading at work optimizing the path following the f.b = true assignment. (In reply to comment #3) > Well, it's a tree optimization issue. It's simple - the local aggregate f > escapes the function via the member function call to baz: > > <bb 5>: > foo::baz (&f); > > and as our points-to analysis is not flow-sensitive for memory/calls this > causes f to be clobbered by the call to bar Is flow-sensitive analysis within single functions prohibitively expensive? All the papers I can find talk about whole-program analysis, where it's very expensive in both time and space; the best I could find (CGO'11 best paper) gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC. > > as neither the bodies of baz nor bar are visible there is nothing we can do Would knowing the body of bar() help if the latter cannot be inlined? On Wed, 12 Oct 2011, scovich at gmail dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346 > > --- Comment #4 from Ryan Johnson <scovich at gmail dot com> 2011-10-12 12:40:25 UTC --- > (In reply to comment #3) > > Well, it's a tree optimization issue. It's simple - the local aggregate f > > escapes the function via the member function call to baz: > > > > <bb 5>: > > foo::baz (&f); > > > > and as our points-to analysis is not flow-sensitive for memory/calls this > > causes f to be clobbered by the call to bar > > Is flow-sensitive analysis within single functions prohibitively expensive? All > the papers I can find talk about whole-program analysis, where it's very > expensive in both time and space; the best I could find (CGO'11 best paper) > gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC. It would need a complete rewrite, it isn't integratable into the current solver (which happens to be shared between IPA and non-IPA modes). > > as neither the bodies of baz nor bar are visible there is nothing we can do > > Would knowing the body of bar() help if the latter cannot be inlined? Not at present, but it's possible to improve mod-ref analysis on an IPA level then. Richard. (In reply to comment #5) > On Wed, 12 Oct 2011, scovich at gmail dot com wrote: > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346 > > > > --- Comment #4 from Ryan Johnson <scovich at gmail dot com> 2011-10-12 12:40:25 UTC --- > > (In reply to comment #3) > > > Well, it's a tree optimization issue. It's simple - the local aggregate f > > > escapes the function via the member function call to baz: > > > > > > <bb 5>: > > > foo::baz (&f); > > > > > > and as our points-to analysis is not flow-sensitive for memory/calls this > > > causes f to be clobbered by the call to bar > > > > Is flow-sensitive analysis within single functions prohibitively expensive? All > > the papers I can find talk about whole-program analysis, where it's very > > expensive in both time and space; the best I could find (CGO'11 best paper) > > gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC. > > It would need a complete rewrite, it isn't integratable into the current > solver (which happens to be shared between IPA and non-IPA modes). That makes sense... Wild idea: would it be possible to annotate references as "escaped" or "not escaped yet" ? Anything global or passed into the function would be marked as escaped, while anything allocated locally would start out as not escaped; assigning to an escaped location or passing to a function would mark it as escaped if it wasn't already. The status could be determined in linear time using local information only (= scalable), and would benefit strongly as inlining (IPA or not) eliminates escape points. Alternatively (or maybe it's really the same thing?), I could imagine an SSA "operation" which "moves" the non-escaped variable into an escaped one (which happens to live at the same address) just before it escapes? That might give the same effect with no changes to the current flow-insensitive algorithm, as long as the optimizer knew how to adjust things to account for inlining. On Wed, 7 Mar 2012, scovich at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346
>
> --- Comment #6 from Ryan Johnson <scovich at gmail dot com> 2012-03-07 13:31:19 UTC ---
> (In reply to comment #5)
> > On Wed, 12 Oct 2011, scovich at gmail dot com wrote:
> >
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346
> > >
> > > --- Comment #4 from Ryan Johnson <scovich at gmail dot com> 2011-10-12 12:40:25 UTC ---
> > > (In reply to comment #3)
> > > > Well, it's a tree optimization issue. It's simple - the local aggregate f
> > > > escapes the function via the member function call to baz:
> > > >
> > > > <bb 5>:
> > > > foo::baz (&f);
> > > >
> > > > and as our points-to analysis is not flow-sensitive for memory/calls this
> > > > causes f to be clobbered by the call to bar
> > >
> > > Is flow-sensitive analysis within single functions prohibitively expensive? All
> > > the papers I can find talk about whole-program analysis, where it's very
> > > expensive in both time and space; the best I could find (CGO'11 best paper)
> > > gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC.
> >
> > It would need a complete rewrite, it isn't integratable into the current
> > solver (which happens to be shared between IPA and non-IPA modes).
> That makes sense...
>
> Wild idea: would it be possible to annotate references as "escaped" or "not
> escaped yet" ? Anything global or passed into the function would be marked as
> escaped, while anything allocated locally would start out as not escaped;
> assigning to an escaped location or passing to a function would mark it as
> escaped if it wasn't already. The status could be determined in linear time
> using local information only (= scalable), and would benefit strongly as
> inlining (IPA or not) eliminates escape points.
Well, you can compute the clobber/use sets of individual function calls,
IPA PTA computes a simple mod-ref analysis this way. You can also
annotate functions whether they make arguments escape or whether it
reads from them or clobbers them.
The plan is to do some simple analysis and propagate that up the
callgraph, similar to pure-const analysis. The escape part could
be integrated there.
Richard.
(In reply to comment #7) > On Wed, 7 Mar 2012, scovich at gmail dot com wrote: > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346 > > > > --- Comment #6 from Ryan Johnson <scovich at gmail dot com> 2012-03-07 13:31:19 UTC --- > > (In reply to comment #5) > > > On Wed, 12 Oct 2011, scovich at gmail dot com wrote: > > > > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346 > > > > > > > > --- Comment #4 from Ryan Johnson <scovich at gmail dot com> 2011-10-12 12:40:25 UTC --- > > > > (In reply to comment #3) > > > > > Well, it's a tree optimization issue. It's simple - the local aggregate f > > > > > escapes the function via the member function call to baz: > > > > > > > > > > <bb 5>: > > > > > foo::baz (&f); > > > > > > > > > > and as our points-to analysis is not flow-sensitive for memory/calls this > > > > > causes f to be clobbered by the call to bar > > > > > > > > Is flow-sensitive analysis within single functions prohibitively expensive? All > > > > the papers I can find talk about whole-program analysis, where it's very > > > > expensive in both time and space; the best I could find (CGO'11 best paper) > > > > gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC. > > > > > > It would need a complete rewrite, it isn't integratable into the current > > > solver (which happens to be shared between IPA and non-IPA modes). > > That makes sense... > > > > Wild idea: would it be possible to annotate references as "escaped" or "not > > escaped yet" ? Anything global or passed into the function would be marked as > > escaped, while anything allocated locally would start out as not escaped; > > assigning to an escaped location or passing to a function would mark it as > > escaped if it wasn't already. The status could be determined in linear time > > using local information only (= scalable), and would benefit strongly as > > inlining (IPA or not) eliminates escape points. > > Well, you can compute the clobber/use sets of individual function calls, > IPA PTA computes a simple mod-ref analysis this way. You can also > annotate functions whether they make arguments escape or whether it > reads from them or clobbers them. > > The plan is to do some simple analysis and propagate that up the > callgraph, similar to pure-const analysis. The escape part could > be integrated there. That sounds really slick to have in general... but would it actually catch the test case above? What you describe seems to depend on test() having information about foo::baz() -- which it does not -- while analyzing the body of test() could at least identify the part of f's lifetime where it cannot possibly have escaped. Or does the local analysis come "for free" once those IPA changes are in place? On Wed, 7 Mar 2012, scovich at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346
>
> --- Comment #8 from Ryan Johnson <scovich at gmail dot com> 2012-03-07 14:28:29 UTC ---
> (In reply to comment #7)
> > On Wed, 7 Mar 2012, scovich at gmail dot com wrote:
> >
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346
> > >
> > > --- Comment #6 from Ryan Johnson <scovich at gmail dot com> 2012-03-07 13:31:19 UTC ---
> > > (In reply to comment #5)
> > > > On Wed, 12 Oct 2011, scovich at gmail dot com wrote:
> > > >
> > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50346
> > > > >
> > > > > --- Comment #4 from Ryan Johnson <scovich at gmail dot com> 2011-10-12 12:40:25 UTC ---
> > > > > (In reply to comment #3)
> > > > > > Well, it's a tree optimization issue. It's simple - the local aggregate f
> > > > > > escapes the function via the member function call to baz:
> > > > > >
> > > > > > <bb 5>:
> > > > > > foo::baz (&f);
> > > > > >
> > > > > > and as our points-to analysis is not flow-sensitive for memory/calls this
> > > > > > causes f to be clobbered by the call to bar
> > > > >
> > > > > Is flow-sensitive analysis within single functions prohibitively expensive? All
> > > > > the papers I can find talk about whole-program analysis, where it's very
> > > > > expensive in both time and space; the best I could find (CGO'11 best paper)
> > > > > gets it down to 20-30ms and 2-3MB per kLoC for up to ~300kLoC.
> > > >
> > > > It would need a complete rewrite, it isn't integratable into the current
> > > > solver (which happens to be shared between IPA and non-IPA modes).
> > > That makes sense...
> > >
> > > Wild idea: would it be possible to annotate references as "escaped" or "not
> > > escaped yet" ? Anything global or passed into the function would be marked as
> > > escaped, while anything allocated locally would start out as not escaped;
> > > assigning to an escaped location or passing to a function would mark it as
> > > escaped if it wasn't already. The status could be determined in linear time
> > > using local information only (= scalable), and would benefit strongly as
> > > inlining (IPA or not) eliminates escape points.
> >
> > Well, you can compute the clobber/use sets of individual function calls,
> > IPA PTA computes a simple mod-ref analysis this way. You can also
> > annotate functions whether they make arguments escape or whether it
> > reads from them or clobbers them.
> >
> > The plan is to do some simple analysis and propagate that up the
> > callgraph, similar to pure-const analysis. The escape part could
> > be integrated there.
>
> That sounds really slick to have in general... but would it actually catch the
> test case above? What you describe seems to depend on test() having information
> about foo::baz() -- which it does not -- while analyzing the body of test()
> could at least identify the part of f's lifetime where it cannot possibly have
> escaped.
>
> Or does the local analysis come "for free" once those IPA changes are in place?
No, the local analysis is what makes the IPA changes "free" ;) Of course
the local analysis would need to be flow sensitive.
Richard.
This is basically PR 23384 really. |