[RFC/PATCH] updating global ranges and their effect on __builtin_unreachable code

Richard Biener richard.guenther@gmail.com
Mon Jun 7 13:45:25 GMT 2021


On Mon, Jun 7, 2021 at 3:37 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 6/7/21 3:25 AM, Richard Biener wrote:
> > On Wed, Jun 2, 2021 at 2:53 PM Andrew MacLeod <amacleod@redhat.com> wrote:
> >> On 6/2/21 7:52 AM, Richard Biener wrote:
> >>> On Wed, Jun 2, 2021 at 12:34 PM Aldy Hernandez via Gcc-patches
> >>> <gcc-patches@gcc.gnu.org> wrote:
> >>>> We've been having "issues" in our branch when exporting to the global
> >>>> space ranges that take into account previously known ranges
> >>>> (SSA_NAME_RANGE_INFO, etc).  For the longest time we had the export
> >>>> feature turned off because it had the potential of removing
> >>>> __builtin_unreachable code early in the pipeline.  This was causing one
> >>>> or two tests to fail.
> >>>>
> >>>> I finally got fed up, and investigated why.
> >>>>
> >>>> Take the following code:
> >>>>
> >>>>      i_4 = somerandom ();
> >>>>      if (i_4 < 0)
> >>>>        goto <bb 3>; [INV]
> >>>>      else
> >>>>        goto <bb 4>; [INV]
> >>>>
> >>>>      <bb 3> :
> >>>>      __builtin_unreachable ();
> >>>>
> >>>>      <bb 4> :
> >>>>
> >>>> It turns out that both legacy evrp and VRP have code that notices the
> >>>> above pattern and sets the *global* range for i_4 to [0,MAX].  That is,
> >>>> the range for i_4 is set, not at BB4, but at the definition site.  See
> >>>> uses of assert_unreachable_fallthru_edge_p() for details.
> >>>>
> >>>> This global range causes subsequent passes (VRP1 in the testcase below),
> >>>> to remove the checks and the __builtin_unreachable code altogether.
> >>>>
> >>>> // pr80776-1.c
> >>>> int somerandom (void);
> >>>> void
> >>>> Foo (void)
> >>>> {
> >>>>      int i = somerandom ();
> >>>>      if (! (0 <= i))
> >>>>        __builtin_unreachable ();
> >>>>      if (! (0 <= i && i <= 999999))
> >>>>        __builtin_unreachable ();
> >>>>      sprintf (number, "%d", i);
> >>>> }
> >>>>
> >>>> This means that by the time the -Wformat-overflow warning runs, the
> >>>> above sprintf has been left unguarded, and a bogus warning is issued.
> >>>>
> >>>> Currently the above test does not warn, but that's because of an
> >>>> oversight in export_global_ranges().  This function is disregarding
> >>>> known global ranges (SSA_NAME_RANGE_INFO and SSA_NAME_PTR_INFO) and only
> >>>> setting ranges the ranger knows about.
> >>>>
> >>>> For the above test the IL is:
> >>>>
> >>>>      <bb 2> :
> >>>>      i_4 = somerandom ();
> >>>>      if (i_4 < 0)
> >>>>        goto <bb 3>; [INV]
> >>>>      else
> >>>>        goto <bb 4>; [INV]
> >>>>
> >>>>      <bb 3> :
> >>>>      __builtin_unreachable ();
> >>>>
> >>>>      <bb 4> :
> >>>>      i.0_1 = (unsigned int) i_4;
> >>>>      if (i.0_1 > 999999)
> >>>>        goto <bb 5>; [INV]
> >>>>      else
> >>>>        goto <bb 6>; [INV]
> >>>>
> >>>>      <bb 5> :
> >>>>      __builtin_unreachable ();
> >>>>
> >>>>      <bb 6> :
> >>>>      _7 = __builtin___sprintf_chk (&number, 1, 7, "%d", i_4);
> >>>>
> >>>>
> >>>> Legacy evrp has determined that the range for i_4 is [0,MAX] per my
> >>>> analysis above, but ranger has no known range for i_4 at the definition
> >>>> site.  So at export_global_ranges time, ranger leaves the [0,MAX] alone.
> >>>>
> >>>> OTOH, evrp sets the global range at the definition for i.0_1 to
> >>>> [0,999999] per the same unreachable feature.  However, ranger has
> >>>> correctly determined that the range for i.0_1 at the definition is
> >>>> [0,MAX], which it then proceeds to export.  Since the current
> >>>> export_global_ranges (mistakenly) does not take into account previous
> >>>> global ranges, the ranges in the global tables end up like this:
> >>>>
> >>>> i_4: [0, MAX]
> >>>> i.0_1: [0, MAX]
> >>>>
> >>>> This causes the first unreachable block to be removed in VRP1, but the
> >>>> second one to remain.  Later VRP can determine that i_4 in the sprintf
> >>>> call is [0,999999], and no warning is issued.
> >>>>
> >>>> But... the missing bogus warning is due to current export_global_ranges
> >>>> ignoring SSA_NAME_RANGE_INFO and friends, something which I'd like to
> >>>> fix.  However, fixing this, gets us back to:
> >>>>
> >>>> i_4: [0, MAX]
> >>>> i.0_1: [0, 999999]
> >>>>
> >>>> Which means, we'll be back to removing the unreachable blocks and
> >>>> issuing a warning in pr80776-1.c (like we have been since the beginning
> >>>> of time).
> >>>>
> >>>> The attached patch fixes export_global_ranges to the expected behavior,
> >>>> and adds the previous XFAIL to pr80776-1.c, while documenting why this
> >>>> warning is issued in the first place.
> >>>>
> >>>> Once legacy evrp is removed, this won't be an issue, as ranges in the IL
> >>>> will tell the truth.  However, this will mean that we will no longer
> >>>> remove the first __builtin_unreachable combo.  But ISTM, that would be
> >>>> correct behavior ??.
> >>>>
> >>>> BTW, in addition to this patch we could explore removing the
> >>>> assert_unreachable_fallthru_edge_p() use in the evrp_analyzer, since it
> >>>> is no longer needed to get the warnings in the testcases in the original
> >>>> PR correctly (gcc.dg/pr80776-[12].c).
> >>> But the whole point of all this singing and dancing is not to make
> >>> warnings but to be able to implement assert (); or assume (); that
> >>> will result in no code but optimization based on the assumption.
> >>>
> >>> That means that all the checks guarding __builtin_unreachable ()
> >>> should be removed at the GIMPLE level - just not too early
> >>> to preserve range info on the variables participating in the
> >>> guarding condition.
> >>>
> >>> So yes, it sounds fragile but instead it's carefully architected.  Heh.
> >>>
> >>> In particular it is designed so that early optimization leaves those
> >>> unreachable () around (for later LTO consumption and inlining, etc.
> >>> to be able to re-create the ranges) whilst VRP1 / DOM will end up
> >>> eliminating them.  I think we have testcases that verify said behavior,
> >>> namely optimize out range checks based on the assertions - maybe missed
> >>> the case where this only happens after inlining (important for your friendly
> >>> C++ abstraction hell), and the unreachable()s gone.
> >>>
> >>> Please make sure to not break that.
> >> Let me see if I understand...  we want to leave builtin_unreachable
> >> around until a certain point in compilation, and then remove them?
> >>
> >> It seems to me that either
> >>
> >>    A) a builtin_unreachable () provides useful information,( ie,
> >> information that isn't otherwise obvious, and would therefore stay in
> >> the IL and not be eliminated by the various optimizations until it is..
> >> ). OR
> >>
> >>    B )its telling us something is we can already figure out, in which
> >> case we will naturally eliminate the branches leading to it, and then it
> >> goes away on its own.
> >>
> >> By that standard, any builtin_unreachable that makes it late into the
> >> optimization cycle is useful... So can't we just leave those until
> >> whatever point it is we decide the information they provide is no longer
> >> needed, and then simply drop them?  Rewrite any branch to an unreachable
> >> so that its truly unreachable, and then the next cfg cleanup makes them
> >> all go away?  That could be part of a late/final VRP or a pass of its own.
> >>
> >> This whole checking to see if they are there seems fragile, because it
> >> doesn't tell us whether they are useful or not..
> > It's a combination of A and B.  It's A) until after IPA and after IPA the
> > information is transitioned to range information on the SSA names
> > which is then persistent (but it for example does not survive all
> > IPA optimizations - definitely not LTO streaming for example).  So
> > we're transitioning from a if (a) __builtin_unreachable () representation
> > of the useful information to representing it by on-the-side info
> > (SSA_NAME_RANGE_INFO)
> > on 'a'.
> >
> > That should work fine with ranger as well I think - now, what was fragile
> > was that the way the if (a) __builtin_unreachable () IL was preserved
> > was simply that there's a single EVRP pass before IPA only and the
> > way it is (was) structured makes sure that once we reflect the IL on
> > the range info of 'a' it is not used (in the same EVRP pass) to elide
> > the condition and nothing does (did) that before VRP1 either.
> >
> > Richard.
>
> OK, lets see if I follow. The problem is that if we do eliminate the if
> and __builtin_unreachable too early, LTO streaming is likely to/might
> lose the information? and maybe some of the other IPA passes will not
> maintain the ssa_name_range_info data and thus also lose the info.

Yep.

> So once IPA is done, we'd be free to do eliminate at will when they
> aren't useful any morer?

Yes, more like "their information is readily available elsewhere".

>    And if I continue to follow that logic,
> then ranger shouldn't start with global information from the
> SSA_NAME_RANGE_INFO until post-IPA... then its free to pick up that info
> and eliminate anything it can.

That sounds like one possible solution.

> At least until IPA and LTO are 100% range propagators :-)  so in theory,
> we could have a flag for post-ipa that allows us to pick better
> starting  global ranges when they are available.   we could wrap that up
> with the enable_ranger() call.. checks to see what pass we're in and if
> post-ipa enables global access?  or is there such a query already available?

There's cfun->after_inlining - note that with ranger what uses ranges and
simplifies things is a bit muddy to me - we'd still need to make sure the
info eventually reaches SSA_NAME_RANGE_INFO before it's deleted.

> Andrew
>
> PS Ah, and now that I re-read, i think we're both saying the same thing?
> :-)  so it boils down to how do we check post-ipa.
>
> Is there any movement to making LTO maintain any range info it currently
> loses?  Maybe this is an opportunity :-)  I know it maintains some,
> because it was part of the reason we dont use globals now

So historically inlining preserves almost none of the on-the-side data, but
nowadays it transfers at least points-to and range-info.  For LTO we simply
do not stream that (one could do this in {output,input_ssa_names}) - I suppose
because initially there wasn't any (there was no EVRP, first VRP was after IPA).

But it's also that SSA_NAME_RANGE_INFO is prone to be "dropped"
and thus re-computing things after final inlining and initial scalar cleanups
after inlining tends to make it more readily available.

Richard.

>


More information about the Gcc-patches mailing list