undefined behavior in value_range::equiv_add()?

Richard Biener richard.guenther@gmail.com
Thu Jun 6 07:31:00 GMT 2019


On Wed, Jun 5, 2019 at 11:12 PM Jeff Law <law@redhat.com> wrote:
>
> On 6/4/19 9:04 AM, Richard Biener wrote:
> > On Tue, Jun 4, 2019 at 3:40 PM Jeff Law <law@redhat.com> wrote:
> >>
> >> On 6/4/19 5:23 AM, Richard Biener wrote:
> >>> On Tue, Jun 4, 2019 at 12:30 AM Jeff Law <law@redhat.com> wrote:
> >>>>
> >>>> On 6/3/19 7:13 AM, Aldy Hernandez wrote:
> >>>>> On 5/31/19 5:00 AM, Richard Biener wrote:
> >>>>>> On Fri, May 31, 2019 at 2:27 AM Jeff Law <law@redhat.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> On 5/29/19 10:20 AM, Aldy Hernandez wrote:
> >>>>>>>> On 5/29/19 12:12 PM, Jeff Law wrote:
> >>>>>>>>> On 5/29/19 9:58 AM, Aldy Hernandez wrote:
> >>>>>>>>>> On 5/29/19 9:24 AM, Richard Biener wrote:
> >>>>>>>>>>> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez
> >>>>>>>>>>> <aldyh@redhat.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> As per the API, and the original documentation
> >>>>>>>>>>>> to value_range, VR_UNDEFINED and VR_VARYING
> >>>>>>>>>>>> should never have equivalences. However,
> >>>>>>>>>>>> equiv_add is tacking on equivalences blindly,
> >>>>>>>>>>>> and there are various regressions that happen
> >>>>>>>>>>>> if I fix this oversight.
> >>>>>>>>>>>>
> >>>>>>>>>>>> void value_range::equiv_add (const_tree var,
> >>>>>>>>>>>> const value_range *var_vr, bitmap_obstack
> >>>>>>>>>>>> *obstack) { if (!m_equiv) m_equiv =
> >>>>>>>>>>>> BITMAP_ALLOC (obstack); unsigned ver =
> >>>>>>>>>>>> SSA_NAME_VERSION (var); bitmap_set_bit
> >>>>>>>>>>>> (m_equiv, ver); if (var_vr && var_vr->m_equiv)
> >>>>>>>>>>>> bitmap_ior_into (m_equiv, var_vr->m_equiv); }
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is this a bug in the documentation / API, or is
> >>>>>>>>>>>> equiv_add incorrect and we should fix the
> >>>>>>>>>>>> fall-out elsewhere?
> >>>>>>>>>>>
> >>>>>>>>>>> I think this must have been crept in during the
> >>>>>>>>>>> classification. If you go back to say GCC 7 you
> >>>>>>>>>>> shouldn't see value-ranges with UNDEFINED/VARYING
> >>>>>>>>>>> state in the lattice that have equivalences.
> >>>>>>>>>>>
> >>>>>>>>>>> It may not be easy to avoid with the new classy
> >>>>>>>>>>> interface but we're certainly not tacking on them
> >>>>>>>>>>> "blindly".  At least we're not supposed to.  As
> >>>>>>>>>>> usual the intermediate state might be "broken"
> >>>>>>>>>>> but intermediateness is not sth the new class
> >>>>>>>>>>> "likes".
> >>>>>>>>>>
> >>>>>>>>>> It looks like extract_range_from_stmt (by virtue
> >>>>>>>>>> of vrp_visit_assignment_or_call and then
> >>>>>>>>>> extract_range_from_ssa_name) returns one of these
> >>>>>>>>>> intermediate ranges.  It would seem to me that an
> >>>>>>>>>> outward looking API method like
> >>>>>>>>>> vr_values::extract_range_from_stmt shouldn't be
> >>>>>>>>>> returning inconsistent ranges.  Or are there no
> >>>>>>>>>> guarantees for value_ranges from within all of
> >>>>>>>>>> vr_values?
> >>>>>>>>> ISTM that if we have an implementation constraint
> >>>>>>>>> that says a VR_VARYING or VR_UNDEFINED range can't
> >>>>>>>>> have equivalences, then we need to honor that at the
> >>>>>>>>> minimum for anything returned by an external API.
> >>>>>>>>> Returning an inconsistent state is bad.  I'd even
> >>>>>>>>> state that we should try damn hard to avoid it in
> >>>>>>>>> internal APIs as well.
> >>>>>>>>
> >>>>>>>> Agreed * 2.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Perhaps I should give a little background.  As part
> >>>>>>>>>> of your value_range_base re-factoring last year,
> >>>>>>>>>> you mentioned that you didn't split out intersect
> >>>>>>>>>> like you did union because of time or oversight.
> >>>>>>>>>> I have code to implement intersect (attached), for
> >>>>>>>>>> which I've noticed that I must leave equivalences
> >>>>>>>>>> intact, even when transitioning to VR_UNDEFINED:
> >>>>>>>>>>
> >>>>>>>>>> [from the attached patch] +  /* If THIS is varying
> >>>>>>>>>> we want to pick up equivalences from OTHER. +
> >>>>>>>>>> Just special-case this here rather than trying to
> >>>>>>>>>> fixup after the +     fact.  */ +  if
> >>>>>>>>>> (this->varying_p ()) +    this->deep_copy (other);
> >>>>>>>>>> +  else if (this->undefined_p ()) +    /* ?? Leave
> >>>>>>>>>> any equivalences already present in an undefined. +
> >>>>>>>>>> This is technically not allowed, but we may get an
> >>>>>>>>>> in-flight +       value_range in an intermediate
> >>>>>>>>>> state.  */
> >>>>>>>>> Where/when does this happen?
> >>>>>>>>
> >>>>>>>> The above snippet is not currently in mainline.  It's
> >>>>>>>> in the patch I'm proposing to clean up intersect.  It's
> >>>>>>>> just that while cleaning up intersect I noticed that if
> >>>>>>>> we keep to the value_range API, we end up clobbering an
> >>>>>>>> equivalence to a VR_UNDEFINED that we depend up further
> >>>>>>>> up the call chain.
> >>>>>>>>
> >>>>>>>> The reason it doesn't happen in mainline is because
> >>>>>>>> intersect_helper bails early on an undefined, thus
> >>>>>>>> leaving the problematic equivalence intact.
> >>>>>>>>
> >>>>>>>> You can see it in mainline though, with the following
> >>>>>>>> testcase:
> >>>>>>>>
> >>>>>>>> int f(int x) { if (x != 0 && x != 1) return -2;
> >>>>>>>>
> >>>>>>>> return !x; }
> >>>>>>>>
> >>>>>>>> Break in evrp_range_analyzer::record_ranges_from_stmt()
> >>>>>>>> and see that the call to extract_range_from_stmt()
> >>>>>>>> returns a VR of undefined *WITH* equivalences:
> >>>>>>>>
> >>>>>>>> vr_values->extract_range_from_stmt (stmt, &taken_edge,
> >>>>>>>> &output, &vr);
> >>>>>>>>
> >>>>>>>> This VR is later fed to update_value_range() and
> >>>>>>>> ultimately intersect(), which in mainline, leaves the
> >>>>>>>> equivalences alone, but IMO should respect that API and
> >>>>>>>> nuke them.
> >>>>>>> So I think this is coming from
> >>>>>>> extract_range_from_ssa_name:
> >>>>>>>
> >>>>>>>> void vr_values::extract_range_from_ssa_name
> >>>>>>>> (value_range *vr, tree var) { value_range *var_vr =
> >>>>>>>> get_value_range (var);
> >>>>>>>>
> >>>>>>>> if (!var_vr->varying_p ()) vr->deep_copy (var_vr);
> >>>>>>>> else vr->set (var);
> >>>>>>>>
> >>>>>>>> vr->equiv_add (var, get_value_range (var),
> >>>>>>>> &vrp_equiv_obstack); }
> >>>>>>>
> >>>>>>> Where we blindly add VAR to the equiv bitmap in the
> >>>>>>> range.
> >>>>>>>
> >>>>>>> AFAICT gcc-7 has equivalent code, it's just not inside
> >>>>>>> the class.
> >>>>>>>
> >>>>>>> I don't know what the fallout might be, but we could
> >>>>>>> conditionalize the call to add_equivalence to avoid the
> >>>>>>> inconsistent state.
> >>>>>>
> >>>>>> Well, the issue is that the equivalence _is_ there.  So
> >>>>>> above we know that we never end up with VARYING but the
> >>>>>> deep_copy can bring over UNDEFINED to us.  I guess we
> >>>>>> _could_ elide the equiv adding then.  OTOH the equivalence
> >>>>>> does look useful for predicate simplification which IIRC
> >>>>>> doesn't treat UNDEFINED != x in a useful way.  Which would
> >>>>>> mean allowing equivalences on UNDEFINED.  That somewhat
> >>>>>> makes sense since UNDEFINED is bottom-most in the lattice
> >>>>>> and one up (CONSTANT) we have equivalences while just on
> >>>>>> topmost (VARYING) we do not.
> >>>>>>
> >>>>>> That said, I never liked equivalences - they are an odd
> >>>>>> way to represent ranges intersected from multiple ranges
> >>>>>> thus a way out of our too simplistic range representation.
> >>>>>>
> >>>>>> So lets fix this in a way avoiding testsuite fallout.  But
> >>>>>> please not with special hacks in consumers.  Does guariding
> >>>>>> the above equiv_add with !vr->undefined_p () fix things?
> >>>>>
> >>>>> Agreed.  I never suggested we add special hacks to intersect.
> >>>>> The two-liner was only to explain why equivalences in
> >>>>> varying/undefined currently matter in intersect.
> >>>>>
> >>>>> Guarding the equiv_add causes regressions.  For example, for
> >>>>> this simple testcase we incorrectly get rid of the final
> >>>>> PHI:
> >>>>>
> >>>>> int f(int x) { if (x != 0 && x != 1) return -2;
> >>>>>
> >>>>> return !x; }
> >>>>>
> >>>>> In the evrp dump we now see:
> >>>>>
> >>>>> Visiting PHI node: _3 = PHI <-2(3), _5(4)> Argument #0 (3 ->
> >>>>> 5 executable) -2: int [-2, -2] Argument #1 (4 -> 5
> >>>>> executable) _5: UNDEFINED Meeting int [-2, -2] and UNDEFINED
> >>>>> to int [-2, -2]
> >>>>>
> >>>>> whereas before the change, argument #1 was:
> >>>>>
> >>>>> Argument #1 (4 -> 5 executable) _5: int [_5, _5]
> >>>>>
> >>>>> and the meeting result was VARYING.
> >>>>>
> >>>>> I *think* this starts somewhere up the call chain in
> >>>>> update_value_range, which as I've described earlier, will no
> >>>>> longer make a difference between UNDEFINED and UNDEFINED +
> >>>>> equivalence.  This causes that when transitioning from
> >>>>> UNDEFINED to UNDEFINED + equivalence, we actually transition
> >>>>> to VARYING:
> >>>>>
> >>>>> if (is_new) { if (old_vr->varying_p ()) { new_vr->set_varying
> >>>>> (); is_new = false; } else if (new_vr->undefined_p ()) {
> >>>>> old_vr->set_varying (); new_vr->set_varying (); return true;
> >>>>> }
> >>>>>
> >>>>> Could someone better versed in this take a peek, please?
> >>>>> It's just a matter of applying the attached one-liner, and
> >>>>> looking at "Visiting PHI" in the evrp dump.
> >>>> As I mentioned in IRC.  I know what's going on here and see a
> >>>> path to fix it.  Hoping to get a patch into my tester
> >>>> overnight, but life seems to be getting in the way.
> >>>
> >>> It's of course a latent issue.  One I fixed in DOMs use of the
> >>> EVRP machinery. The following fixes it in a conservative way:
> >>>
> >>> Index: gcc/gimple-ssa-evrp.c
> >>> ===================================================================
> >>>
> >>>
> --- gcc/gimple-ssa-evrp.c       (revision 271904)
> >>> +++ gcc/gimple-ssa-evrp.c       (working copy) @@ -175,6 +175,8
> >>> @@ evrp_dom_walker::before_dom_children (ba
> >>>
> >>> /* Try folding stmts with the VR discovered.  */ bool did_replace
> >>> = evrp_folder.replace_uses_in (stmt); +      gimple_stmt_iterator
> >>> prev_gsi = gsi; +      gsi_prev (&prev_gsi); if (fold_stmt (&gsi,
> >>> follow_single_use_edges) || did_replace) { @@ -191,6 +193,17 @@
> >>> evrp_dom_walker::before_dom_children (ba
> >>>
> >>> if (did_replace) { +         /* If we wound up generating new
> >>> stmts during folding +            drop all their defs to
> >>> VARYING. +            ???  Properly process them.  */ +
> >>> if (gsi_end_p (prev_gsi)) +           prev_gsi = gsi_start_bb
> >>> (bb); +         while (gsi_stmt (prev_gsi) != gsi_stmt (gsi)) +
> >>> { +             evrp_range_analyzer.get_vr_values () +
> >>> ->set_defs_to_varying (gsi_stmt (prev_gsi)); +
> >>> gsi_next (&prev_gsi); +           } /* If we cleaned up EH
> >>> information from the statement, remove EH edges.  */ if
> >>> (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
> >>>
> >>> the more "proper" fix requires keeping track of stmts already
> >>> processed (you don't want to re-process stmts, I ran into
> >>> "issues" when doing that for the same issue in DOM).  For DOM I
> >>> used the visited flag (see
> >>> dom_opt_dom_walker::before_dom_children).
> >>>
> >>> You might want to copy that handling.
> >>>
> >>> Just give me a note if you want to work on that, otherwise I'll
> >>> try to queue it for somewhen this week.Umm, I think you're
> >>> working around a problem in the simplifier code
> >> which creates SSA_NAMEs, leaving them in an VR_UNDEFINED state.
> >
> > But the simplifier is / can be just fold_stmt () - it doesn't know
> > it has to populate a value-range lattice.  But yes, the VRP specific
> > simplifier likely has the same issue in the EVRP context (in SSA VRP
> > all "new" SSA names are beyond the static allocated array and get
> > VARYING automatically).  So I think handling the "new stmts" in the
> > propagator is correct?
> I didn't realize we had so many calls to make_ssa_name in gimple-fold.

It's not only those literally appearing but when match.pd simplifies
to an expression that requires a temporary we create one as well.

> I was primarily concerned with the ones in the evrp simplification
> engine which are easy to fix in-place.  Looking at gimple-fold.c I agree
> we do need to address the walking problem.
>
> I'm not sure we can drop to varying though -- my first twiddle of this
> did precisely that, but that'll likely regress vrp47 where we know the
> resultant range after simplification is just [0,1].

Of course we do not need to drop the original LHS to VARYING, only
the defs we didn't already visit.

> Ideally we wouldn't have the simplifiers creating new names or we'd have
> a more robust mechanism for identifying when we've created a new name
> and doing the right thing.  I suspect whatever we do here right now is
> going to be a bandaid and as long as we keep creating new names in the
> simplifier we're likely going to be coming back and applying more bandaids.

Re-visiting the stmts would be possible if we'd split registering ranges derived
from uses of a stmt from those of defs (IIRC I had incomplete patches to do that
when trying to derive X != 0 from Y / X because otherwise we miscompile stuff).

Meanwhile I have bootstrapped / tested the following which does the VARYING
thing.

Applied to trunk.  I think we need to backport this since this is a latent
wrong-code issue.  We can see to improve things on the trunk incrementally.

Richard.

2019-06-06  Richard Biener  <rguenther@suse.de>

        * vr-values.c (vr_values::extract_range_from_ssa_name): Do not
        put equivalences on UNDEFINED ranges.
        * gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children):
        Make sure to drop defs of stmts added during simplification
        to VARYING.



> jeff
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p2-2
Type: application/octet-stream
Size: 2060 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20190606/a3a08761/attachment.obj>


More information about the Gcc-patches mailing list