PR78153

Richard Biener rguenther@suse.de
Tue Nov 22 14:48:00 GMT 2016


On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:

> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:
> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> As suggested by Martin in PR78153 strlen's return value cannot exceed
> >> PTRDIFF_MAX.
> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
> >> in the attached patch.
> >>
> >> However it regressed strlenopt-3.c:
> >>
> >> Consider fn1() from strlenopt-3.c:
> >>
> >> __attribute__((noinline, noclone)) size_t
> >> fn1 (char *p, char *q)
> >> {
> >>   size_t s = strlen (q);
> >>   strcpy (p, q);
> >>   return s - strlen (p);
> >> }
> >>
> >> The optimized dump shows the following:
> >>
> >> __attribute__((noclone, noinline))
> >> fn1 (char * p, char * q)
> >> {
> >>   size_t s;
> >>   size_t _7;
> >>   long unsigned int _9;
> >>
> >>   <bb 2>:
> >>   s_4 = strlen (q_3(D));
> >>   _9 = s_4 + 1;
> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >>   _7 = 0;
> >>   return _7;
> >>
> >> }
> >>
> >> which introduces the regression, because the test expects "return 0;" in fn1().
> >>
> >> The issue seems to be in vrp2:
> >>
> >> Before the patch:
> >> Visiting statement:
> >> s_4 = strlen (q_3(D));
> >> Found new range for s_4: VARYING
> >>
> >> Visiting statement:
> >> _1 = s_4;
> >> Found new range for _1: [s_4, s_4]
> >> marking stmt to be not simulated again
> >>
> >> Visiting statement:
> >> _7 = s_4 - _1;
> >> Applying pattern match.pd:111, gimple-match.c:27997
> >> Match-and-simplified s_4 - _1 to 0
> >> Intersecting
> >>   [0, 0]
> >> and
> >>   [0, +INF]
> >> to
> >>   [0, 0]
> >> Found new range for _7: [0, 0]
> >>
> >> __attribute__((noclone, noinline))
> >> fn1 (char * p, char * q)
> >> {
> >>   size_t s;
> >>   long unsigned int _1;
> >>   long unsigned int _9;
> >>
> >>   <bb 2>:
> >>   s_4 = strlen (q_3(D));
> >>   _9 = s_4 + 1;
> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >>   _1 = s_4;
> >>   return 0;
> >>
> >> }
> >>
> >>
> >> After the patch:
> >> Visiting statement:
> >> s_4 = strlen (q_3(D));
> >> Intersecting
> >>   [0, 9223372036854775806]
> >> and
> >>   [0, 9223372036854775806]
> >> to
> >>   [0, 9223372036854775806]
> >> Found new range for s_4: [0, 9223372036854775806]
> >> marking stmt to be not simulated again
> >>
> >> Visiting statement:
> >> _1 = s_4;
> >> Intersecting
> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> >> and
> >>   [0, 9223372036854775806]
> >> to
> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> >> Found new range for _1: [0, 9223372036854775806]
> >> marking stmt to be not simulated again
> >>
> >> Visiting statement:
> >> _7 = s_4 - _1;
> >> Intersecting
> >>   ~[9223372036854775807, 9223372036854775809]
> >> and
> >>   ~[9223372036854775807, 9223372036854775809]
> >> to
> >>   ~[9223372036854775807, 9223372036854775809]
> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]
> >> marking stmt to be not simulated again
> >>
> >> __attribute__((noclone, noinline))
> >> fn1 (char * p, char * q)
> >> {
> >>   size_t s;
> >>   long unsigned int _1;
> >>   size_t _7;
> >>   long unsigned int _9;
> >>
> >>   <bb 2>:
> >>   s_4 = strlen (q_3(D));
> >>   _9 = s_4 + 1;
> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >>   _1 = s_4;
> >>   _7 = s_4 - _1;
> >>   return _7;
> >>
> >> }
> >>
> >> Then forwprop4 turns
> >> _1 = s_4
> >> _7 = s_4 - _1
> >> into
> >> _7 = 0
> >>
> >> and we end up with:
> >> _7 = 0
> >> return _7
> >> in optimized dump.
> >>
> >> Running ccp again after forwprop4 trivially solves the issue, however
> >> I am not sure if we want to run ccp again ?
> >>
> >> The issue is probably with extract_range_from_ssa_name():
> >> For _1 = s_4
> >>
> >> Before patch:
> >> VR for s_4 is set to varying.
> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,
> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
> >> match.pd pattern x - x -> 0).
> >>
> >> After patch:
> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]
> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
> >> so IIUC, we then lose the information that _1 is equal to s_4,
> >
> > We don't lose it, it's in its set of equivalencies.
> Ah, I missed that, thanks. For some reason I had mis-conception that
> equivalences stores
> variables which have same value-ranges but are not necessarily equal.
> >
> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.
> >> Does this sound correct ?
> >
> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls
> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
> > we do not have a singleton VR_RANGE does not fall back to looking
> > at equivalences (there's not a good cheap way to do that currently because
> > VRP doesn't keep a proper copy lattice but simply IORs equivalences
> > from all equivalences).  In theory simply using the first set bit
> > might work.  Thus sth like
> >
> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
> >               || is_gimple_min_invariant (vr->min))
> >           && vrp_operand_equal_p (vr->min, vr->max))
> >         return vr->min;
> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
> > +       {
> > +         unsigned num = bitmap_first_set_bit (vr->equiv);
> > +         if (num < SSA_NAME_VERSION (name))
> > +           return ssa_name (num);
> > +       }
> >      }
> >    return name;
> >  }
> >
> > might work with the idea of simply doing canonicalization to one of
> > the equivalences.  But as we don't allow copies in the SSA def stmt
> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.
> IIUC, we record the equivalent variables in vr->equiv
> but do not canonicalize to one of the equivalence like "copy-of value"
> in copyprop ?
> Using first set bit unfortunately doesn't help for the above case.
> 
> Sorry if this sounds silly, should we just run copyprop/ccp once again
> after vrp2 to ensure that there are no copies left ?

why?  forwprop also does copy and constant propagation.  For the 
regression simply adjust the pass dump you scan.

> However that might be quite expensive ?
> Or make vrp track copies like copyprop using a separate copy-of lattice ?

Ideally we'd unify the three SSA propagation passes into one.  We'd
have to have separate lattices for copy&constant and range&known-bits.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)



More information about the Gcc-patches mailing list