This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PR78153


On Wed, 23 Nov 2016, Prathamesh Kulkarni wrote:

> On 23 November 2016 at 15:16, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:
> >
> >> On 22 November 2016 at 20:53, Richard Biener <rguenther@suse.de> wrote:
> >> > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 22 November 2016 at 20:18, Richard Biener <rguenther@suse.de> wrote:
> >> >> > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:
> >> >> >
> >> >> >> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:
> >> >> >> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:
> >> >> >> >
> >> >> >> >> Hi,
> >> >> >> >> As suggested by Martin in PR78153 strlen's return value cannot exceed
> >> >> >> >> PTRDIFF_MAX.
> >> >> >> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
> >> >> >> >> in the attached patch.
> >> >> >> >>
> >> >> >> >> However it regressed strlenopt-3.c:
> >> >> >> >>
> >> >> >> >> Consider fn1() from strlenopt-3.c:
> >> >> >> >>
> >> >> >> >> __attribute__((noinline, noclone)) size_t
> >> >> >> >> fn1 (char *p, char *q)
> >> >> >> >> {
> >> >> >> >>   size_t s = strlen (q);
> >> >> >> >>   strcpy (p, q);
> >> >> >> >>   return s - strlen (p);
> >> >> >> >> }
> >> >> >> >>
> >> >> >> >> The optimized dump shows the following:
> >> >> >> >>
> >> >> >> >> __attribute__((noclone, noinline))
> >> >> >> >> fn1 (char * p, char * q)
> >> >> >> >> {
> >> >> >> >>   size_t s;
> >> >> >> >>   size_t _7;
> >> >> >> >>   long unsigned int _9;
> >> >> >> >>
> >> >> >> >>   <bb 2>:
> >> >> >> >>   s_4 = strlen (q_3(D));
> >> >> >> >>   _9 = s_4 + 1;
> >> >> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >> >> >> >>   _7 = 0;
> >> >> >> >>   return _7;
> >> >> >> >>
> >> >> >> >> }
> >> >> >> >>
> >> >> >> >> which introduces the regression, because the test expects "return 0;" in fn1().
> >> >> >> >>
> >> >> >> >> The issue seems to be in vrp2:
> >> >> >> >>
> >> >> >> >> Before the patch:
> >> >> >> >> Visiting statement:
> >> >> >> >> s_4 = strlen (q_3(D));
> >> >> >> >> Found new range for s_4: VARYING
> >> >> >> >>
> >> >> >> >> Visiting statement:
> >> >> >> >> _1 = s_4;
> >> >> >> >> Found new range for _1: [s_4, s_4]
> >> >> >> >> marking stmt to be not simulated again
> >> >> >> >>
> >> >> >> >> Visiting statement:
> >> >> >> >> _7 = s_4 - _1;
> >> >> >> >> Applying pattern match.pd:111, gimple-match.c:27997
> >> >> >> >> Match-and-simplified s_4 - _1 to 0
> >> >> >> >> Intersecting
> >> >> >> >>   [0, 0]
> >> >> >> >> and
> >> >> >> >>   [0, +INF]
> >> >> >> >> to
> >> >> >> >>   [0, 0]
> >> >> >> >> Found new range for _7: [0, 0]
> >> >> >> >>
> >> >> >> >> __attribute__((noclone, noinline))
> >> >> >> >> fn1 (char * p, char * q)
> >> >> >> >> {
> >> >> >> >>   size_t s;
> >> >> >> >>   long unsigned int _1;
> >> >> >> >>   long unsigned int _9;
> >> >> >> >>
> >> >> >> >>   <bb 2>:
> >> >> >> >>   s_4 = strlen (q_3(D));
> >> >> >> >>   _9 = s_4 + 1;
> >> >> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >> >> >> >>   _1 = s_4;
> >> >> >> >>   return 0;
> >> >> >> >>
> >> >> >> >> }
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> After the patch:
> >> >> >> >> Visiting statement:
> >> >> >> >> s_4 = strlen (q_3(D));
> >> >> >> >> Intersecting
> >> >> >> >>   [0, 9223372036854775806]
> >> >> >> >> and
> >> >> >> >>   [0, 9223372036854775806]
> >> >> >> >> to
> >> >> >> >>   [0, 9223372036854775806]
> >> >> >> >> Found new range for s_4: [0, 9223372036854775806]
> >> >> >> >> marking stmt to be not simulated again
> >> >> >> >>
> >> >> >> >> Visiting statement:
> >> >> >> >> _1 = s_4;
> >> >> >> >> Intersecting
> >> >> >> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> >> >> >> >> and
> >> >> >> >>   [0, 9223372036854775806]
> >> >> >> >> to
> >> >> >> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> >> >> >> >> Found new range for _1: [0, 9223372036854775806]
> >> >> >> >> marking stmt to be not simulated again
> >> >> >> >>
> >> >> >> >> Visiting statement:
> >> >> >> >> _7 = s_4 - _1;
> >> >> >> >> Intersecting
> >> >> >> >>   ~[9223372036854775807, 9223372036854775809]
> >> >> >> >> and
> >> >> >> >>   ~[9223372036854775807, 9223372036854775809]
> >> >> >> >> to
> >> >> >> >>   ~[9223372036854775807, 9223372036854775809]
> >> >> >> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]
> >> >> >> >> marking stmt to be not simulated again
> >> >> >> >>
> >> >> >> >> __attribute__((noclone, noinline))
> >> >> >> >> fn1 (char * p, char * q)
> >> >> >> >> {
> >> >> >> >>   size_t s;
> >> >> >> >>   long unsigned int _1;
> >> >> >> >>   size_t _7;
> >> >> >> >>   long unsigned int _9;
> >> >> >> >>
> >> >> >> >>   <bb 2>:
> >> >> >> >>   s_4 = strlen (q_3(D));
> >> >> >> >>   _9 = s_4 + 1;
> >> >> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >> >> >> >>   _1 = s_4;
> >> >> >> >>   _7 = s_4 - _1;
> >> >> >> >>   return _7;
> >> >> >> >>
> >> >> >> >> }
> >> >> >> >>
> >> >> >> >> Then forwprop4 turns
> >> >> >> >> _1 = s_4
> >> >> >> >> _7 = s_4 - _1
> >> >> >> >> into
> >> >> >> >> _7 = 0
> >> >> >> >>
> >> >> >> >> and we end up with:
> >> >> >> >> _7 = 0
> >> >> >> >> return _7
> >> >> >> >> in optimized dump.
> >> >> >> >>
> >> >> >> >> Running ccp again after forwprop4 trivially solves the issue, however
> >> >> >> >> I am not sure if we want to run ccp again ?
> >> >> >> >>
> >> >> >> >> The issue is probably with extract_range_from_ssa_name():
> >> >> >> >> For _1 = s_4
> >> >> >> >>
> >> >> >> >> Before patch:
> >> >> >> >> VR for s_4 is set to varying.
> >> >> >> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
> >> >> >> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,
> >> >> >> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
> >> >> >> >> match.pd pattern x - x -> 0).
> >> >> >> >>
> >> >> >> >> After patch:
> >> >> >> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]
> >> >> >> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
> >> >> >> >> so IIUC, we then lose the information that _1 is equal to s_4,
> >> >> >> >
> >> >> >> > We don't lose it, it's in its set of equivalencies.
> >> >> >> Ah, I missed that, thanks. For some reason I had mis-conception that
> >> >> >> equivalences stores
> >> >> >> variables which have same value-ranges but are not necessarily equal.
> >> >> >> >
> >> >> >> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
> >> >> >> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.
> >> >> >> >> Does this sound correct ?
> >> >> >> >
> >> >> >> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls
> >> >> >> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
> >> >> >> > we do not have a singleton VR_RANGE does not fall back to looking
> >> >> >> > at equivalences (there's not a good cheap way to do that currently because
> >> >> >> > VRP doesn't keep a proper copy lattice but simply IORs equivalences
> >> >> >> > from all equivalences).  In theory simply using the first set bit
> >> >> >> > might work.  Thus sth like
> >> >> >> >
> >> >> >> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
> >> >> >> >               || is_gimple_min_invariant (vr->min))
> >> >> >> >           && vrp_operand_equal_p (vr->min, vr->max))
> >> >> >> >         return vr->min;
> >> >> >> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
> >> >> >> > +       {
> >> >> >> > +         unsigned num = bitmap_first_set_bit (vr->equiv);
> >> >> >> > +         if (num < SSA_NAME_VERSION (name))
> >> >> >> > +           return ssa_name (num);
> >> >> >> > +       }
> >> >> >> >      }
> >> >> >> >    return name;
> >> >> >> >  }
> >> >> >> >
> >> >> >> > might work with the idea of simply doing canonicalization to one of
> >> >> >> > the equivalences.  But as we don't allow copies in the SSA def stmt
> >> >> >> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.
> >> >> >> IIUC, we record the equivalent variables in vr->equiv
> >> >> >> but do not canonicalize to one of the equivalence like "copy-of value"
> >> >> >> in copyprop ?
> >> >> >> Using first set bit unfortunately doesn't help for the above case.
> >> >> >>
> >> >> >> Sorry if this sounds silly, should we just run copyprop/ccp once again
> >> >> >> after vrp2 to ensure that there are no copies left ?
> >> >> >
> >> >> > why?  forwprop also does copy and constant propagation.  For the
> >> >> > regression simply adjust the pass dump you scan.
> >> >> Well, with the patch the redundant store to and load from _7 still remains
> >> >> in optimized dump for fn1() in strlenopt-3.c:
> >> >>
> >> >> __attribute__((noclone, noinline))
> >> >> fn1 (char * p, char * q)
> >> >> {
> >> >>   size_t s;
> >> >>   size_t _7;
> >> >>   long unsigned int _9;
> >> >>
> >> >>   <bb 2>:
> >> >>   s_4 = strlen (q_3(D));
> >> >>   _9 = s_4 + 1;
> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
> >> >>   _7 = 0;
> >> >>   return _7;
> >> >>
> >> >> }
> >> >>
> >> >> Running ccp again after forwprop4 would get rid of _7.
> >> >> Without the patch we have return _0; in optimized dump.
> >> >
> >> > Ah, but then that's a missing "folding" of the return.  It's not
> >> > a load/store anyway.
> >> Hi Richard,
> >> Thanks for the suggestion. In the attached untested patch, I tried to
> >> modify forwprop to fold return-value to constant.
> >> The optimized dump shows return 0; for the above test-case with this patch.
> >> Does it look OK ?
> >
> > No, the fix is to make fold_stmt_1 handle GIMPLE_RETURN and simply
> > valueize the return value (note 'valueize' might return NULL or be NULL).
> >
> Hi Richard,
> Does the attached patch look OK ? I verified it prevents the
> regression for above case.
> Bootstrap+test on x86_64-unknown-linux-gnu in progress.

+           tree val = valueize (ret);
+           if (val && TREE_CONSTANT (val))
+             {

ok apart from the TREE_CONSTANT check which should be necessary
(it misses applying copy propagation).

Thanks,
Richard.

> Thanks,
> Prathamesh
> > Richard.
> >
> >>
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Richard.
> >> >
> >> >> Thanks,
> >> >> Prathamesh
> >> >> >
> >> >> >> However that might be quite expensive ?
> >> >> >> Or make vrp track copies like copyprop using a separate copy-of lattice ?
> >> >> >
> >> >> > Ideally we'd unify the three SSA propagation passes into one.  We'd
> >> >> > have to have separate lattices for copy&constant and range&known-bits.
> >> >> >
> >> >> > Richard.
> >> >> >
> >> >> >> Thanks,
> >> >> >> Prathamesh
> >> >> >> >
> >> >> >> > Richard.
> >> >> >>
> >> >> >>
> >> >> >
> >> >> > --
> >> >> > Richard Biener <rguenther@suse.de>
> >> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >> >>
> >> >>
> >> >
> >> > --
> >> > Richard Biener <rguenther@suse.de>
> >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]