This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Make strlen range computations more conservative


On Tue, 7 Aug 2018, Martin Sebor wrote:

> On 08/07/2018 11:44 AM, Richard Biener wrote:
> > On August 7, 2018 4:37:00 PM GMT+02:00, Martin Sebor <msebor@gmail.com>
> > wrote:
> > > On 08/07/2018 02:51 AM, Richard Biener wrote:
> > > > On August 7, 2018 4:24:42 AM GMT+02:00, Martin Sebor
> > > <msebor@gmail.com> wrote:
> > > > > On 07/26/2018 02:55 AM, Richard Biener wrote:
> > > > > > On Wed, 25 Jul 2018, Martin Sebor wrote:
> > > > > > 
> > > > > > > > BUT - for the string_constant and c_strlen functions we are,
> > > > > > > > in all cases we return something interesting, able to look
> > > > > > > > at an initializer which then determines that type.  Hopefully.
> > > > > > > > I think the strlen() folding code when it sets SSA ranges
> > > > > > > > now looks at types ...?
> > > > > > > > 
> > > > > > > > Consider
> > > > > > > > 
> > > > > > > > struct X { int i; char c[4]; int j;};
> > > > > > > > struct Y { char c[16]; };
> > > > > > > > 
> > > > > > > > void foo (struct X *p, struct Y *q)
> > > > > > > > {
> > > > > > > >   memcpy (p, q, sizeof (struct Y));
> > > > > > > >   if (strlen ((char *)(struct Y *)p + 4) < 7)
> > > > > > > >     abort ();
> > > > > > > > }
> > > > > > > > 
> > > > > > > > here the GIMPLE IL looks like
> > > > > > > > 
> > > > > > > >   const char * _1;
> > > > > > > > 
> > > > > > > >   <bb 2> [local count: 1073741825]:
> > > > > > > >   _5 = MEM[(char * {ref-all})q_4(D)];
> > > > > > > >   MEM[(char * {ref-all})p_6(D)] = _5;
> > > > > > > >   _1 = p_6(D) + 4;
> > > > > > > >   _2 = __builtin_strlen (_1);
> > > > > > > > 
> > > > > > > > and I guess Martin would argue that since p is of type struct X
> > > > > > > > + 4 gets you to c[4] and thus strlen of that cannot be larger
> > > > > > > > than 3.  But of course the middle-end doesn't work like that
> > > > > > > > and luckily we do not try to draw such conclusions or we
> > > > > > > > are somehow lucky that for the testcase as written above we do
> > > not
> > > > > > > > (I'm not sure whether Martins changes in this area would derive
> > > > > > > > such conclusions in principle).
> > > > > > > 
> > > > > > > Only if the strlen argument were p->c.
> > > > > > > 
> > > > > > > > NOTE - we do not know the dynamic type here since we do not know
> > > > > > > > the dynamic type of the memory pointed-to by q!  We can only
> > > > > > > > derive that at q+4 there must be some object that we can
> > > > > > > > validly call strlen on (where Martin again thinks strlen
> > > > > > > > imposes constrains that memchr does not - sth I do not agree
> > > > > > > > with from a QOI perspective)
> > > > > > > 
> > > > > > > The dynamic type is a murky area.
> > > > > > 
> > > > > > It's well-specified in the middle-end.  A store changes the
> > > > > > dynamic type of the stored-to object.  If that type is
> > > > > > compatible with the surrounding objects dynamic type that one
> > > > > > is not affected, if not then the surrounding objects dynamic
> > > > > > type becomes unspecified.  There is TYPE_TYPELESS_STORAGE
> > > > > > to somewhat control "compatibility" of subobjects.
> > > > > 
> > > > > I never responded to this.  Using a dynamic (effective) type as
> > > > > you describe it would invalidate the aggressive loop optimization
> > > > > in the following:
> > > > > 
> > > > >   void foo (struct X *p)
> > > > >   {
> > > > >       struct Y y = { "12345678" };
> > > > >       memcpy (p, &y, sizeof (struct Y));
> > > > > 
> > > > >       // *p effective type is now struct Y
> > > > > 
> > > > >       int n = 0;
> > > > >       while (p->c[n])
> > > > >         ++n;
> > > > > 
> > > > >       if (n < 7)
> > > > >         abort ();
> > > > >   }
> > > > > 
> > > > > GCC unconditionally aborts, just as it does with strlen(p->c).
> > > > > Why is that not wrong (in either case)?
> > > > > 
> > > > > Because the code is invalid either way, for two reasons:
> > > > 
> > > > No, because the storage has only 4 non-null characters starting at
> > > offset 4?
> > > 
> > > No, for the reasons below.  I made a mistake of making
> > > the initializer string too short.  If we make it longer it
> > > still aborts.  Say with this
> > > 
> > >       struct Y y = { "123456789012345" };
> > > 
> > > we end up with this DCE:
> > > 
> > >  struct Y y;
> > > 
> > >   <bb 2> :
> > >   MEM[(char * {ref-all})p_6(D)] = 0x353433323130393837363534333231;
> > >   __builtin_abort ();
> > > 
> > > With -fdump-tree-cddce1-details (and a patch to show the upper
> > > bound) we see:
> > > 
> > >   Found loop 1 to be finite: upper bound found: 3.
> > > 
> > > With -fno-aggressive-loop-optimizations the abort becomes
> > > conditional because the array bound isn't considered. I would
> > > expect you to know this since you implemented the feature.
> > 
> > Honza added the array indexing part and it may very well be too aggressive.
> > I have to take a closer look after vacation to tell. Can you open a PR and
> > CC me there?
> 
> I opened bug 86884.

Now that I returned from vacation the testcase is simply bogus.  The
access p->c[n] requires an affective type of X.

Richard.

> Martin
> 
> > 
> > Richard.
> > 
> > > 
> > > Martin
> > > > 
> > > > > 1) it accesses an object of (effective) type struct Y via
> > > > >    an lvalue of type struct X (specifically via (*p).c)
> > > > > 2) it relies on p->c
> > > > > 
> > > > > The loop optimization relies on the exact same requirement
> > > > > as the strlen one.  Either they are both valid or neither is.
> > > > > 
> > > > > Martin
> > > > 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]