This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Make strlen range computations more conservative

On 07/26/2018 02:55 AM, Richard Biener wrote:
On Wed, 25 Jul 2018, Martin Sebor wrote:

BUT - for the string_constant and c_strlen functions we are,
in all cases we return something interesting, able to look
at an initializer which then determines that type.  Hopefully.
I think the strlen() folding code when it sets SSA ranges
now looks at types ...?


struct X { int i; char c[4]; int j;};
struct Y { char c[16]; };

void foo (struct X *p, struct Y *q)
  memcpy (p, q, sizeof (struct Y));
  if (strlen ((char *)(struct Y *)p + 4) < 7)
    abort ();

here the GIMPLE IL looks like

  const char * _1;

  <bb 2> [local count: 1073741825]:
  _5 = MEM[(char * {ref-all})q_4(D)];
  MEM[(char * {ref-all})p_6(D)] = _5;
  _1 = p_6(D) + 4;
  _2 = __builtin_strlen (_1);

and I guess Martin would argue that since p is of type struct X
+ 4 gets you to c[4] and thus strlen of that cannot be larger
than 3.  But of course the middle-end doesn't work like that
and luckily we do not try to draw such conclusions or we
are somehow lucky that for the testcase as written above we do not
(I'm not sure whether Martins changes in this area would derive
such conclusions in principle).

Only if the strlen argument were p->c.

NOTE - we do not know the dynamic type here since we do not know
the dynamic type of the memory pointed-to by q!  We can only
derive that at q+4 there must be some object that we can
validly call strlen on (where Martin again thinks strlen
imposes constrains that memchr does not - sth I do not agree
with from a QOI perspective)

The dynamic type is a murky area.

It's well-specified in the middle-end.  A store changes the
dynamic type of the stored-to object.  If that type is
compatible with the surrounding objects dynamic type that one
is not affected, if not then the surrounding objects dynamic
type becomes unspecified.  There is TYPE_TYPELESS_STORAGE
to somewhat control "compatibility" of subobjects.

I never responded to this.  Using a dynamic (effective) type as
you describe it would invalidate the aggressive loop optimization
in the following:

  void foo (struct X *p)
      struct Y y = { "12345678" };
      memcpy (p, &y, sizeof (struct Y));

      // *p effective type is now struct Y

      int n = 0;
      while (p->c[n])

      if (n < 7)
        abort ();

GCC unconditionally aborts, just as it does with strlen(p->c).
Why is that not wrong (in either case)?

Because the code is invalid either way, for two reasons:

1) it accesses an object of (effective) type struct Y via
   an lvalue of type struct X (specifically via (*p).c)
2) it relies on p->c

The loop optimization relies on the exact same requirement
as the strlen one.  Either they are both valid or neither is.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]