[PATCH] consider MIN_EXPR in get_size_range() (PR 85888)

Richard Biener richard.guenther@gmail.com
Fri May 25 07:16:00 GMT 2018


On Thu, May 24, 2018 at 11:22 PM Martin Sebor <msebor@gmail.com> wrote:

> On 05/24/2018 11:15 AM, Richard Biener wrote:
> > On May 24, 2018 7:02:17 PM GMT+02:00, Martin Sebor <msebor@gmail.com>
wrote:
> >> On 05/24/2018 03:39 AM, Richard Biener wrote:
> >>> On Thu, May 24, 2018 at 12:50 AM Martin Sebor <msebor@gmail.com>
> >> wrote:
> >>>
> >>>> The attached patch enhances the get_size_range() function to
> >>>> extract more accurate ranges out of MIN_EXPR and MAX_EXPR nodes
> >>>> with SSA_NAME operand(s).  This helps -Wstringop-overflow avoid
> >>>> false positives on some targets in cases like some of those
> >>>> reported in bug 85623 - strncmp() warns about attribute nonstring
> >>>> incorrectly in -Wstringop-overflow
> >>>
> >>>> When first trying to expand calls to __builtin_strncmp, most back
> >>>> ends succeed even when the expansion results in a call to the
> >> library
> >>>> function.  The powerpc64 back-end, however, fails and and allows
> >>>> the generic expansion to take place.  There is a subtle difference
> >>>> between the two cases that shows up when examining the bound of
> >>>> the strncmp call expression (the third argument): in the original
> >>>> call expression (in expand_builtin_strncmp) the bound is or can be
> >>>> an SSA_NAME.  But when the early expansion fails,
> >>>> expand_builtin_strncmp transforms the original call expression into
> >>>> a new one, substituting MIN_EXPR (bound, CST) for the bound where
> >>>> CST is the constant result of strlen (STR), with STR being either
> >>>> the first or second strncmp argument.  When the bound is an SSA_NAME
> >>>> the subsequent attempt to determine its range from the MIN_EXPR
> >> fails.
> >>>
> >>> Hmm, wouldn't sth extracted from the attached random patch I have
> >> lying
> >>> around be more useful in general?  That is, refactor the GENERIC
> >>> expression walk in determine_value_range_1 to sth that can be used
> >>> in the current get_range_info () interface?  Or if we don't want to
> >> change
> >>> that to accept non-SSA names add a corresponding get_expr_range_info
> >> ()
> >>> interface.
> >>
> >> Thanks.  It is certainly more general.  I'm just not sure how much
> >> use the generality can be put to because...
> >>
> >>> If you do that please cut out the dominator/loop condition handling
> >> and
> >>> refrain from the idea of looking at SSA def stmts.
> >>
> >> ...I've done that but I'm having trouble exercising the code except
> >> for this bug.  (I've run the C testsuite but the only tests that end
> >> up in it call it with uninteresting arguments like VAR_DECL).  Do
> >> you have any suggestions?
> >
> > Well, what builds the MIN_EXPR for your testcase? I orinigally used it
for niters analysis which deals with complex GENERIC expressions. If you
just changed get_range_info then of course callers are guarded with SSA
name checks.

> In pr85888 it's built by expand_builtin_strncmp (so very late).
> It doesn't seem like a good use case because of this code:

>     /* Expand the library call ourselves using a stabilized argument
>        list to avoid re-evaluating the function's arguments twice.  */
>     tree fn = build_call_nofold_loc (loc, fndecl, 3, arg1, arg2, len);
>     gcc_assert (TREE_CODE (fn) == CALL_EXPR);
>     CALL_EXPR_TAILCALL (fn) = CALL_EXPR_TAILCALL (exp);
>     return expand_call (fn, target, target == const0_rtx);

> AFAICS, the new call expression is the same as the old one, the only
> difference is the LEN argument which is the MIN_EXPR.

> The code was added in r72380 where the stabilization referred to
> calling save_expr() on the arguments.  That code has been gone
> since r242556 so I think a better/more targeted solution to fix
> pr85888 than either of our approaches might be to simply call
> expand_call() on the original CALL_EXPR with the original
> arguments, including ARG3.

> I don't fully understand the point of the save_expr() calls there
> (and they're still in builtin_expand_strcmp), so if they need to
> be restored then they would presumably include ARG3.  I.e.,
> expand_call() would be called using ARG3 in the original SSA_NAME
> form rather than the MIN_EXPR wrapper created by the function.

> Attached is a patch that implements the first approach above.

> If the SAVE_EXPRs are needed for some reason, it would be good
> to have a test case to verify it.  I haven't been able to come
> up with one (or find an existing test that fails due to their
> removal) but that could very well be because my limited
> understanding of what they're for (and poor testsuite coverage).

> I'm also up for taking your patch and trying to make use of it
> for other trees where get_range_info() is currently being called
> for SSA_NAMEs only.  But it feels like a separate project from
> this bug fix.

But your patch makes the whole 'len' computation after

   /* If we don't have a constant length for the first, use the length
      of the second, if we know it.  If neither string is constant length,
      use the given length argument.  We don't require a constant for
      this case; some cost analysis could be done if both are available
      but neither is constant.  For now, assume they're equally cheap,
      unless one has side effects.  If both strings have constant lengths,
      use the smaller.  */

   if (!len1 && !len2)
     len = len3;
   else if (!len1)
...
   /* If we are not using the given length, we must incorporate it here.
      The actual new length parameter will be MIN(len,arg3) in this case.  */
   if (len != len3)
     len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len, len3);

dead if expand_cmpstrn_or_cmpmem doesn't expand things inline.
I believe the idea is to possibly reduce 'len' for expansion of the
libcall.

OTOH all this "optimization" should have been done as folding
and not at RTL expansion time.  Thus if the issue is getting
a warning from the recursive re-expansion of the adjusted
call then removing this "postmature" optimization and making
sure it happens earlier as folding might be indeed a good thing.

But your patch as-is looks like it loses some optimization
(and I'm not caring about "folding" at -O0).

Richard.

> Martin

> >
> >> Martin
> >>
> >> PS I was able to create a test that at -O0 could call it with
> >> a more interesting argument such as:
> >>
> >>   const __SIZE_TYPE__ i = 3, j = 5;
> >>
> >>   int f (void)
> >>   {
> >>     return __builtin_strncmp ("1234", "123", i < j ? i : j);
> >>   }
> >>
> >> Here, the first call to gimple_fold_builtin_string_compare() that
> >> I picked as a test case sees:
> >>
> >>   j.0_1 = 5;
> >>   i.1_2 = 3;
> >>   _3 = MIN_EXPR <j.0_1, i.1_2>;
> >>   D.1963 = __builtin_strncmp ("1234", "123", _3);
> >>
> >> Above, _3's range info is not available yet at this point so
> >> the expression could be evaluated but only by looking at def stmts.
> >> Why is it not a good idea?
> >
> > Because it raises complexity concerns unless you also populate
intermediate SSA range Infos and stop at present ones. And you can't deal
with conditional info as the patch tries to do (but I wouldn't keep that
either initially).
> >
> > Richard.
> >
> >>
> >> Martin
> >



More information about the Gcc-patches mailing list