This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)


Thanks for looking at this!  I realize it's dense code and not easy
to make sense out of.

PR middle-end/78622 - [7 Regression]
-Wformat-length/-fprintf-return-value incorrect with overflow/wrapping

gcc/ChangeLog:

    PR middle-end/78622
    * gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange
    rather than res.bounded.
    (adjust_range_for_overflow): New function.
    (format_integer): Always set res.bounded to true unless either
    precision or width is specified and unknown.
    Call adjust_range_for_overflow.
    (format_directive): Remove vestigial quoting.
    (add_bytes): Correct the computation of boundrange used to
    decide whether a warning is of a "maybe" or "defnitely" kind.
s/defnitely/definitely/

Will fix.

+/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual
+   argument, due to the conversion from either *ARGMIN or *ARGMAX to
+   the type of the directive's formal argument it's possible for both
+   to result in the same number of bytes or a range of bytes that's
+   less than the number of bytes that would result from formatting
+   some other value in the range [*ARGMIN, *ARGMAX].  This can be
+   determined by checking for the actual argument being in the range
+   of the type of the directive.  If it isn't it must be assumed to
+   take on the full range of the directive's type.
+   Return true when the range has been adjusted to the range of
+   DIRTYPE, false otherwise.  */
I wish I'd counted how many times I read that.

Let me see if I can word it better.


+
+static bool
+adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
+{
+  unsigned argprec = TYPE_PRECISION (TREE_TYPE (*argmin));
+  unsigned dirprec = TYPE_PRECISION (dirtype);
+
+  /* The logic here was inspired/lifted from  the CONVERT_EXPR_CODE_P
+     branch in the extract_range_from_unary_expr function in tree-vrp.c.
+  */
Formatting it.  If the comment-close won't fit, then line wrap prior to
the last word in the comment.

Okay.  I didn't remember what the exact rules were.

+
+  if (TREE_CODE (*argmin) == INTEGER_CST
+      && TREE_CODE (*argmax) == INTEGER_CST
+      && (dirprec >= argprec
+      || integer_zerop (int_const_binop (RSHIFT_EXPR,
+                         int_const_binop (MINUS_EXPR,
+                                  *argmax,
+                                  *argmin),
+                         size_int (dirprec)))))
+  {
+    *argmin = force_fit_type (dirtype, wi::to_widest (*argmin), 0,
false);
+    *argmax = force_fit_type (dirtype, wi::to_widest (*argmax), 0,
false);
+    return false;
+  }
So in this case we're not changing the values, we're just converting
them to a equal or wider type, right?  Thus no adjustment (what about a
sign change?  Is that an adjustment?) and we return false per the
function's specifications.

This casts the value to the destination type, so it may result in
different values.  The important postcondition it preserves is that
the difference ARGMAX - ARGMIN is less than or equal to TYPE_MAX of
the directive's unsigned TYPE.  (If it isn't, the range cannot be
converted.)  This lets us take, say:

  int f (int i)
  {
    if (i < 1024 || 1033 < i)
       i = 1024;

    return snprintf (0, 0, "%hhi", i);
  }

and fold the function call into 1 because (signed char)1024 is 0 and
(signed char)1033 is 9, and every other value in that same original
range is also in the same range after the conversion.  We know it's
safe because (1033 - 1024 <= UCHAR_MAX) holds.  But in in this:

  int g (int i)
  {
    if (i < 1024 || 1289 < i)
       i = 1024;

    return snprintf (0, 0, "%hhi", i);
  }

even though (signed char)1024 is 0 and (signed char)1289 is also 9
like above, since (1289 - 1024) is 265 and thus greater than UCHAR_MAX,
folding the result wouldn't be safe (for example, (sighed char)1034
yields 10).

I picture this as a window bounded by the range of the directive's
type sliding within another window bounded by the argument's type:

  [<-----[...dirtype...]--->]

the outer brackets delimit the range of the argument and the inner
ones the directive's.  The smaller window can slide left and right
and it can shrink but it can't be wider that the directive's type's
range.

What about overflows when either argmin or argmax won't fit into dirtype
without an overflow?  What's the right behavior there?

That's fine just as long as the property above holds.

I think the algorithm works but the code came from tree-vrp where
there are all sorts of additional checks for some infinities which
VRP distinguishes from type limits like SCHAR_MIN and _MAX.  I don't
fully understand those and so I'd feel better if someone who does
double checked this code.


+
+  tree dirmin = TYPE_MIN_VALUE (dirtype);
+  tree dirmax = TYPE_MAX_VALUE (dirtype);
From this point onward argmin/argmax are either not integers or we're
doing a narrowing conversion, right?  In both cases the fact that we're
doing a narrowing conversion constrains the range

Argmin and argmax are always integers.  The rest of the function
handles the case where the postcondition above doesn't hold (e.g.,
in function g above).


+
+  if (TYPE_UNSIGNED (dirtype))
+    {
+      *argmin = dirmin;
+      *argmax = dirmax;
+    }
+  else
+    {
+      *argmin = integer_zero_node;
+      *argmax = dirmin;
+    }
Should this be dirmax? Am I missing something here?

It's dirmin because for a signed type, due to the sign, it results
in more bytes on output than dirmin would.  (E.g., with SCHAR_MIN
= -128 and SCHAR_MAX = 127 it's four bytes vs three.)

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]