[PATCH] Fix PR/8268: implement compile time array subscript checking
Richard Guenther
richard.guenther@gmail.com
Tue Dec 26 21:37:00 GMT 2006
On 12/21/06, Dirk Mueller <dmuell@gmx.net> wrote:
> On Tuesday, 12. December 2006 13:51, Richard Guenther wrote:
>
> > > Ok? Do we need the extra -Warray-bounds?
> > Yes, we need the extra -Warray-bounds for languages that don't include it
> > in -Wall and to turn the warning off.
>
> I don't really think it is a good idea to turn off the warning ;) The main
> reason I'm asking is because we'd need an extra flag for "maybe it overflows
> the array here" kind of warnings, as in my case I'd imagine that I want to
> Werror-out on the "always overflows" and only warn on the "maybe overflows"
> case.
I see. We still should keep the flag.
> > > NULL_TREE) + /* Accesses after the end of arrays of size 0 (gcc
> > > + extension) and 1 are likely intentional ("struct
> > > + hack"). */
> > > + || compare_tree_int (ub, 1) <= 0)
> > > + return;
> > The "struct hack" needs to include all arrays that are the last member
> > in a structure.
> > See tree-dfa.c:get_ref_base_and_extent - you can feed it the array_ref
> > and check if
> > *pmax_size is -1 (unbound or unknown). This function should also deal with
> > flexible arrays (i.e. report them as unknown extent).
>
> Hmm, it does that when it can determine that the array subscript is accessing
> the array beyond a memory location that is inside the struct the array ref
> was declared. It doesn't make sense to disable an array overflow warning for
> the case that it overflows an array more than the size of the struct, does
> it? There might be many legal cases suppressed this way. Do you know any real
> life code that would trigger false positives here? Or do you only want the
> struct hack to be enabled if the array is at the end of a structure?
Yes, I only want the struct hack to be enabled if the array is at the
end of a structure.
You can find all of struct s { ... int c[]; }, { ... int c[0]; }, {
... int c[1]; } and even
{ ... char c[4]; } (use all padding explicitly). The standard allows
treating all those
as flexible array members if they are last in the structure.
> > > + ru = TREE_CODE (us) == INTEGER_CST ? tree_int_cst_compare (ub, us) :
> > > -2; + rl = TREE_CODE (ls) == INTEGER_CST ? tree_int_cst_compare (ls, lb)
> > > : -2;
> > It looks like you can avoid adding 1 to ub by adjusting the checks
> > below to ru <= 0.
> > > + else if (ru == -1 || (!ignore_off_by_one && ru == 0))
> > > + warning (OPT_Warray_bounds, "%Harray subscript is above array
>
> No, because of the case quoted above. If the ARRAY_REF is inside an ADDR_EXPR,
> taking the address of the first element after the end of the array is
> welldefined.
Yes, but I was refering to the tree building (now you're using
int_const_binop which
also creates a new tree - at least now less frequently). Looking
again there doesn't
seem a way to avoid it.
> if you're saying that int_const_binop is computationally more expensive than
> two tree_int_cst_compare calls.. then I see your point. Ok, I've done some
> profiling and the change below seems to be in your spirit, and is about
> factor 2.5 faster. The total runtime overhead for insn-recog.c, which has more
> than 30000 array subscripts, is now about 0,1%. Much of the speedup however
> comes from compare_tree_int() which I had to change to be cheaper.
>
> Bootstrapped, regtested against trunk on i686-suse-linux.
>
> Thanks,
> Dirk
>
> 2006-12-21 Dirk Mueller <dmueller@suse.de>
> Richard Guenther <rguenther@suse.de>
>
> PR diagnostic/8268
> * doc/invoke.texi (Warray-bounds): Document -Warray-bounds.
> * common.opt (Warray-bounds): Add new warning option.
> * c-opts.c (c_common_handle_option): Define -Warray-bounds
> if -Wall is given.
> * tree-vrp.c (vrp_finalize): Call check_array_refs if -Warray-bounds
> is enabled.
> * tree.c (simple_cst_equal): Manually inline tree_int_cst_sgn.
You changed compare_tree_int, not simple_cst_equal.
Ok with that change.
Thanks,
Richard.
> (check_array_refs, check_array_bounds, check_array_ref): New.
>
> * testsuite/gcc.dg/Warray-bounds.c: New testcase.
>
>
> --- doc/invoke.texi (revision 119972)
> +++ doc/invoke.texi (working copy)
> @@ -245,7 +245,7 @@ Objective-C and Objective-C++ Dialects}.
> -Wreturn-type -Wsequence-point -Wshadow @gol
> -Wsign-compare -Wstack-protector @gol
> -Wstrict-aliasing -Wstrict-aliasing=2 @gol
> --Wstring-literal-comparison @gol
> +-Wstring-literal-comparison -Warray-bounds @gol
> -Wswitch -Wswitch-default -Wswitch-enum @gol
> -Wsystem-headers -Wtrigraphs -Wundef -Wuninitialized @gol
> -Wunknown-pragmas -Wno-pragmas -Wunreachable-code @gol
> @@ -2831,6 +2831,12 @@ compiler is using for optimization. Thi
> @option{-Wstrict-aliasing}, but it will also give a warning for some
> ambiguous
> cases that are safe.
>
> +@item -Warray-bounds
> +@opindex Warray-bounds
> +This option is only active when @option{-ftree-vrp} is active
> +(default for -O2 and above). It warns about subscripts to arrays
> +that are always out of bounds.
> +
> @item -Wall
> @opindex Wall
> All of the above @samp{-W} options combined. This enables all the
> --- common.opt (revision 119972)
> +++ common.opt (working copy)
> @@ -61,6 +61,10 @@ Walways-true
> Common Var(warn_always_true)
> Warn about comparisons that always evaluate to true
>
> +Warray-bounds
> +Common Var(warn_array_bounds)
> +Warn if an array is accessed out of bounds
> +
> Wattributes
> Common Var(warn_attributes) Init(1)
> Warn about inappropriate attribute usage
> --- tree.c (revision 119972)
> +++ tree.c (working copy)
> @@ -5003,7 +5003,7 @@ simple_cst_equal (tree t1, tree t2)
> int
> compare_tree_int (tree t, unsigned HOST_WIDE_INT u)
> {
> - if (tree_int_cst_sgn (t) < 0)
> + if (!TYPE_UNSIGNED (TREE_TYPE (t)) && TREE_INT_CST_HIGH (t) < 0)
> return -1;
> else if (TREE_INT_CST_HIGH (t) != 0)
> return 1;
> --- tree-vrp.c (revision 119972)
> +++ tree-vrp.c (working copy)
> @@ -32,6 +32,7 @@ Boston, MA 02110-1301, USA. */
> #include "tree-dump.h"
> #include "timevar.h"
> #include "diagnostic.h"
> +#include "toplev.h"
> #include "cfgloop.h"
> #include "tree-scalar-evolution.h"
> #include "tree-ssa-propagate.h"
> @@ -3420,6 +3420,142 @@ insert_range_assertions (void)
> BITMAP_FREE (need_assert_for);
> }
>
> +/* Checks one ARRAY_REF in REF, located at LOCUS. Ignores flexible arrays
> + and "struct" hacks. If VRP can determine that the
> + array subscript is a contant, check if it is outside valid
> + range. If the array subscript is a RANGE, warn if it is
> + non-overlapping with valid range.
> + IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR. */
> +
> +static void
> +check_array_ref (tree ref, location_t* locus, bool ignore_off_by_one)
> +{
> + value_range_t* vr = NULL;
> + tree low_sub, up_sub;
> + tree low_bound, up_bound = array_ref_up_bound(ref);
> +
> + low_sub = up_sub = TREE_OPERAND (ref, 1);
> +
> + if (!up_bound || !locus || TREE_NO_WARNING (ref)
> + || TREE_CODE (up_bound) != INTEGER_CST
> + /* Can not check flexible arrays. */
> + || (TYPE_SIZE (TREE_TYPE (ref)) == NULL_TREE
> + && TYPE_DOMAIN (TREE_TYPE (ref)) != NULL_TREE
> + && TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (ref))) == NULL_TREE)
> + /* Accesses after the end of arrays of size 0 (gcc
> + extension) and 1 are likely intentional ("struct
> + hack"). */
> + || compare_tree_int (up_bound, 1) <= 0)
> + return;
> +
> + low_bound = array_ref_low_bound(ref);
> +
> + if (TREE_CODE (low_sub) == SSA_NAME)
> + {
> + vr = get_value_range (low_sub);
> + if (vr->type == VR_RANGE || vr->type == VR_ANTI_RANGE)
> + {
> + low_sub = vr->type == VR_RANGE ? vr->max : vr->min;
> + up_sub = vr->type == VR_RANGE ? vr->min : vr->max;
> + }
> + }
> +
> + TREE_NO_WARNING (ref) = 1;
> +
> + if (vr && vr->type == VR_ANTI_RANGE)
> + {
> + if (TREE_CODE (up_sub) == INTEGER_CST
> + && tree_int_cst_lt (up_bound, up_sub)
> + && TREE_CODE (low_sub) == INTEGER_CST
> + && tree_int_cst_lt (low_sub, low_bound))
> + warning (OPT_Warray_bounds,
> + "%Harray subscript is outside array bounds", locus);
> + else
> + TREE_NO_WARNING (ref) = 0;
> + }
> + else if (TREE_CODE (up_sub) == INTEGER_CST
> + && tree_int_cst_lt (up_bound, up_sub)
> + && !tree_int_cst_equal (up_bound, up_sub)
> + && (!ignore_off_by_one
> + || !tree_int_cst_equal (int_const_binop (PLUS_EXPR,
> + up_bound,
> + integer_one_node,
> + 0),
> + up_sub)))
> + warning (OPT_Warray_bounds, "%Harray subscript is above array bounds",
> + locus);
> + else if (TREE_CODE (low_sub) == INTEGER_CST
> + && tree_int_cst_lt (low_sub, low_bound))
> + warning (OPT_Warray_bounds, "%Harray subscript is below array bounds",
> + locus);
> + else
> + TREE_NO_WARNING (ref) = 0;
> +}
> +
> +static bool array_bounds_already_done;
> +
> +/* walk_tree() callback that checks if *TP is
> + an ARRAY_REF inside an ADDR_EXPR (in which an array
> + subscript one outside the valid range is allowed). Call
> + check_array_ref for each ARRAY_REF found. The location is
> + passed in DATA. */
> +
> +static tree
> +check_array_bounds (tree *tp, int *walk_subtree, void *data)
> +{
> + tree t = *tp;
> +
> + *walk_subtree = TRUE;
> +
> + if (TREE_CODE (t) == ARRAY_REF)
> + check_array_ref (t, (location_t*) data, false /*ignore_off_by_one*/);
> + else if (TREE_CODE (t) == ADDR_EXPR)
> + {
> + t = TREE_OPERAND (t, 0);
> + while (!array_bounds_already_done && handled_component_p (t))
> + {
> + if (TREE_CODE (t) == ARRAY_REF)
> + check_array_ref (t, (location_t*) data,
> true /*ignore_off_by_one*/);
> + t = TREE_OPERAND (t, 0);
> + }
> + *walk_subtree = FALSE;
> + }
> +
> + return NULL_TREE;
> +}
> +
> +/* Walk over all statements of all reachable BBs and call check_array_bounds
> + on them. */
> +
> +static void
> +check_all_array_refs (void)
> +{
> + basic_block bb;
> + block_stmt_iterator si;
> +
> + FOR_EACH_BB (bb)
> + {
> + /* Skip bb's that are clearly unreachable. */
> + if (single_pred_p (bb))
> + {
> + basic_block pred_bb = EDGE_PRED (bb, 0)->src;
> + tree ls = NULL_TREE;
> +
> + if (!bsi_end_p (bsi_last (pred_bb)))
> + ls = bsi_stmt (bsi_last (pred_bb));
> +
> + if (ls && TREE_CODE (ls) == COND_EXPR
> + && ((COND_EXPR_COND (ls) == boolean_false_node
> + && (EDGE_PRED (bb, 0)->flags & EDGE_TRUE_VALUE))
> + || (COND_EXPR_COND (ls) == boolean_true_node
> + && (EDGE_PRED (bb, 0)->flags & EDGE_FALSE_VALUE))))
> + continue;
> + }
> + for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
> + walk_tree (bsi_stmt_ptr (si), check_array_bounds,
> + EXPR_LOCUS (*bsi_stmt_ptr (si)), NULL);
> + }
> +}
>
> /* Convert range assertion expressions into the implied copies and
> copy propagate away the copies. Doing the trivial copy propagation
> @@ -4733,6 +4869,19 @@ vrp_finalize (void)
>
> substitute_and_fold (single_val_range, true);
>
> + if (warn_array_bounds)
> + check_all_array_refs();
> +
> + /* workaround for PR/26726. The problem here is that -fivopts
> + sometimes shifts offsets so that arrays are accessed apparently
> + out of bounds, but actually are not. therefore we do not warn
> + about ARRAY_REF's inside ADDR_EXPR's anymore after the first run
> + (which is before ivopts).
> + */
> +
> + array_bounds_already_done = true;
> +
> +
> /* We must identify jump threading opportunities before we release
> the datastructures built by VRP. */
> identify_jump_threads ();
> Index: testsuite/gcc.dg/Warray-bounds.c
> ===================================================================
> --- testsuite/gcc.dg/Warray-bounds.c (revision 0)
> +++ testsuite/gcc.dg/Warray-bounds.c (revision 0)
> @@ -0,0 +1,92 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Warray-bounds" } */
> +
> +int a[10];
> +
> +static inline int n(void) {
> + __SIZE_TYPE__ strlen(const char *s);
> + return strlen("12345");
> +}
> +
> +void g(int *p);
> +void h(int p);
> +
> +int* f(void) {
> + int b[10];
> + int i;
> + struct {
> + int c[10];
> + } c;
> +
> + a[-1] = 0; /* { dg-warning "array subscript" } */
> + a[ 0] = 0;
> + a[ 1] = 0;
> +
> +
> + a[ 9] = 0;
> + a[10] = 0; /* { dg-warning "array subscript" } */
> + a[11] = 0; /* { dg-warning "array subscript" } */
> + a[2 * n() - 11] = 0; /* { dg-warning "array subscript" } */
> + a[2 * n() - 10] = 0;
> + a[2 * n() - 1] = 0;
> + a[2 * n() - 0] = 0; /* { dg-warning "array subscript" } */
> +
> + b[-1] = 0; /* { dg-warning "array subscript" } */
> + b[ 0] = 0;
> + b[ 1] = 0;
> + b[ 9] = 0;
> + b[10] = 0; /* { dg-warning "array subscript" } */
> + b[11] = 0; /* { dg-warning "array subscript" } */
> + b[2 * n() - 11] = 0; /* { dg-warning "array subscript" } */
> + b[2 * n() - 10] = 0;
> + b[2 * n() - 1] = 0;
> + b[2 * n() - 0] = 0; /* { dg-warning "array subscript" } */
> +
> + c.c[-1] = 0; /* { dg-warning "array subscript" } */
> + c.c[ 0] = 0;
> + c.c[ 1] = 0;
> + c.c[ 9] = 0;
> + c.c[10] = 0; /* { dg-warning "array subscript" } */
> + c.c[11] = 0; /* { dg-warning "array subscript" } */
> + c.c[2 * n() - 11] = 0; /* { dg-warning "array subscript" } */
> + c.c[2 * n() - 10] = 0;
> + c.c[2 * n() - 1] = 0;
> + c.c[2 * n() - 0] = 0; /* { dg-warning "array subscript" } */
> +
> + g(&a[8]);
> + g(&a[9]);
> + g(&a[10]);
> + g(&a[11]); /* { dg-warning "array subscript" } */
> +
> + g(&b[10]);
> + g(&c.c[10]);
> + g(&a[11]); /* { dg-warning "array subscript" } */
> + g(&b[11]); /* { dg-warning "array subscript" } */
> + g(&c.c[11]); /* { dg-warning "array subscript" } */
> +
> + g(&a[0]);
> + g(&b[0]);
> + g(&c.c[0]);
> +
> + g(&a[-1]); /* { dg-warning "array subscript" } */
> + g(&b[-1]); /* { dg-warning "array subscript" } */
> + h(sizeof a[-1]);
> + h(sizeof a[10]);
> + h(sizeof b[-1]);
> + h(sizeof b[10]);
> + h(sizeof c.c[-1]);
> + h(sizeof c.c[10]);
> +
> + if (10 < 10)
> + a[10] = 0;
> + if (10 < 10)
> + b[10] = 0;
> + if (-1 >= 0)
> + c.c[-1] = 0;
> +
> + for (i = 20; i < 30; ++i)
> + a[i] = 1; /* { dg-warning "array subscript" } */
> +
> + return a;
> +}
> +
>
>
More information about the Gcc-patches
mailing list