[PATCH] warn on returning alloca and VLA (PR 71924, 90549)

Jeff Law law@redhat.com
Fri May 31 21:19:00 GMT 2019


On 5/22/19 3:34 PM, Martin Sebor wrote:
> -Wreturn-local-addr detects a subset of instances of returning
> the address of a local object from a function but the warning
> doesn't try to handle alloca or VLAs, or some non-trivial cases
> of ordinary automatic variables[1].
> 
> The attached patch extends the implementation of the warning to
> detect those.  It still doesn't detect instances where the address
> is the result of a built-in such strcpy[2].
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> [1] For example, this is only diagnosed with the patch:
> 
>   void* f (int i)
>   {
>     struct S { int a[2]; } s[2];
>     return &s->a[i];
>   }
> 
> [2] The following is not diagnosed even with the patch:
> 
>   void sink (void*);
> 
>   void* f (int i)
>   {
>     char a[6];
>     char *p = __builtin_strcpy (a, "123");
>     sink (p);
>     return p;
>   }
> 
> I would expect detecting to be possible and useful.  Maybe as
> a follow-up.
> 
> gcc-71924.diff
> 
> PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result
> PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset
> 
> gcc/ChangeLog:
> 
> 	PR c/71924
> 	* gimple-ssa-isolate-paths.c (is_addr_local): New function.
> 	(warn_return_addr_local_phi_arg, warn_return_addr_local): Same.
> 	(find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg.
> 	(find_explicit_erroneous_behavior): Call warn_return_addr_local.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR c/71924
> 	* gcc.dg/Wreturn-local-addr-2.c: New test.
> 	* gcc.dg/Walloca-4.c: Prune expected warnings.
> 	* gcc.dg/pr41551.c: Same.
> 	* gcc.dg/pr59523.c: Same.
> 	* gcc.dg/tree-ssa/pr88775-2.c: Same.
> 	* gcc.dg/winline-7.c: Same.
> 
> diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
> index 33fe352bb23..2933ecf502e 100644
> --- a/gcc/gimple-ssa-isolate-paths.c
> +++ b/gcc/gimple-ssa-isolate-paths.c
> @@ -341,6 +341,135 @@ stmt_uses_0_or_null_in_undefined_way (gimple *stmt)
>    return false;
>  }
>  
> +/* Return true if EXPR is a expression of pointer type that refers
> +   to the address of a variable with automatic storage duration.
> +   If so, set *PLOC to the location of the object or the call that
> +   allocated it (for alloca and VLAs).  When PMAYBE is non-null,
> +   also consider PHI statements and set *PMAYBE when some but not
> +   all arguments of such statements refer to local variables, and
> +   to clear it otherwise.  */
> +
> +static bool
> +is_addr_local (tree exp, location_t *ploc, bool *pmaybe = NULL,
> +	       hash_set<gphi *> *visited = NULL)
> +{
> +  if (TREE_CODE (exp) == SSA_NAME)
> +    {
> +      gimple *def_stmt = SSA_NAME_DEF_STMT (exp);
> +      enum gimple_code code = gimple_code (def_stmt);
> +
> +      if (is_gimple_assign (def_stmt))
> +	{
> +	  tree type = TREE_TYPE (gimple_assign_lhs (def_stmt));
> +	  if (POINTER_TYPE_P (type))
> +	    {
> +	      tree ptr = gimple_assign_rhs1 (def_stmt);
> +	      return is_addr_local (ptr, ploc, pmaybe, visited);
> +	    }
> +	  return false;
> +	}
So this is going to recurse on the rhs1 of something like
POINTER_PLUS_EXPR, that's a good thing :-)   But isn't it non-selective
about the codes where we recurse?

Consider

  ptr = (cond) ? res1 : res2

I think we'll end up recursing on the condition rather than looking at
res1 and res2.


I suspect there are a very limited number of expression codes that
appear on the RHS where we'd want to recurse on one or both operands.

POINTER_PLUS_EXPR, NOP_EXPR, maybe COND_EXPR (where you have to recurse
on both and logically and the result), BIT_AND (maybe we masked off some
bits in an address).  That's probably about it :-)

Are there any other codes you've seen or think would be useful in
practice to recurse through?  I'd rather list them explicitly rather
than just recurse down through every rhs1 we encounter.



> +
> +      if (code == GIMPLE_PHI && pmaybe)
> +	{
> +	  unsigned count = 0;
> +	  gphi *phi_stmt = as_a <gphi *> (def_stmt);
> +
> +	  unsigned nargs = gimple_phi_num_args (phi_stmt);
> +	  for (unsigned i = 0; i < nargs; ++i)
> +	    {
> +	      if (!visited->add (phi_stmt))
> +		{
> +		  tree arg = gimple_phi_arg_def (phi_stmt, i);
> +		  if (is_addr_local (arg, ploc, pmaybe, visited))
> +		    ++count;
> +		}
> +	    }
So imagine

p = phi (x1, x1, x2)

Where both x1 and x2 would pass is_addr_local.  I think this code would
incorrectly set pmaybe.

We process the first instance of x1, find it is local and bump count.
When we encounter the second instance, it's in our map and we do
nothing.  THen we process x2 and bump count again.  So count would be 2.
 But count is going to be less than nargs so we'll set *pmaybe to true.

ISTM you'd have to record the result for each phi argument and query
that to determine if you should bump the counter if you've already
visited the argument.

It also seems to me that you can break the loop as soon as you've got a
nonzero count and get a false return from is_addr_local.   Not sure if
that'll matter in practice or not.

The other approach here (and I'm not suggesting you implement it) would
be to use the propagation engine.  That's probably overkill here since
we'd probably end up computing a whole lot of things we don't need.  My
sense is an on-demand approach like you've done is probably less
computationally expensive.

jeff



More information about the Gcc-patches mailing list