Warray-bounds-39 vs recent c_strlen changes

Martin Sebor msebor@gmail.com
Mon Aug 26 19:49:00 GMT 2019


On 8/26/19 11:41 AM, Jeff Law wrote:
> I think we need to go back and revisit this hunk:
> 
>>    /* If the offset is known to be out of bounds, warn, and call strlen at
>>       runtime.  */
>>    if (eltoff < 0 || eltoff >= maxelts)
>>      {
>>        /* Suppress multiple warnings for propagated constant strings.  */
>>        if (only_value != 2
>>            && !TREE_NO_WARNING (arg)
>>            && warning_at (loc, OPT_Warray_bounds,
>>                           "offset %qwi outside bounds of constant string",
>>                           eltoff))
>>          {
>>            if (decl)
>>              inform (DECL_SOURCE_LOCATION (decl), "%qE declared here", decl);
>>            TREE_NO_WARNING (arg) = 1;
>>          }
>>        return NULL_TREE;
>>      }
> 
> The problem we have is this is warning inside a utility routine that can
> be called many times.
> 
> ARG is often going to be an ADDR_EXPR that wraps a _DECL node.  So we'll
> end up setting TREE_NO_WARNING on the ADDR_EXPR node.  At first glance
> that seems fine, but it's really not sufficient to avoid duplicate warnings.
> 
> Consider that we might call get_range_strlen  for ARG again later in the
> optimizer pipeline.   That will call the overloaded get_range_strlen
> which has the VISITED/RKIND arguments via:
> 
>> bool
>> get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
>> {
>>    bitmap visited = NULL;
>>
>>    if (!get_range_strlen (arg, &visited, SRK_LENRANGE, pdata, eltsize))
> 
> That overload will call get_range_strlen_tree via:
> 
>> static bool
>> get_range_strlen (tree arg, bitmap *visited,
>>                    strlen_range_kind rkind,
>>                    c_strlen_data *pdata, unsigned eltsize)
>> {
>>
>>    if (TREE_CODE (arg) != SSA_NAME)
>>      return get_range_strlen_tree (arg, visited, rkind, pdata, eltsize);
> 
> get_range_strlen_tree in turn calls c_strlen which may return NULL which
> results in stripping the ADDR_EXPR and calling get_range_strlen again via:
> 
>>   else
>>      {
>>        c_strlen_data lendata = { };
>>        val = c_strlen (arg, 1, &lendata, eltsize);
>>
>>        if (!val && lendata.decl)
>>          {
>>            /* ARG refers to an unterminated const character array.
>>               DATA.DECL with size DATA.LEN.  */
>>            val = lendata.minlen;
>>            pdata->decl = lendata.decl;
>>          }
>>      }
>>
>>    if (!val && rkind == SRK_LENRANGE)
>>      {
>>        if (TREE_CODE (arg) == ADDR_EXPR)
>>          return get_range_strlen (TREE_OPERAND (arg, 0), visited, rkind,
>>                                   pdata, eltsize);
> 
> Note that we've stripped the ADDR_EXPR before the call to
> get_range_strlen here.  So we have no way to know we've already warned
> on this expression and boom it's trivial to get multiple warnings simply
> by asking for the range of a node.
> 
> This is a great example of why warning from folders and other routines
> that may be called more than once (directly or indirectly) is a bad idea.
> 
> 
> I think you need to find another place for this warning, preferably
> outside the folders and their utility routines.

I didn't add the warning, just a test that exercises it.  I don't
think it was being tested before.  AFAICS, the warning itself goes
back to r5379 from 1993.  The patch the bits above are quoted from
just let it detect a few more instances of out-of-bounds offsets
than it did before: in references to flexible members of declared
objects.

I agree that warning from low-level utility functions can lead
to duplicates or false positives.  Other utility functions deal
with it either by adding a flag to request a warning, or some
other parameter to let the caller know of the problem and let
it handle it.

The latter is what string_constant does, and c_strlen does it as
well for unterminated arrays (via the c_strlen_data structure).
One way to avoid the problem you described above is to extend
c_strlen_data to let c_strlen report the out-of-bounds offset
to the caller too.

Martin



More information about the Gcc-patches mailing list