This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] fold more string comparison with known result (PR 90879)

On 8/13/19 3:43 PM, Martin Sebor wrote:
> On 8/13/19 2:07 PM, Jeff Law wrote:
>> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>> PR tree-optimization/90879 - fold zero-equality of strcmp between a
>>> longer string and a smaller array
>>> gcc/c-family/ChangeLog:
>>>     PR tree-optimization/90879
>>>     * c.opt (-Wstring-compare): New option.
>>> gcc/testsuite/ChangeLog:
>>>     PR tree-optimization/90879
>>>     * gcc.dg/Wstring-compare-2.c: New test.
>>>     * gcc.dg/Wstring-compare.c: New test.
>>>     * gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>>>     * gcc.dg/strcmpopt_6.c: New test.
>>>     * gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>>>     test cases.
>>>     * gcc.dg/strlenopt-66.c: Run it.
>>>     * gcc.dg/strlenopt-68.c: New test.
>>> gcc/ChangeLog:
>>>     PR tree-optimization/90879
>>>     * builtins.c (check_access): Avoid using maxbound when null.
>>>     * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen
>>> change.
>>>     * doc/invoke.texi (-Wstring-compare): Document new warning option.
>>>     * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>>>     conditional.
>>>     (get_range_strlen): Overwrite initial maxbound when non-null.
>>>     * gimple-ssa-sprintf.c (get_string_length): Adjust to
>>> get_range_strlen
>>>     change.
>>>     * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>>>     (used_only_for_zero_equality): New function.
>>>     (handle_builtin_memcmp): Call it.
>>>     (determine_min_objsize): Return an integer instead of tree.
>>>     (get_len_or_size, strxcmp_eqz_result): New functions.
>>>     (maybe_warn_pointless_strcmp): New function.
>>>     (handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>>>     between a longer string and a smaller array.
>>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>>> index 4af47855e7c..31e012b741b 100644
>>> --- a/gcc/tree-ssa-strlen.c
>>> +++ b/gcc/tree-ssa-strlen.c
>>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>>       type = TYPE_MAIN_VARIANT (type);
>>>   -  /* We cannot determine the size of the array if it's a flexible
>>> array,
>>> -     which is declared at the end of a structure.  */
>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>> -      && !array_at_struct_end_p (dest))
>>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>>> +     for arrays with more than one element, return the size of its
>>> +     type.  GCC itself misuses arrays of both zero and one elements
>>> +     as flexible array members so they are excluded as well.  */
>>> +  if (TREE_CODE (type) != ARRAY_TYPE
>>> +      || !array_at_struct_end_p (dest))
>>>       {
>>> -      tree size_t = TYPE_SIZE_UNIT (type);
>>> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
>>> -      && !integer_zerop (size_t))
>>> -        return size_t;
>>> +      tree type_size = TYPE_SIZE_UNIT (type);
>>> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
>>> +      && !integer_onep (type_size)
>>> +      && !integer_zerop (type_size))
>>> +        return tree_to_uhwi (type_size);
>> So I nearly commented on this when looking at the original patch.  Can
>> we really depend on the size when we've got an array at the end of a
>> struct with a declared size other than 0/1?   While 0/1 are by far the
>> most common way to declare them, couldn't someone have used other sizes?
>>   I think we pondered doing that at one time to cut down on the noise
>> from Coverity for RTL and TREE operand accessors.
>> Your code makes us safer, so I'm not saying you've done anything wrong,
>> just trying to decide if we need to tighten this up even further.
> This patch issues a warning in these cases, i.e., when it sees
> a call like, say, strcmp("foobar", A) with an A that's smaller
> than the string, because it seems they are likely (rare) bugs.
> I haven't seen the warning in any of the projects I tested it
> with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
> The warning uses strcmp to detect these mistakes (or misuses)
> but I'd like to add similar warnings for other string functions
> as well and have code out there that does this on purpose use
> true flexible array members (or the zero-length extension)
> instead.  That makes the intent clear.
> It's a judgment call whether to also fold (or do something else
> like insert a trap) in addition to issuing a warning.  In this
> case (reading) I don't think it matters as much as it does for
> writes.  Either way, it would be nice to set a policy and
> document it in the manual so users know what to expect and
> so we don't have to revisit this question for each patch that
> touches on this subject.
The GCC manual documents zero length arrays at the end of an aggregate
as a GNU extension for variable length objects.  The manual also
documents that it could be done with single element arrays, but that
doing so does contribute to the base size of the aggregate, but
otherwise it's handled like a zero length array.

So both zero and one element arrays are documented as supported for this
use case.  However, I could easily see someone making the case that any
size should work here and I could easily think of cases where that would
be a reasonable thing to do.  We do not handle these cases in a
consistent way -- we'll treat sizes other than 0/1 as being a variable
length object in some cases, but not in others.

I'm tempted to bring consistency here.  We're likely not losing
significant diagnostic opportunities or optimizations if we treat all
trailing arrays as creating potentially variable sized objects.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]