This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] enhance -Wrestrict for sprintf %s arguments
- From: Jeff Law <law at redhat dot com>
- To: Martin Sebor <msebor at gmail dot com>, Gcc Patch List <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 26 Jul 2017 13:30:09 -0600
- Subject: Re: [PATCH] enhance -Wrestrict for sprintf %s arguments
- Authentication-results: sourceware.org; auth=none
- Authentication-results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
- Authentication-results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=law at redhat dot com
- Dmarc-filter: OpenDMARC Filter v1.3.2 mx1.redhat.com B083AE313D
- References: <f028c9d0-d24b-e9e6-bdd1-48fe65c641b3@gmail.com> <9c761e65-d8e4-1c99-a795-2934d4922e33@redhat.com> <28e232a7-fcfc-6c68-b42c-4c6741e84ede@gmail.com>
On 07/19/2017 10:10 AM, Martin Sebor wrote:
> On 07/19/2017 12:42 AM, Jeff Law wrote:
>> On 07/02/2017 02:00 PM, Martin Sebor wrote:
>>> The attached patch enhances the -Wrestrict warning to detect more
>>> than just trivial instances of overlapping copying by sprintf and
>>> related functions.
>>>
>>> The meat of the patch is relatively simple but because it introduces
>>> dependencies between existing classes in the sprintf pass I had to
>>> move the class definitions around. That makes the changes look more
>>> extensive than they really are.
>>>
>>> The enhancement works by first determining the base object (or
>>> pointer) from the destination of the sprintf call, the constant
>>> offset into the object, and its size. For each %s argument, it
>>> then computes same information. If it determines that overlap
>>> between the two is possible it stores the information for the
>>> directive argument (including the size of the argument) for later
>>> processing. After the whole call/format string has been processed,
>>> the patch then iterates over the stored directives and their
>>> arguments and compares the size/length of the argument against
>>> the function's overall output. If they overlap it issues
>>> a warning.
>>>
>>> Tested on x86_64-linux.
>>>
>>> -Wrestrict is not currently included in either -Wextra or -Wall
>>> and this patch doesn't change it even though there have been
>>> requests to add it to one of these two options. I'd like to do
>>> that as a separate step.
>> Yea, I think separate step is wise.
>>
>>>
>>> As the next step I'd also like to extend a limited form of the
>>> -Wrestrict enhancement to other restrict-qualified built-ins (like
>>> strcpy), and ultimately also to user-defined functions that make
>>> use of restrict. I think this might perhaps best be done in
>>> a separate pass where the computed pointer information can be
>>> cached to avoid recomputing it for each call, but I don't expect
>>> to be able to have the new pass (or whatever form the enhancement
>>> might end up taking) to also handle sprintf (at least not with
>>> the same accuracy it does now) because the sprintf data for each
>>> format directive are not available outside the sprintf pass.
>> Seems reasonable. Actual implementation will tell us for sure :-)
>>
>>
>>>
>>> Martin
>>>
>>> gcc-35503.diff
>>>
>>>
>>> PR tree-optimization/35503 - Warning about restricted pointers?
>>>
>>> gcc/c-family/ChangeLog:
>>>
>>> PR tree-optimization/35503
>>> * gcc/c-family/c-common.c (check_function_restrict): Avoid
>>> diagnosing
>>> sprintf et al. unless both -Wformat-overflow and -Wformat-truncation
>>> are disabled.
>>>
>>> gcc/ChangeLog:
>>>
>>> PR tree-optimization/35503
>>> * gimple-ssa-sprintf.c (format_result::alias_info): New struct.
>>> (directive::argno): New member.
>>> (format_result::aliases, format_result::alias_count): New data
>>> members.
>>> (format_result::append_alias): New member function.
>>> (fmtresult::dst_offset): New data member.
>>> (pass_sprintf_length::call_info::dst_origin): New data member.
>>> (pass_sprintf_length::call_info::dst_field, dst_offset): Same.
>>> (char_type_p, array_elt_at_offset, field_at_offset): New functions.
>>> (get_origin_and_offset): Same.
>>> (format_string): Call it.
>>> (format_directive): Call append_alias and set directive argument
>>> number.
>>> (pass_sprintf_length::compute_format_length): Diagnose arguments
>>> that overlap the destination buffer.
>>> (pass_sprintf_length::handle_gimple_call): Initialize new members.
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR tree-optimization/35503
>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-19.c: New test.
>> I'm OK with the general concept of enhancing the warning. The big
>> question I have is whether or not we'd be better off using the alias
>> oracle here rather than what appears to be rolling our own data
>> structures and analysis routines to describe memory objects and their
>> potential alias relationship.
>>
>> See tree-ssa-alias.h. In particular you're looking for ao_ref. You may
>> also be intersted in the points-to solutions. Would using that
>> infrastructure make sense?
>
> This patch make only limited use of the alias analysis which
> is in the current form overly broad (i.e., it answers the "may
> alias" question, not the "must alias" one). That makes it
> suitable for throttling optimization but not so much to trigger
> warnings. The other challenge is that the information the
> sprintf pass needs to detect overlaps is being computed in
> stages as each directive is being processed, and then again
> when the whole format string has been processed and the size
> of the full output is known.
I thought we had a must-alias query as well. Though it may not be
properly named :-) Hmm, perhaps not, DSE uses stmt_kills_ref_p which
seems to do most of the necessary checks inline.
>
> That said, it certainly makes sense to use the alias analysis
> to its full potential when possible. In the next step (posted
> for review a few days ago:
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00917.html)
On the list :-)
>
> [*] since I submitted the broader -Wrestrict patch I have found
> ways to further improve it too so I'll be submitting an updated
> version of it. I expect those improvements might also benefit
> the printf work. So in my mind this is evolving into a series
> of patches, each making a few incremental improvements over the
> last. Does this approach make sense to you?
Yea, let's get the basic stuff in and working and iterate on the
improvements.
>
> Martin
>
> PS this work has presented a little bit of a chicken and egg
> problem for me. I wrote this code with very little experience
> with the alias oracle in GCC, partly with the goal of learning
> more about it. I feel it definitely served that purpose and
> let me more easily implement the rest of the -Wrestrict
> enhancements and take better advantage of the existing
> machinery (and even enhance it where it falls short). With
> that, I think improving on this approach should be possible.
> In fact, I expect it to be possible to use these enhancement
> to improve other diagnostics besides -Wrestrict.
Excellent!
jeff