This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)
On 07/18/2016 03:59 PM, Martin Sebor wrote:
Correct. gt_ggc_mx_* are generated functions to support the garbage
collection system. In this specific case, I think it's the GC bits for
struct language_function which is specific to each front-end. So each
front-end is going to have that function hence the multiple definitions.
Unfortunately, linking with C_COMMON_OBJS isn't enough and linking
with C_OBJS doesn't work because of multiple definitions for symbols
like gt_ggc_mx_language_function. I don't know this part of of GCC
but it seems that each front end gets a set of these generated
functions, some with the same names. I spent a couple of hours
trying to get it to work but eventually gave up.
The only option left to make it work with LTO is to extract the
checker from c-format.c and move it where it gets linked with lto1.
I think you're right.
So I haven't dug into the patch yet, but what (if any) interaction is
there between the early and late passes? Do they both emit warnings, or
is it more that the early passe builds a candidate set which is refined
by the later pass, or something totally different?
To that end, the attached patch adds the checker under its own new
pass. The pass runs very early without optimization, and relatively
late with it to benefit from the passes above. With optimization
the pass also can (but doesn't yet) fold the return value from these
functions into a constant. In the Linux kernel, it folds 88 snprintf
calls (I didn't find any instances where the whole call could be
folded into a string constant).
Note Richi checked in a patch a couple days ago to stream the abstract
origins, which is supposed to fix 71907.
I tested the patch with LTO but due to bug 71907 no warnings are
issued. Once the bug is resolved I'll re-test it and see about
adding test cases to verify it.
In the updated patch I tried to address the majority of everyone's
comments, including those I got from the bug submitter. I also
made a number of enhancements based on the results I saw with the
Linux kernel and other code (e.g., support for %p, additional
heuristics to improve %s coverage, and support for POSIX numbered
Once a patch along these lines is approved and committed, I'd like
to enhance it by adding one or more of the following:
* -fdump-sprintf-length option to have the pass dump details
about opportunities to fold expressions as well as instances
where the checker was unable to check a call because of lack
of object size or argument value or range information.
That'll definitely be useful as we've discussed in the past.
* Support for the return value folding (I have implemented and
lightly tested thid but I would prefer to treat it it as
a separate enhancement independent of this one).
Seems reasonable to me.
* If/when David's patch for on-demand locations within string
literals is accepted and committed
replace the location handling code I copied from c-format.c
with the new API.
Also sounds good.
* Enhance the heuristics usesd to find likely %s argument
lengths to improve the checker's coverage.
* Add support for %n and perhaps other functions (e.g., scanf).
Also good stuff.
Onward to looking at the patch :-)