This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

On 07/18/2016 03:59 PM, Martin Sebor wrote:

Unfortunately, linking with C_COMMON_OBJS isn't enough and linking
with C_OBJS doesn't work because of multiple definitions for symbols
like gt_ggc_mx_language_function.  I don't know this part of of GCC
but it seems that each front end gets a set of these generated
functions, some with the same names.  I spent a couple of hours
trying to get it to work but eventually gave up.
Correct. gt_ggc_mx_* are generated functions to support the garbage collection system. In this specific case, I think it's the GC bits for struct language_function which is specific to each front-end. So each front-end is going to have that function hence the multiple definitions.

The only option left to make it work with LTO is to extract the
checker from c-format.c and move it where it gets linked with lto1.
I think you're right.

To that end, the attached patch adds the checker under its own new
pass.  The pass runs very early without optimization, and relatively
late with it to benefit from the passes above.  With optimization
the pass also can (but doesn't yet) fold the return value from these
functions into a constant.  In the Linux kernel, it folds 88 snprintf
calls (I didn't find any instances where the whole call could be
folded into a string constant).
So I haven't dug into the patch yet, but what (if any) interaction is there between the early and late passes? Do they both emit warnings, or is it more that the early passe builds a candidate set which is refined by the later pass, or something totally different?

I tested the patch with LTO but due to bug 71907 no warnings are
issued.  Once the bug is resolved I'll re-test it and see about
adding test cases to verify it.
Note Richi checked in a patch a couple days ago to stream the abstract origins, which is supposed to fix 71907.

In the updated patch I tried to address the majority of everyone's
comments, including those I got from the bug submitter.  I also
made a number of enhancements based on the results I saw with the
Linux kernel and other code (e.g., support for %p, additional
heuristics to improve %s coverage, and support for POSIX numbered

Once a patch along these lines is approved and committed, I'd like
to enhance it by adding one or more of the following:

 *  -fdump-sprintf-length option to have the pass dump details
    about opportunities to fold expressions as well as instances
    where the checker was unable to check a call because of lack
    of object size or argument value or range information.
That'll definitely be useful as we've discussed in the past.

 *  Support for the return value folding (I have implemented and
    lightly tested thid but I would prefer to treat it it as
    a separate enhancement independent of this one).
Seems reasonable to me.

 *  If/when David's patch for on-demand locations within string
    literals is accepted and committed
    replace the location handling code I copied from c-format.c
    with the new API.
Also sounds good.

 *  Enhance the heuristics usesd to find likely %s argument
    lengths to improve the checker's coverage.

 *  Add support for %n and perhaps other functions (e.g., scanf).
Also good stuff.

Onward to looking at the patch :-)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]