This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)
- From: David Malcolm <dmalcolm at redhat dot com>
- To: Martin Sebor <msebor at gmail dot com>, Gcc Patch List <gcc-patches at gcc dot gnu dot org>
- Cc: Joseph Myers <joseph at codesourcery dot com>, Jeff Law <law at redhat dot com>, Richard Biener <rguenther at suse dot de>, Jakub Jelinek <jakub at redhat dot com>, Bernd Schmidt <bschmidt at redhat dot com>, Manuel López-Ibáñez <lopezibanez at gmail dot com>, Florian Weimer <fweimer at redhat dot com>
- Date: Fri, 16 Sep 2016 14:24:11 -0400
- Subject: Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)
- Authentication-results: sourceware.org; auth=none
- References: <5776B33E.2080504@gmail.com> <alpine.LSU.2.11.1607041255060.29772@t29.fhfr.qr> <577A8D6A.3070902@gmail.com> <alpine.LSU.2.11.1607051208470.29772@t29.fhfr.qr> <578D512F.9050909@gmail.com> <9bb5ad66-4985-8c42-f800-4d84e0e18659@redhat.com> <57A3AFFF.7090109@gmail.com> <f3d09350-de5d-a0e4-8203-affac268ced2@redhat.com> <57AD30E5.3000801@gmail.com> <22a47656-c23c-4840-2e49-a59f4af513b1@redhat.com> <57B725F6.8000405@gmail.com> <110cfc6b-7856-9b51-885f-05402b14fc3e@redhat.com> <57D1B5F0.1030504@gmail.com> <alpine.DEB.2.20.1609082201120.17041@digraph.polyomino.org.uk> <57D60D03.7080601@gmail.com>
On Sun, 2016-09-11 at 20:03 -0600, Martin Sebor wrote:
> On 09/08/2016 04:10 PM, Joseph Myers wrote:
> > On Thu, 8 Sep 2016, Martin Sebor wrote:
> >
> > > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > > index da133a4..4607495 100644
> > > --- a/gcc/doc/tm.texi.in
> > > +++ b/gcc/doc/tm.texi.in
> > > @@ -4081,6 +4081,13 @@ In either case, it remains possible to
> > > select code-generation for the alternate
> > > scheme, by means of compiler command line switches.
> > > @end defmac
> > >
> > > +@deftypefn {Target Hook} {const char *}
> > > TARGET_LIBC_PRINTF_POINTER_FORMAT (tree, const char **@var{flags}
> > > )
> > > +A hook to determine the target @code{printf} implementation
> > > format string
> > > +that the most closely corresponds to the @code{%p} format
> > > directive.
> > > +The object pointed to by the @var{flags} is set to a string
> > > consisting
> > > +of recognized format flags such as the @code{'#'} character.
> > > +@end deftypefn
> >
> > No, the substance of hook documentation should go in target.def
> > with just
> > an @hook line in tm.texi.in leading to the documentation going in
> > tm.texi
> > automatically.
> >
> > You appear to be defining a target macro masquerading as a hook.
> > Please
> > don't (new target macros should be avoided where possible); use a
> > proper
> > hook. (Maybe the settings depending on OS rather than architecture
> > means
> > it needs to be one of those whose default is a manual setting in
> > target-def.h rather than automatically generated, but that should
> > be the
> > limit of deviation from the normal workings of hooks.)
> >
> > > + const char *pfmt = TARGET_LIBC_PRINTF_POINTER_FORMAT (arg,
> > > &flags);
> >
> > With a proper hook them you'd call
> > targetm.libc_printf_pointer_format.
> >
> > > + inform (callloc,
> > > + (nbytes + exact == 1
> > > + ? "format output %wu byte into a destination of
> > > size %wu"
> > > + : "format output %wu bytes into a destination
> > > of size %wu"),
> > > + nbytes + exact, info.objsize);
> >
> > You need to use G_() around both format strings in such a case;
> > xgettext
> > doesn't know how to extract them both.
>
> Attached is revision 8 of the patch (hopefully) adjusted as
> requested above, and with a simple test with of the multiline
> diagnostic directives suggested by David. This revision also
> enables the -fprintf-return-value option by default. The libgomp
> test failures I was seeing in my earlier testing must have been
> caused by an older version of GMP or MPFR that I had inadvertently
> use (normally I use in-tree versions downloaded and expanded there
> by the download_prerequisites script but that time I forgot that
> step).
>
> David, in the way of feedback, I spent hours debugging the simple
> multiline test I added. It seems that DejaGnu is exquisitely
> sensitive to whitespaces in the multiline output. I appreciate
> that spacing is essential to lining up the caret and the tildes
> with the text of the program but tests fail even when all the
> expected output is lined up right in the multiline directive but
> off by a space (or tab) with respect to the actual output. That
> DejaGnu output doesn't make this very clear at all makes debugging
> these types of issues a pain. I wonder if there's a better to do
> this.
Gah; I'm sorry this was such a pain to do.
I tend to just copy&paste the stuff I want to quote directly from
Emacs's compilation buffer.
However a nit, which I think is why you had problems...
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
new file mode 100644
index 0000000..a601ba4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-Wformat -Wformat-length=1 -fdiagnostics-show-caret -ftrack-macro-expansion=0" } */
+
+extern int sprintf (char*, const char*, ...);
+
+char dst [8];
+
+void test (void)
+{
+ sprintf (dst + 7, "%-s", "1");
+ /* { dg-warning "writing a terminating nul past the end of the destination" "" { target *-*-*-* } 10 }
+ { dg-message "format output 2 bytes into a destination of size 1" "" { target *-*-*-* } 10 }
+ { dg-begin-multiline-output "" }
+ sprintf (dst + 7, "%-s", "1");
+ ^~~~~
+ sprintf (dst + 7, "%-s", "1");
+ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ { dg-end-multiline-output "" } */
+
+ sprintf (dst + 7, "%-s", "abcd");
+ /* { dg-warning ".%-s. directive writing 4 bytes into a region of size 1" "" { target *-*-*-* } 20 }
+ { dg-message "format output 5 bytes into a destination of size 1" "" { target *-*-*-* } 20 }
+ { dg-begin-multiline-output "" }
+ sprintf (dst + 7, "%-s", "abcd");
+ ^~~ ~~~~~~
+ sprintf (dst + 7, "%-s", "abcd");
+ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ { dg-end-multiline-output "" } */
+}
You have two pairs of dg-begin/end-multiline-output, but I think it's
better to have four; use a single begin/end pair for each call to
diagnostic_show_locus. Hopefully doing that ought to make things a bit
less sensitive to whitespace, and easier to debug: each begin/end can
be handled by itself, whereas by combining them it's harder for
multiline.exp to detect them. If combined, they can only be detected
if all of the dg-warning/dg-messages work (and are thus pruned by
prune.exp), if any aren't pruned, the multiline outputs will also fail.
Maybe this exacerbated the problem?
So something like this, grouping the 4 diagnostics together with their
diagnostic_show_locus output by opening and closing comments (line
numbers will need adjusting):
+void test (void)
+{
+ sprintf (dst + 7, "%-s", "1");
+ /* { dg-warning
"writing a terminating nul past the end of the destination" "" { target
*-*-*-* } 10 }
+ { dg-begin-multiline-output "" }
+ sprintf (dst +
7, "%-s", "1");
+ ^~~~~
+ { dg-end-multiline
-output "" } */
+ /* { dg-message "format output 2 bytes into a
destination of size 1" "" { target *-*-*-* } 10 }
+ { dg-begin
-multiline-output "" }
+ sprintf (dst + 7, "%-s", "1");
+
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ { dg-end-multiline-output "" } */
+
+
sprintf (dst + 7, "%-s", "abcd");
+ /* { dg-warning ".%-s. directive
writing 4 bytes into a region of size 1" "" { target *-*-*-* } 20 }
+
{ dg-begin-multiline-output "" }
+ sprintf (dst + 7, "%-s", "abcd");
+ ^~~ ~~~~~~
+ { dg-end-multiline-output "" }
*/
+ /* { dg-message "format output 5 bytes into a destination of size
1" "" { target *-*-*-* } 20 }
+ { dg-begin-multiline-output "" }
+
sprintf (dst + 7, "%-s", "abcd");
+ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
{ dg-end-multiline-output "" } */
+}
Another nit, if I may: FWIW I'm not in love with the wording of the messages. Sorry to bikeshed, but how about:
warning: buffer overflow will occur when writing terminating NUL
and:
note: formatted output of 2 bytes into a destination of size 1
or somesuch.