This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)


On Sun, 2016-09-11 at 20:03 -0600, Martin Sebor wrote:
> On 09/08/2016 04:10 PM, Joseph Myers wrote:
> > On Thu, 8 Sep 2016, Martin Sebor wrote:
> > 
> > > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > > index da133a4..4607495 100644
> > > --- a/gcc/doc/tm.texi.in
> > > +++ b/gcc/doc/tm.texi.in
> > > @@ -4081,6 +4081,13 @@ In either case, it remains possible to
> > > select code-generation for the alternate
> > >   scheme, by means of compiler command line switches.
> > >   @end defmac
> > > 
> > > +@deftypefn {Target Hook} {const char *}
> > > TARGET_LIBC_PRINTF_POINTER_FORMAT (tree, const char **@var{flags}
> > > )
> > > +A hook to determine the target @code{printf} implementation
> > > format string
> > > +that the most closely corresponds to the @code{%p} format
> > > directive.
> > > +The object pointed to by the @var{flags} is set to a string
> > > consisting
> > > +of recognized format flags such as the @code{'#'} character.
> > > +@end deftypefn
> > 
> > No, the substance of hook documentation should go in target.def
> > with just
> > an @hook line in tm.texi.in leading to the documentation going in
> > tm.texi
> > automatically.
> > 
> > You appear to be defining a target macro masquerading as a hook. 
> >  Please
> > don't (new target macros should be avoided where possible); use a
> > proper
> > hook.  (Maybe the settings depending on OS rather than architecture
> > means
> > it needs to be one of those whose default is a manual setting in
> > target-def.h rather than automatically generated, but that should
> > be the
> > limit of deviation from the normal workings of hooks.)
> > 
> > > +  const char *pfmt = TARGET_LIBC_PRINTF_POINTER_FORMAT (arg,
> > > &flags);
> > 
> > With a proper hook them you'd call
> > targetm.libc_printf_pointer_format.
> > 
> > > +	inform (callloc,
> > > +		(nbytes + exact == 1
> > > +		 ? "format output %wu byte into a destination of
> > > size %wu"
> > > +		 : "format output %wu bytes into a destination
> > > of size %wu"),
> > > +		nbytes + exact, info.objsize);
> > 
> > You need to use G_() around both format strings in such a case;
> > xgettext
> > doesn't know how to extract them both.
> 
> Attached is revision 8 of the patch (hopefully) adjusted as
> requested above, and with a simple test with of the multiline
> diagnostic directives suggested by David.  This revision also
> enables the -fprintf-return-value option by default.  The libgomp
> test failures I was seeing in my earlier testing must have been
> caused by an older version of GMP or MPFR that I had inadvertently
> use (normally I use in-tree versions downloaded and expanded there
> by the download_prerequisites script but that time I forgot that
> step).
> 
> David, in the way of feedback, I spent hours debugging the simple
> multiline test I added.  It seems that DejaGnu is exquisitely
> sensitive to whitespaces in the multiline output.  I appreciate
> that spacing is essential to lining up the caret and the tildes
> with the text of the program but tests fail even when all the
> expected output is lined up right in the multiline directive but
> off by a space (or tab) with respect to the actual output.  That
> DejaGnu output doesn't make this very clear at all makes debugging
> these types of issues a pain.  I wonder if there's a better to do
> this.

Gah; I'm sorry this was such a pain to do.

I tend to just copy&paste the stuff I want to quote directly from
Emacs's compilation buffer.

However a nit, which I think is why you had problems...


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
new file mode 100644
index 0000000..a601ba4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-Wformat -Wformat-length=1 -fdiagnostics-show-caret -ftrack-macro-expansion=0" } */
+
+extern int sprintf (char*, const char*, ...);
+
+char dst [8];
+
+void test (void)
+{
+  sprintf (dst + 7, "%-s", "1");
+  /* { dg-warning "writing a terminating nul past the end of the destination" "" { target *-*-*-* } 10 }
+     { dg-message "format output 2 bytes into a destination of size 1" "" { target *-*-*-* } 10 }
+    { dg-begin-multiline-output "" }
+   sprintf (dst + 7, "%-s", "1");
+                     ^~~~~
+   sprintf (dst + 7, "%-s", "1");
+   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    { dg-end-multiline-output "" } */
+
+  sprintf (dst + 7, "%-s", "abcd");
+  /* { dg-warning ".%-s. directive writing 4 bytes into a region of size 1" "" { target *-*-*-* } 20 }
+     { dg-message "format output 5 bytes into a destination of size 1" "" { target *-*-*-* } 20 }
+    { dg-begin-multiline-output "" }
+   sprintf (dst + 7, "%-s", "abcd");
+                      ^~~   ~~~~~~
+   sprintf (dst + 7, "%-s", "abcd");
+   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    { dg-end-multiline-output "" } */
+}

You have two pairs of dg-begin/end-multiline-output, but I think it's
better to have four; use a single begin/end pair for each call to
diagnostic_show_locus.  Hopefully doing that ought to make things a bit
less sensitive to whitespace, and easier to debug: each begin/end can
be handled by itself, whereas by combining them it's harder for
multiline.exp to detect them.  If combined, they can only be detected
if all of the dg-warning/dg-messages work (and are thus pruned by
prune.exp), if any aren't pruned, the multiline outputs will also fail.
 Maybe this exacerbated the problem?

So something like this, grouping the 4 diagnostics together with their
diagnostic_show_locus output by opening and closing comments (line
numbers will need adjusting):

+void test (void)
+{
+  sprintf (dst + 7, "%-s", "1");
+  /* { dg-warning
"writing a terminating nul past the end of the destination" "" { target
*-*-*-* } 10 }
+     { dg-begin-multiline-output "" }
+   sprintf (dst +
7, "%-s", "1");
+                     ^~~~~
+     { dg-end-multiline
-output "" } */
+  /* { dg-message "format output 2 bytes into a
destination of size 1" "" { target *-*-*-* } 10 }
+     { dg-begin
-multiline-output "" }
+   sprintf (dst + 7, "%-s", "1");
+  
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+     { dg-end-multiline-output "" } */
+
+
  sprintf (dst + 7, "%-s", "abcd");
+  /* { dg-warning ".%-s. directive
writing 4 bytes into a region of size 1" "" { target *-*-*-* } 20 }
+   
  { dg-begin-multiline-output "" }
+   sprintf (dst + 7, "%-s", "abcd");

+                      ^~~   ~~~~~~
+     { dg-end-multiline-output "" }
*/
+  /* { dg-message "format output 5 bytes into a destination of size
1" "" { target *-*-*-* } 20 }
+     { dg-begin-multiline-output "" }
+  
 sprintf (dst + 7, "%-s", "abcd");
+   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
     { dg-end-multiline-output "" } */
+}

Another nit, if I may: FWIW I'm not in love with the wording of the messages.  Sorry to bikeshed, but how about:
  warning: buffer overflow will occur when writing terminating NUL
and:
  note: formatted output of 2 bytes into a destination of size 1
or somesuch.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]