Bug 77696 - Confusing wording for -Wformat-overflow
Summary: Confusing wording for -Wformat-overflow
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 7.0
: P3 enhancement
Target Milestone: ---
Assignee: David Malcolm
URL: https://gcc.gnu.org/ml/gcc-patches/20...
Keywords: diagnostic, patch
Depends on:
Blocks: Wformat-overflow
  Show dependency treegraph
 
Reported: 2016-09-22 20:01 UTC by David Malcolm
Modified: 2022-05-27 08:04 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-09-23 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David Malcolm 2016-09-22 20:01:22 UTC
Given this example from PR 77672...

****************************************************************************
cat v.c && ./xgcc -B. -c v.c -Wall
char d[3];

extern int sprintf (char*, const char*, ...);

void f (void) {
  sprintf (d, "%-s!", "abc");
}

void g (void) {
  sprintf (d, "%-s", "abc");
}
v.c: In function ‘f’:
v.c:6:19: warning: writing format character ‘!’ at offset 3 past the end of the destination [-Wformat-length=]
   sprintf (d, "%-s!", "abc");
                   ^
v.c:6:3: note: format output 5 bytes into a destination of size 3
   sprintf (d, "%-s!", "abc");
   ^~~~~~~~~~~~~~~~~~~~~~~~~~
v.c: In function ‘g’:
v.c:10:15: warning: writing a terminating nul past the end of the destination [-Wformat-length=]
   sprintf (d, "%-s", "abc");
               ^~~~~
v.c:10:3: note: format output 4 bytes into a destination of size 3
   sprintf (d, "%-s", "abc");
   ^~~~~~~~~~~~~~~~~~~~~~~~~
****************************************************************************

...I find the wording of these messages to be confusing.

Consider:
  v.c: In function ‘f’:
  v.c:6:19: warning: writing format character ‘!’ at offset 3 past the end of the destination [-Wformat-length=]
     sprintf (d, "%-s!", "abc");
                     ^
  v.c:6:3: note: format output 5 bytes into a destination of size 3
     sprintf (d, "%-s!", "abc");
     ^~~~~~~~~~~~~~~~~~~~~~~~~~

My first interpretation of the first message:
  "writing format character ‘!’ at offset 3 past the end of the destination"
was that "!" is written 3 bytes after the end of the destination buffer i.e. at d[6], whereas after several re-readings I think it means that "!" is written at offset 3, and that offset 3 is after the end of the destination buffer.

Hence I think this should at least be reworded from:
  "writing format character ‘!’ at offset 3 past the end of the destination"
to:
  "writing format character ‘!’ at offset 3, which is after the end of the destination"

Also, I found myself looking at the sprintf manpage to figure out what the semantics of "format character '!'" were.  AIUI, in this context '!' is merely an ordinary character to be copied unchanged.

If so, a better working would be:
  "writing character ‘!’ at offset 3, which is after the end of the destination"

But better would be to emphasize that this is a buffer overflow, changing the message to:
  "buffer overflow will occur when writing character ‘!’ (written to offset 3)"

A nice tweak would be to display the decl name in array form if the destination supports it:
  "buffer overflow will occur when writing character ‘!’ to ‘p[3]’"

(would it need to be "will occur" vs "can occur"?)

The grammar in the supporting note:
  v.c:6:3: note: format output 5 bytes into a destination of size 3
     sprintf (d, "%-s!", "abc");
     ^~~~~~~~~~~~~~~~~~~~~~~~~~
seems to me to be poor - it's hard to tell whether "format" and "output" are meant as nouns or verbs.

It could be changed from:
  format output 5 bytes into a destination of size 3
to:
  5 bytes will be written into a destination of size 3
which would be much clearer.

In terms of locations, the diagnostics should ideally show the location of the destination decl as the location of the "note"; also, as a tweak, the first location could show an underline from the initial part of the string up to the caret showing the point where buffer overflow occurs:

  v.c: In function ‘f’:
  v.c:6:19: warning: buffer overflow will occur when writing format character ‘!’ to ‘p[3]’ [-Wformat-length=]
     sprintf (d, "%-s!", "abc");
                  ~~~^
  v.c:1:5: note: 5 bytes will be written into a destination of size 3
   char d[3];
        ^ ~

Ideally it would also underline the size of the buffer as a secondary range in the note (as shown above), but sadly I don't think that location information is available.

Similarly, for the 2nd pair of diagnostics, 

  v.c: In function ‘g’:
  v.c:10:15: warning: buffer overflow will occur when writing terminating nul to ‘p[3]’ [-Wformat-length=]
     sprintf (d, "%-s", "abc");
                  ~~~^
  v.c:1:5: note: 4 bytes will be written into a destination of size 3
   char d[3];
        ^ ~

(using the trailing quote for the location of the NUL char; as per PR 77672)

Hope this is constructive
Comment 1 Martin Sebor 2016-09-23 00:49:08 UTC
Thanks for the feedback.  I took the liberty to change the classification of this bug to enhancement.  If you feel like it should be a defect instead please change it back.

The message

  "writing format character ‘!’ at offset 3 past the end of the destination"

says that the '!' is three bytes from the beginning of the format string, and the function (whose name should probably be mentioned) is writing it somewhere past the end of the destination sequence (often but not necessarily always exactly just past the end).  Mentioning the offset is important in case there are multiple exclamation points in the format string.  Note that the wording is meant to be similar or analogous to:

  "writing a terminating nul past the end of the destination"

so it were to change so I think should the other.


When considering changes here I think it would be useful to take a look at all the warnings issued by the pass and the convention they follow.  The pass checks two sets of functions and issues two broad classes of warnings:

1) bounded functions (like snprintf) along with truncation warnings, and
2) unbounded functions (line sprintf) along with "writing past the end" warnings.

For individual directives, the pass also distinguishes two general situations, and uses two kinds of wordings in the warnings to help users tell them apart:

1) a definite problem denoted by the words "truncated writing X bytes" or "writing X bytes into a region" of a given size, and
2) possible problem indicated by the phrase "may be truncated writing between X and Y bytes into a region" or "writing between X and Y bytes into a region" of a given size.

Finally, similarly to individual directives but for format string characters that aren't part of a directive, the pass again distinguishes two general situations, and emits two kinds of wordings in the warnings to tell one from the other

1) a definite premature truncation or buffer overflow: "output truncated before the last character" and "writing a character/nul past the end," and
2) a possible premature truncation or buffer overflow: "output may be truncated before the last character" and "may write a character/nul past the end."

With this background, it might also be helpful to look at some examples.  I quickly put together the test program below.  It doesn't cover all the possible permutations (and the output may change based on the warning level and based on optimization), but it should be a starting point for a more comprehensive survey.  With a more complete picture we should be able to make a more informed decision about the new wording of all the kinds of diagnostic (it could also help find bugs or inconsistencies in the implementation).

$ cat zzz.c && /build/gcc-trunk-svn/gcc/xgcc -B /build/gcc-trunk-svn/gcc -O2 -S -Wall -Wformat -Wextra -Wpedantic -Wformat-length=2 zzz.c
char d[4];

typedef __SIZE_TYPE__ size_t;

extern int sprintf (char*, const char*, ...);
extern int snprintf (char*, size_t, const char*, ...);


void f (int i)
{
  // bounded, definite truncation in a directve
  snprintf (d, sizeof d, "%i", 1235);

  // bounded, definite truncation copying format string
  snprintf (d, sizeof d, "%iAB", 123);

  // unbounded, definite overflow in a directve
  sprintf (d, "%i", 1235);

  // unbounded, definite overflow copying format string
  sprintf (d, "%iAB", 123);

  // bounded, possible truncation a directve
  snprintf (d, sizeof d, "%i", i);

  // bounded, possible overflow copying format string
  snprintf (d, sizeof d, "%iAB", i);

  // unbounded, possible overflow in a directve
  sprintf (d, "%i", i);

  // unbounded, possible overflow copying format string
  sprintf (d, "%iAB", 123);

  // unbounded, possible overflow copying format string
  const char *s = i ? "123" : "1234";
  sprintf (d, "%sAB", s);
}

zzz.c: In function ‘f’:
zzz.c:12:26: warning: output truncated before the last format character [-Wformat-length=]
   snprintf (d, sizeof d, "%i", 1235);
                          ^~~~
zzz.c:12:3: note: format output 5 bytes into a destination of size 4
   snprintf (d, sizeof d, "%i", 1235);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:15:30: warning: output truncated at format character ‘B’ at offset 3 [-Wformat-length=]
   snprintf (d, sizeof d, "%iAB", 123);
                              ^
zzz.c:15:3: note: format output 6 bytes into a destination of size 4
   snprintf (d, sizeof d, "%iAB", 123);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:18:15: warning: writing a terminating nul past the end of the destination [-Wformat-length=]
   sprintf (d, "%i", 1235);
               ^~~~
zzz.c:18:3: note: format output 5 bytes into a destination of size 4
   sprintf (d, "%i", 1235);
   ^~~~~~~~~~~~~~~~~~~~~~~
zzz.c:21:19: warning: writing format character ‘B’ at offset 3 past the end of the destination [-Wformat-length=]
   sprintf (d, "%iAB", 123);
                   ^
zzz.c:21:3: note: format output 6 bytes into a destination of size 4
   sprintf (d, "%iAB", 123);
   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:24:27: warning: ‘%i’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-length=]
   snprintf (d, sizeof d, "%i", i);
                           ^~
zzz.c:24:26: note: using the range [‘1’, ‘-2147483648’] for directive argument
   snprintf (d, sizeof d, "%i", i);
                          ^~~~
zzz.c:24:3: note: format output between 2 and 12 bytes into a destination of size 4
   snprintf (d, sizeof d, "%i", i);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:27:27: warning: ‘%i’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-length=]
   snprintf (d, sizeof d, "%iAB", i);
                           ^~
zzz.c:27:26: note: using the range [‘1’, ‘-2147483648’] for directive argument
   snprintf (d, sizeof d, "%iAB", i);
                          ^~~~~~
zzz.c:27:3: note: format output between 4 and 14 bytes into a destination of size 4
   snprintf (d, sizeof d, "%iAB", i);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:30:16: warning: ‘%i’ directive writing between 1 and 11 bytes into a region of size 4 [-Wformat-length=]
   sprintf (d, "%i", i);
                ^~
zzz.c:30:15: note: using the range [‘1’, ‘-2147483648’] for directive argument
   sprintf (d, "%i", i);
               ^~~~
zzz.c:30:3: note: format output between 2 and 12 bytes into a destination of size 4
   sprintf (d, "%i", i);
   ^~~~~~~~~~~~~~~~~~~~
zzz.c:33:19: warning: writing format character ‘B’ at offset 3 past the end of the destination [-Wformat-length=]
   sprintf (d, "%iAB", 123);
                   ^
zzz.c:33:3: note: format output 6 bytes into a destination of size 4
   sprintf (d, "%iAB", 123);
   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:37:19: warning: may write format character ‘B’ at offset 3 past the end of the destination [-Wformat-length=]
   sprintf (d, "%sAB", s);
                   ^
zzz.c:37:3: note: format output between 6 and 7 bytes into a destination of size 4
   sprintf (d, "%sAB", s);
   ^~~~~~~~~~~~~~~~~~~~~~
Comment 2 Martin Sebor 2017-04-10 21:58:02 UTC
The warning has been renamed to -Wformat-overflow (and -Wformat-truncation). 
Adjusting Summary.
Comment 3 Martin Sebor 2018-04-19 17:10:49 UTC
David, any thoughts/suggestions for how to improve things (taking into consideration comment #1)?
Comment 4 David Malcolm 2018-09-13 13:07:28 UTC
Martin: I had a go at revamping this warning on the trip back from Cauldron, and I have a mostly working prototype of something I like a lot.  I'm tidying it up, and hope to post it to the mailing list sometime in the next day or so (once the FIXME count in my code is at a more reasonable level!).  I'll CC you when I do.  

I'm assigning this one to me, hoping to get it into gcc 9.

In the meantime, I'm going to post some of the other UI ideas for this that we've being chatting about, so that they're captured publicly.
Comment 5 David Malcolm 2018-09-13 13:10:24 UTC
(In reply to David Malcolm from comment #4)
> In the meantime, I'm going to post some of the other UI ideas for this that
> we've being chatting about, so that they're captured publicly.

Consider this problematic call to sprintf:

$ cat demo.c
#include <stdio.h>

const char *test_1 (const char *msg)
{
  static char buf[16];
  sprintf (buf, "msg: %s\n", msg);
  return buf; 
}

void test_2 ()
{
  test_1 ("this is long enough to cause trouble");
}

Right now, we emit this (this is trunk, plus some fixes for line-
numbering bugs):

$ ./xgcc -B. -c demo.c  -Wall -O2
demo.c: In function ‘test_2’:
demo.c:6:23: warning: ‘%s’ directive writing 36 bytes into a region of size 11 [-Wformat-overflow=]
     6 |   sprintf (buf, "msg: %s\n", msg);
       |                       ^~
demo.c:12:11:
    12 |   test_1 ("this is long enough to cause trouble");
       |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
demo.c:6:3: note: ‘sprintf’ output 43 bytes into a destination of size 16
     6 |   sprintf (buf, "msg: %s\n", msg);
       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I brainstormed some ideas on making these kinds of warning easier for
the user to understand.
Comment 6 David Malcolm 2018-09-13 13:11:09 UTC
(In reply to David Malcolm from comment #5)
> I brainstormed some ideas on making these kinds of warning easier for
> the user to understand.

We could use the new labeling-of-source-ranges idea from:
  https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00901.html
to print something like:

demo.c: In function ‘test_2’:
demo.c:6:23: warning: ‘%s’ directive writing 36 bytes into a region of size 11 [-Wformat-overflow=]
     6 |   sprintf (buf, "msg: %s\n", msg);
       |            ~~~        ^~
       |            |          |
       |            |          required space: 36 bytes
       |            remaining capacity: 11 bytes
demo.c:12:11:
    12 |   test_1 ("this is long enough to cause trouble");
       |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       |           |
       |           required space: 36 bytes
demo.c:6:3: note: ‘sprintf’ output 43 bytes into a destination of size 16
     6 |   sprintf (buf, "msg: %s\n", msg);
       |            ~~~   ^~~~~~~~~
       |            |          |
       |            |          required space: 43 bytes
       |            size: 16 bytes

(making a distinction between "size" and "remaining capacity",
depending on whether the code is writing to the start of the buffer or
not)

Underlining "buf" requires access to its source location, which might
not be available yet in the C frontend (if so, I can look at fixing
that).
Comment 7 David Malcolm 2018-09-13 13:12:03 UTC
(In reply to David Malcolm from comment #6)

A tweak to this would be to show the point where the overflow occurs
(if the substring location code is up to it...):

demo.c: In function ‘test_2’:
demo.c:6:23: warning: ‘%s’ directive writing 36 bytes into a region of size 11 [-Wformat-overflow=]
     6 |   sprintf (buf, "msg: %s\n", msg);
       |            ~~~        ^~
       |            |          |
       |            |          required space: 36 bytes
       |            remaining capacity: 11 bytes
demo.c:12:11:
    12 |   test_1 ("this is long enough to cause trouble");
       |                       ^~~~~~~~~~~~~~~~~~~~~~~~~
       |                       |
       |                       overflow occurs here
demo.c:6:3: note: ‘sprintf’ output 43 bytes into a destination of size 16
     6 |   sprintf (buf, "msg: %s\n", msg);
       |            ~~~   ^~~~~~~~~
       |            |          |
       |            |          required space: 43 bytes
       |            size: 16 bytes
Comment 8 David Malcolm 2018-09-13 13:12:49 UTC
(In reply to David Malcolm from comment #5)
> I brainstormed some ideas on making these kinds of warning easier for
> the user to understand.

Getting really fancy, we could emit an ASCII art visualization to
(hopefully) make the buffer overflow crystal-clear (with an option to
disable it):

demo.c:6:3: note: buffer overflow [-fdiagnostics-show-buffer-overflow]
  snprintf of "%s" from:
                        |+---+---+ ... +---+---+|+---+---+ ... +---+---+|
                        ||  0|  1|     |  9| 10||| 11| 12|     | 41| 42||
                        ||'t'|'h'|     |'o'|'n'|||'g'|' '|     |'e'|NUL||
                        |+---+---+ ... +---+---+|+---+---+ ... +---+---+|
                        vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
                        |<---   ok          --->|<--     overflow    -->|
                        |                       |                       |
  to 'buf':
  |                     |                       |
  ++---+---+---+---+---+|+---+---+ ... +---+---+|
  ||  0|  1|  2|  3|   4||5  |  6|     | 14| 15||
  ||'m'|'s'|'g'|':'|' '|||'t'|'h'|     |'o'|'n'||
  ++---+---+---+---+---+|+---+---+ ... +---+---+|
                        |                       |

(thus showing the buffer content where it's known, eliding the middle
when it goes above 5 elements)

The parts on the right-hand side ("overflow" etc) could be colorized in
red.
Comment 9 David Malcolm 2018-09-13 13:13:53 UTC
(In reply to David Malcolm from comment #8)
> (In reply to David Malcolm from comment #5)
> > I brainstormed some ideas on making these kinds of warning easier for
> > the user to understand.
> 
> Getting really fancy, we could emit an ASCII art visualization to

Here's another version of the same UI idea, for the same diagnostic,
which tries to show the string data (in the name of readability; I
think this one is better than the above):

demo.c:6:3: note: buffer overflow...
  snprintf of "%s" from:
            |+-------------+|+---------------------------++---+|
            ||    0 - 10   |||          11  - 41         || 42||
            ||"this is lon"|||"g enough to cause trouble"||NUL||
            |+-------------+|+---------------------------++---+|
            vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
            |<---   ok  --->|<--           overflow         -->|
            |               |                                  |
  to 'buf':
  |         |               |
  ++-------+|+-------------+|
  || 0 - 4 |||   5 - 15    ||
  ||"msg: "|||"this is lon"||
  ++-------+|+-------------+|
Comment 10 David Malcolm 2018-09-13 13:14:55 UTC
(In reply to David Malcolm from comment #5)
> I brainstormed some ideas on making these kinds of warning easier for
> the user to understand.

Here's another possible visualization, of a different overflow:

  snprintf of "%s" from:

                    |+---+ ... +---+|+---+ ... +---|
                    || 0 |     | n |||n+1|     | 31|
                    |+---+ ... +---+|+---+ ... +---|
                    vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
                    |<---   ok  --->|<--overflow-->|
                    |               |              |
  to 'buf':
    |               |               |
    ++---+ ... +---+|+---+ ... +---+|
    || 0 |     | x |||x+1|     | 15||
    ++---+ ... +---+|+---+ ... +---+|

In this one, 32 bytes are being written into an unknown point ("x+1")
within a buffer of size 16.
Comment 11 David Malcolm 2018-09-13 13:15:53 UTC
(In reply to David Malcolm from comment #5)
> I brainstormed some ideas on making these kinds of warning easier for
> the user to understand.

A simple example where the overflowing write is to the start of the
buffer:

sprintf of an unbounded string to a fixed-size buf[100]:

demo.c:6:3: note: buffer overflow...
  snprintf of "%s" from:
  |+------+|+--------------++-------+|
  ||0...99|||100...strlen-1|| strlen||
  ||      |||              ||  NUL  ||
  |+------+|+--------------++-------+|
  vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  |<--ok-->|<--   overflow        -->|
  |        |                         |
  to 'buf':
  |        |
  |+------+|
  ||0...99||
  |+------+|
Comment 12 David Malcolm 2018-09-13 13:16:49 UTC
(In reply to David Malcolm from comment #5)
> I brainstormed some ideas on making these kinds of warning easier for
> the user to understand.

Copying a string to a buffer allocated with strlen(), rather than
strlen() + 1:

  |                  |          |
  |+----------------+|+--------+|
  ||0...strlen() - 1||   NUL   ||
  |+----------------+|+--------+|
  vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  |<---   ok      -->|<overflow>|
  |                  |          |
  to 'buf':
  |                  |
  |+----------------+|
  ||0...strlen() - 1||
  |+----------------+|

(drawing the user's attention to the NUL terminator, because it's such
a common source of mistakes)

FWIW, I have some classes for laying out ASCII art drawings which could be
reused for this, as part of this vectorization RFC:
  https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01576.html
Comment 13 David Malcolm 2018-09-13 13:20:31 UTC
Some more examples to consider (thanks Martin):

   struct MyStrings {
     char a[8], b[20];
   };

   const struct MyStrings ms[] = {
      { "foo", "bar" }, { "abcd", "klmno" }, ...
   };

Consider:

   sprintf (smallbuf, "msg: %s\n", ms[1].b);

vs this case where we don't know which one we're using:

   sprintf (smallbuf, "msg: %s\n", ms[idx].b);
Comment 14 David Malcolm 2018-09-13 13:22:24 UTC
Another example from Martin:

   extern char buf[80];
   extern char tmpdir[80];
   extern char fname[8];

   void f (int num)
   {
     sprintf (buf, "/%s/%s-%i.tmp", tmpdir, fname, num);
   }

In the most basic case (nothing beyond what's in the above is
known) GCC currently prints the rather mystifying:

   warning: ‘/’ directive writing 1 byte into a region of size between 0 
and 79 [-Wformat-overflow=]
   7 |   __builtin_sprintf (buf, "/%s/%s-%i.tmp", tmpdir, fname, num);
     |                               ^

We need to make this more "actionable" for the end-user.
Comment 15 David Malcolm 2018-09-13 13:23:27 UTC
(In reply to David Malcolm from comment #14)

Some random UI ideas (brainstorming in Emacs, so no idea if these are
*good* ideas):

   warning: buffer overflow: writing 9-110 bytes into a buffer with capacity 80 [-Wformat-overflow=]
   7 |   __builtin_sprintf (buf, "/%s/%s-%i.tmp", tmpdir, fname, num);
     |                      ~~~   ~~~^~~~~~~~~~
     |                      |        |
     |                      |        writing 9...110 bytes
     |                      capacity: 80 bytes
   note: details
   7 |   __builtin_sprintf (buf, "/%s/%s-%i.tmp", tmpdir, fname, num);
     |                      ~~~   ab~cd~ef~g~~~h
     |                      |     || || || |   |
     |                      |     || || || |   1 byte (NUL terminator)
     |                      |     || || || 4 bytes (".tmp")
     |                      |     || || |1...16 bytes ("%i" on 'num')
     |                      |     || || 1 byte ("-")
     |                      |     || |0-8 bytes ("%s" on 'fname')
     |                      |     || 1 byte ("/")
     |                      |     |0-79 bytes ("%s" on 'tmpdir')
     |                      |     1 byte ("/")
     |                      capacity: 80 bytes
   note: layout [-fsome-option-enabling-this]

(with alternating colorization to better distinguish all those ranges and labels)

       |+---+--------------+----+-------------+-----+-----------+------+------+
start @||0  |1             |1-80|2-81         |10-89|11-90      |12-106|16-110|
  size:||  1|     0 - 79   |   1|            8|    1|    1-16???|     4|     1|
       ||"/"|%s on 'tmpdir'| "/"|%s on 'fname'| "-" |%i on 'num'|".tmp"|  NUL |
       |+---+--------------+----+-------------+-----+-----------+------+------+

or with a vertical orientation:

   note: layout [-fsome-option-enabling-this]
   +----------------+-----------+--------+---------------+
   |element         |starting at|    size|cumulative size|
   +----------------+-----------+--------+---------------+
   |"/"             |         0 |      1 |             1 |
   |"%s" on 'tmpdir'|         1 | 0 - 79 |        1 - 80 |
   |"/"             |    1 - 80 |      1 |        2 - 81 |
   |"%s" on 'fname' |    2 - 81 |  0 - 7 |        2 - 88 |
   |"-"             |    2 - 88 |      1 |        3 - 89 |
   |"%i" on 'num'   |    3 - 89 |  1- 16 |       4 - 105 |
   |".tmp"          |   4 - 105 |      4 |       8 - 109 |
   |NUL terminator  |   8 - 109 |      1 |       9 - 110 |
   +----------------+-----------+--------+---------------+

(I've probably got some of the numbers wrong above, but hopefully you
get the idea of where I'm going with this).

Maybe some kind of highlight to show where we can exceed the buffer
capacity.

I like calling out the NUL terminator explicitly (as it's so easy to
get wrong), and putting "buffer overflow" upfront in the text of the
warning, as I did above.
Comment 16 Manuel López-Ibáñez 2018-09-13 18:12:32 UTC
To be honest, I find the fancy UI too overwhelming and a bit redundant in simple cases.

It is true that the diagnostics could be improved by some tweaks. I took some of the ideas here and I came up with the following flexible template. 

1. First the main warning has only four variants, depending on whether we are writing a fixed number or a range and whether we are sure or not:

warning: formatting %d bytes will/may overflow buffer '%BUFF' of size %BUFFSIZE

warning: formatting between %d and %d bytes will/may overflow buffer '%BUFF' of size %BUFFSIZE

Example:

demo.c:6:3: warning: formatting 43 bytes will overflow 'buff' of size 16 [-Wformat-overflow=]
     6 |   sprintf (buf, "msg: %s\n", msg);
       |            ~~~   ^~~~~~~~~

2. The next note explains the details (for all cases in comment #1):

// bounded, definite truncation copying format string
zzz.c:15:30: note: the format string will be truncated at character 'B' after %d bytes
   snprintf (d, sizeof d, "%iAB", 123);
                           ~~~^
// unbounded, definite overflow in a directve
zzz.c:18:15: note: the terminating nul will be written after the end of 'd'
   sprintf (d, "%i", 1235);
            ^
// unbounded, definite overflow copying format string
zzz.c:21:19: note: character ‘B’ of the format string will be written after the end of 'd'
   sprintf (d, "%iAB", 123);
                ~~~^
(I don't think we need to specify the offset, the column number and the caret line should already point to the correct 'B', otherwise, it is more important to get that fixed)

// bounded, possible truncation a directive
zzz.c:24:27: note: argument 4 corresponding to ‘%i’ has range [‘1’, ‘-2147483648’] and may be truncated after 4 bytes
   snprintf (d, sizeof d, "%i", i);
                           ~^   ~
                           
// bounded, possible truncation copying format string
zzz.c:27:27: note: argument 4 corresponding to '%i' has range [‘1’, ‘-2147483648’] and may be truncated after 4 bytes 
   snprintf (d, sizeof d, "%iAB", i);
                           ~^     ~
(these may look different, but they are exactly the same case)

// unbounded, possible overflow in a directive
zzz.c:30:16: note: argument 4 corresponding to '%i' has range [‘1’, ‘-2147483648’] and may write between 1 and 11 bytes after the end of 'd'
   sprintf (d, "%i", i);
                ~^   ~
// unbounded, possible overflow copying format string
zzz.c:37:19: note: character 'B' of the format string may be written after the end of 'd'
   sprintf (d, "%sAB", s);
                ~~~^
3. And additional note may explain where the argument comes from for those cases in (2) where we mention an argument:

demo.c:12:11: the value of 'msg' comes from here
    12 |   test_1 ("this is long enough to cause trouble")


I think it is not very useful to complicate it further because we would need to consider cases where:

1. There are multiple directives of variable size.
2. The format string may be very long (too long to print).
3. The expanded arguments may be very long (too long to print).

I agree that in those cases, something like this:


   note: layout [-fsome-option-enabling-this]
   +----------------+-----------+--------+---------------+
   |format string   |starting at|    size|cumulative size|
   +----------------+-----------+--------+---------------+
   |"/"             |         0 |      1 |             1 |
   |"%s" on 'tmpdir'|         1 | 0 - 79 |        1 - 80 |
   |"/"             |    1 - 80 |      1 |        2 - 81 |
   |"%s" on 'fname' |    2 - 81 |  0 - 7 |        2 - 88 |
   |"-"             |    2 - 88 |      1 |        3 - 89 |
   |"%i" on 'num'   |    3 - 89 |  1- 16 |       4 - 105 |
   |".tmp"          |   4 - 105 |      4 |       8 - 109 |
   |NUL terminator  |   8 - 109 |      1 |       9 - 110 |
   +----------------+-----------+--------+---------------+

could be useful (but I would argue that the column 'starting at' is redundant and that there should be a column 'argument' that gives the numerical number of the argument for directives, because the argument may be something much much longer than 'num' or even something that we cannot pretty-print).

In any case, I would not over-complicate the default diagnostics with many details. Just have the option to generate the table under some -f option.
Comment 17 David Malcolm 2018-09-14 17:43:32 UTC
Prototype of a new approach posted here:
  "[PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696)"
    https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00771.html
Comment 18 Jakub Jelinek 2019-05-03 09:17:03 UTC
GCC 9.1 has been released.
Comment 19 Jakub Jelinek 2019-08-12 08:56:11 UTC
GCC 9.2 has been released.
Comment 20 Jakub Jelinek 2020-03-12 11:58:50 UTC
GCC 9.3.0 has been released, adjusting target milestone.
Comment 21 Richard Biener 2021-06-01 08:08:20 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.