Bug 65446 - Improve -Wformat-signedness
Summary: Improve -Wformat-signedness
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c (show other bugs)
Version: 5.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks: cppcheck
  Show dependency treegraph
 
Reported: 2015-03-17 08:51 UTC by Tobias Burnus
Modified: 2024-01-16 04:45 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-06-09 00:00:00


Attachments
Simple test case (331 bytes, text/x-csrc)
2016-06-09 13:23 UTC, Ricardo Nabinger Sanchez
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Burnus 2015-03-17 08:51:20 UTC
As PR 65040 shows, the current implementation of -Wformat-signedness is not optimal. (When it is more robust, one could consider to re-enable it with -Wformat=2.)

The idea of the warning is to warn for
   "%ld",   size_t_variable
as one has to use "%lu" to print the negative values. Or reversely using %u for a signed value, where it is even more likely that the issue occurs.

See also "cppcheck --enable=warning" which supports this warning. (But its warning pattern makes more sense than GCC's.)


GCC's current implementation warns too often - and misses some cases. The main bug is that it doesn't take type promotion into account. In particular:

It warns for:
   printf ("%u\n", unsigned_short);
but shouldn't: First, %u and unsigned_short are both unsigned. (It also shouldn't and doesn't warn for %d with unsigned short as all values are representible by %d.

It doesn't warn but should warn for
   printf("%u\n", _short);

As passing, e.g., (short)-1 to %u will print the wrong value (UINT_MAX instead of -1).


GCC currently also warns for printf("%x\n", 1), which is very questionable.
Comment 1 Tobias Burnus 2015-03-17 08:52:59 UTC
(In reply to Tobias Burnus from comment #0)
> It doesn't warn but should warn for
>    printf("%u\n", _short);

Actually, it (correctly) does warn in this case (as short it promoted to int, which is also unsigned).
Comment 2 Eric Blake 2015-05-21 16:11:55 UTC
'unsigned short' promotes to 'int', not 'unsigned int', when passed through var-args.  The warning is technically correct, but annoying, since all values of 'unsigned short' are representable in both 'int' and in 'unsigned int', therefore %d vs. %u won't ever have to deal with a negative value.

"%u" with signed 'short' should indeed warn, because that is a case where promotion to 'int' differs from printing unsigned values.

"%x" with 1 should indeed warn, because that is passing 'int'.  Use "%x" with 1U if you want to avoid the warning.
Comment 3 Eric Blake 2015-05-21 16:58:34 UTC
see also bug 66249 where the implementation-defined signedness of enums comes into play, and where I argue that neither %d nor %u should warn when an enum type is passed through varargs where the range of the enum is identical through either format representation.
Comment 4 Eric Blake 2015-05-21 17:26:36 UTC
Arguably, "%u" with short should warn, while "%hu" with short should not.  On the other hand, if I use "%hu" with int, it is unclear to me whether I should get a warning (the fact that I'm using %h to intentionally truncate the value before it is printed, even though %hu and %hd print different values, makes it hard to discern whether I have a mismatch).  But this does mean that whatever changes are made to -Wformat-signedness, it should take into account the use of %h range-limiting vs. pre-integer-promotion types.
Comment 5 Ricardo Nabinger Sanchez 2016-06-09 13:23:28 UTC
Created attachment 38666 [details]
Simple test case

I understand the reasoning behind the implicit promotions.

Here is my rationale, asking for the warning to be emitted nonetheless: suppose the first test in the test case I supplied is used in code that emits assembly for an assembler.  If the programmer mistakenly used %hu for a int16 type, and that happened to generate a jump address, the implicit promotion would silently hide an important error (possibly jumping to the opposite direction).  The printed-out value is clearly incorrect.  It's not a compiler fault, but it just so happens that the compiler is in the best position to warn the user that s/he might run into problems.  Let the programmer decide whether this should be looked into (fixing his/her code) or if s/he will ignore.

In my case, I noticed this issue (silent signedess-mismatch) because I was getting negative ports (IPv4/IPv6) on someone else's code, which was only a visual nuisance.  It would have been caught with the warning, but there was no way I could get one.  Clang failed to generate such a warning as well (and in fact produces the same output).

The same happens with %hhd/%hhu specifiers when given mismatching uint8_t/int8_t.

In short: please warn.  Some cases will generate unexpected results (which are not compilation errors), simply because the programmer used an incorrect format specifier.
Comment 6 Manuel López-Ibáñez 2016-06-09 17:24:30 UTC
(In reply to Ricardo Nabinger Sanchez from comment #5)
> In short: please warn.  Some cases will generate unexpected results (which
> are not compilation errors), simply because the programmer used an incorrect
> format specifier.

I think there is no doubt that we want to warn in cases that lead to different output (%u + short) or UB and avoid warning in cases that do not have any effect (%u + unsigned short).

The question is how and who will implement these heuristics: https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps

You are welcome to try. All the relevant code is in https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c-family/c-format.c;h=c19c411d61a44773c55c8686802787ec7e4c40a4;hb=HEAD