[PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

Joseph Myers joseph@codesourcery.com
Wed Oct 5 23:55:00 GMT 2016

On Wed, 5 Oct 2016, Martin Sebor wrote:

> On 10/05/2016 05:11 PM, Joseph Myers wrote:
> > On Wed, 5 Oct 2016, Martin Sebor wrote:
> > 
> > > may have been subjected to.  Issuing a warning for a safe piece
> > > of code only on the basis that there might be some other piece
> > > of code in the program that does something wrong makes no sense.
> > > Suggesting to the user to change the safe piece of code is
> > > misleading and counterproductive.
> > 
> > It's warning because it's *bad code*.
> Why?  What can go wrong and on what system?  Please name at least
> one specific example.

It's bad because getting types wrong (more generally than just in this 
case) is a careless coding practice.  It's a code smell.  By requiring 
much more careful reasoning about whether it is OK in a particular 
context, given knowledge of ABIs and compiler optimizations, such code 
smells are liable to distract attention from other problems in the code, 
as well as being symptomatic of possible other problems in the code.

It should be possible to accommodate -Wall within a wide range of code 
styles, but you may will still need to adapt your code to it in some ways 
if you are doing things that are commonly dubious, even if you have 
reasons why they are safe in the cases that appear in your code and in the 
contexts in which that code will be used.

> > I mean high parts if the ABI is passing those arguments in 64-bit
> > registers / stack slots and has requirements on how 32-bit arguments are
> > extended for passing as 64-bit.
> I don't understand what you mean.  We're talking about passing
> an integer (wchar_t) via the ellipsis.  That's just as well
> defined as passing any other integer of that size, as is (for
> all practical purposes) extracting it via an integer of the
> opposite signedness.  printf then stuffs the extracted wint_t
> value (which is of the same size as the original wchar_t) into
> a wchar_t, appends a nul, and formats the result via %ls (at
> least that's how it's specified).

Consider the powerpc64 ABI used for GNU/Linux (both BE and LE ABIs have 
this property).  A 32-bit argument is passed in a register (or in memory 
after a certain point), sign-extended to 64-bit if of a signed type and 
zero-extended to 64-bit if of an unsigned type.  wchar_t is int and wint_t 
is unsigned int in this case.  So if you pass a negative wchar_t argument, 
and the callee uses va_arg with type wint_t to access it, the compiler 
compiling the callee is entitled to optimize it on the basis that the 
value is already zero-extended to 64 bits, resulting in undefined behavior 
because the caller in fact sign-extended it.  Even if GCC doesn't optimize 
on that basis today, it could in future - and with a printf call, it's 
very likely that code compiled now will be used with a future libc.so 
shared library compiled with a newer compiler version, so printf may be 
built with a newer compiler than the code calling it.

The problem is that extracting via an integer of the opposite signedness 
is *not* defined unless the argument has a value representable in both 
types - both as a matter of ISO C rules on variadic functions, and as a 
practical matter of what various ABIs say about how sub-word arguments are 
extended when passed as function arguments.

(Negative wchar_t isn't actually a valid wide character, but without 
knowing where the wchar_t came from it's entirely possible it could have a 
value of type wchar_t that doesn't represent a valid character.  If you 
properly pass a wint_t value, then you avoid undefined behavior for 
invalid characters, since they are defined to produce an EILSEQ error.)

But my basic point is: if something can lead to analysis in this level of 
detail of whether the code is or is not safe in a particular context, 
while there is a trivial fix to the code that would short-circuit all that 
analysis, that by itself is enough evidence that the code deserves a 
warning and should be cleaned up to make it more obviously safe.

Joseph S. Myers

More information about the Gcc-patches mailing list