[PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

Martin Sebor msebor@gmail.com
Wed Aug 24 16:41:00 GMT 2016


On 08/23/2016 05:00 PM, Joseph Myers wrote:
> Some observations:
>
> * Does -fprintf-return-value allow for the possibility of snprintf failing
> because of a memory allocation failure and so returning -1 when GCC
> computed bounds on what it could return if successful?

No.  I recall having seen Glibc fail with ENOMEM years ago when
formatting a floating point number to a very large precision but
I haven't seen any implementation fail.  I haven't yet looked to
see if the Glibc failure can still happen.  My reading of C and
POSIX is that snprintf is only allowed to fail due to an encoding
error, not because it runs out of memory, so such a failure would
seem like a bug.

At the same time, the cause of the failure doesn't really matter.
If the function could fail, unless GCC could determine the failure
at compile time and avoid the optimization, the transformation
wouldn't be guaranteed to be safe.  It might be something to
consider and possibly accommodate in the implementation.  It's
one of the reasons why I want to expose the optimization to
more code before enabling it.

I also haven't yet thought about how to deal with it but if it
is a possibility we want to allow for then maybe a target hook
for libc implementers to set to indicate whether sprintf can fail
and when would work.  Libc implementations that can fail under
any conditions (whether allowed by the standard or not) would
need to disable (or not enable, depending on the default) the
optimization.  I'm certainly open to other ideas.

> * It looks like you convert to (signed/unsigned) char for %hh formats,
> etc.  Now, there is the possibility that the value passed was actually of
> type int, and out of range for those types.  And there is the possibility
> that the implementation might not itself convert those values to char /
> short (glibc didn't until 2006) - passing a value outside the range of the
> relevant type seems likely undefined behavior, so implementations may not
> actually need to convert, and there's an open question about whether the
> value actually needs to have been promoted from char/short in the caller
> (see my <https://www.polyomino.org.uk/computer/c/pre-dr-6a.txt>).  I don't
> know if you wish to allow at all for this issue.

It sounds like the concern is that for the following call (when
UCHAR_MAX is 255):

   sprintf (d, "%hhu", 1000)

some implementation (an old version of Glibc?) may have actually
produced four digits and returned 4 on the basis of C saying that
the %hhu argument must be an unsigned char (promoted to int) and
thus the behavior of the call being undefined.

I wouldn't think this to be the intended interpretation (C11 says
the argument "value shall be converted to signed char or unsigned
char before printing") but if it really was meant to be undefined
then having GCC treat the undefined call differently than libc
shouldn't be a problem.  That said, I've tried not to allow the
optimization for calls with undefined behavior so if there was
a serious concern about this causing trouble I could see about
detecting this case and disabling it.

Martin



More information about the Gcc-patches mailing list