[PATCH v2] middle-end/104854: Limit strncmp overread warnings

Jeff Law jeffreyalaw@gmail.com
Thu Mar 17 16:46:33 GMT 2022


On Thu, Mar 17, 2022 at 9:32 AM Marek Polacek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Wed, Mar 16, 2022 at 06:54:51AM +0530, Siddhesh Poyarekar wrote:
> > On 16/03/2022 02:06, Martin Sebor wrote:
> > > The intended use of the strncmp bound is to limit the comparison to
> > > at most the size of the arrays or (in a subset of cases) the length
> > > of an initial substring. Providing an arbitrary bound that's not
> > > related to the sizes as you describe sounds very much like a misuse.
> >
> > Nothing in the standard says that the bound is related to the sizes of
> input
> > buffers.  I don't think deducing that intent makes sense either, nor
> > concluding that any other use case is misuse.
> >
> > > As a historical note, strncmp was first introduced in UNIX v7 where
> > > its purpose, alongside strncpy, was to manipulate (potentially)
> > > unterminated character arrays like file names stored in fixed size
> > > arrays (typically 14 bytes).  Strncpy would fill the buffers with
> > > ASCII data up to their size and pad the rest with nuls only if there
> > > was room.
> > >
> > > Strncmp was then used to compare these potentially unterminated
> > > character arrays (e.g., archive headers in ld and ranlib).  The bound
> > > was the size of the fixed size array.  Its other use case was to
> compare
> > > leading portions of strings (e.g, when looking for an environment
> > > variable or when stripping "./" from path names).
> >
> > Thanks for sharing the historical perspective.
> >
> > > Since the early UNIX days, both strncpy and to a lesser extent strncmp
> > > have been widely misused and, along with many other functions in
> > > <string.h>, a frequent source of bugs due to common misunderstanding
> > > of their intended purpose.  The aim of these warnings is to detect
> > > the common (and sometimes less common) misuses and bugs.
> >
> > They're all valid uses however since they do not violate the standard.
> If we
> > find at compile time that the strings don't terminate at the bounds,
> > emitting the warning is OK but the more pessimistic check seems like
> > overkill.
>
> I think I agree, I've tried
>
> #include <string.h>
> char a[] = "abc";
> char b[] = "abcd";
>
> int f (void)
> {
>   return strncmp (a, b, 8);
> }
>
> where I get
>
> t.c:7:10: warning: ‘strncmp’ specified bound 8 exceeds source size 5
> [-Wstringop-overread]
>     7 |   return strncmp (a, b, 8);   // -Wstringop-overread
>       |          ^~~~~~~~~~~~~~~~~
>
> even without -Wall.  strncmp sees that a[3] is '\0' so it stops comparing
> and there's no UB.
>
This one is a clear case where warning is bad.   Both arguments are
constant and we can determine they are NUL terminated and an overread will
never occur.  No deep analysis really needed here.

THe far more interesting case in my mind is when one or both arguments have
an unknown NUL termination state.  I could argue either side of that case.
I lean towards warning but I understand that opinions differ and my
priorities have moved away from distro-level issues, so identifying code
that needs a careful review for correctness, particularly old or security
sensitive code, has become a much lower priority for me.   Combine that
with the fact that we're really just dealing with over-reads here, I can
support whatever the broadest consensus is.

jeff


More information about the Gcc-patches mailing list