Bug 108036 - [12/13/14/15 Regression] Spurious warning for zero-sized array parameters to a function
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 12.2.0
Importance: P2 normal
Target Milestone: 12.5
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks: Wstringop-overflow
Reported: 2022-12-09 17:24 UTC by Alejandro Colomar
Modified: 2024-07-19 13:18 UTC

See Also:
Host:
Target:
Build:
Known to work: 10.1.0
Known to fail: 11.1.0
Last reconfirmed: 2022-12-09 00:00:00


Attachments
Full testcase that actually compiles (489 bytes, text/plain)
2022-12-09 19:09 UTC, Andrew Pinski
Reduced testcase (99 bytes, text/plain)
2022-12-09 19:13 UTC, Andrew Pinski
signature.asc (656 bytes, application/pgp-signature)
2024-03-07 10:53 UTC, Alejandro Colomar

Description Alejandro Colomar 2022-12-09 17:24:17 UTC
It's interesting to pass pointers to one past the end of an array to a function, acting as a sentinel value that serves as an alternative to the size of the buffer.  It helps chaining string copy functions, for example:


char *
ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
{
        bool       trunc;
        char       *end;
        ptrdiff_t  len;

        if (dst == past_end)
                return past_end;

        trunc = false;
        len = strnlen(src, n);
        if (len > past_end - dst - 1) {
                len = past_end - dst - 1;
                trunc = true;
        }

        end = mempcpy(dst, src, len);
        *end = '\0';

        return trunc ? past_end : end;
}


However, if you use array syntax for it, which clarifies where it points to, GCC complains, not at the function implementation, but at the call site:


#define nitems(arr)  (sizeof((arr)) / sizeof((arr)[0]))

int
main(void)
{
        char pre[4] = "pre.";
        char *post = ".post";
        char *src = "some-long-body.post";
        char dest[100];
        char *p, *past_end;

        past_end = dest + nitems(dest);
        p = dest;
        p = ustr2stpe(p, pre, nitems(pre), past_end);
        p = ustr2stpe(p, src, strlen(src) - strlen(post), past_end);
        p = ustr2stpe(p, "", 0, past_end);
        if (p == past_end)
                fprintf(stderr, "truncation\n");

        puts(dest);  // "pre.some-long-body"
}

$ cc -Wall -Wextra ustr2stpe.c
ustr2stpe.c: In function ‘main’:
ustr2stpe.c:43:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    43 |         p = ustr2stpe(p, pre, nitems(pre), past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:43:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~
ustr2stpe.c:44:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    44 |         p = ustr2stpe(p, src, strlen(src) - strlen(post), past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:44:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~
ustr2stpe.c:45:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    45 |         p = ustr2stpe(p, "", 0, past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:45:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~
ustr2stpe.c:43:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    43 |         p = ustr2stpe(p, pre, nitems(pre), past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:43:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~
ustr2stpe.c:44:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    44 |         p = ustr2stpe(p, src, strlen(src) - strlen(post), past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:44:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~
ustr2stpe.c:45:13: warning: ‘ustr2stpe’ accessing 1 byte in a region of size 0
[-Wstringop-overflow=]
    45 |         p = ustr2stpe(p, "", 0, past_end);
       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ustr2stpe.c:45:13: note: referencing argument 4 of type ‘char[0]’
ustr2stpe.c:10:1: note: in a call to function ‘ustr2stpe’
    10 | ustr2stpe(char *dst, const char *restrict src, size_t n, char past_end[0])
       | ^~~~~~~~~


The warnings are invalid.  While it's true that I'm referencing a pointer of
size 0, it's false that I'm "accessing 1 byte" in that region.  I guess this is
all about the bogus design of 'static' in ISO C, where you can have an array
parameter of size 0, which is very useful in cases like this one.
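
For comparison, here is a minimal sketch of the same sentinel idea with the parameter spelled as a plain pointer. The helper name append_until() is hypothetical and not part of the report. The assumption, not verified against every affected GCC release, is that the diagnostic is driven by the array bound written in the parameter declaration, so pointer syntax should not trigger it, at the cost of losing the self-documenting [0].

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the report): the same sentinel idea as
 * ustr2stpe(), but with past_end declared using pointer syntax. */
char *
append_until(char *dst, const char *src, char *past_end)
{
        size_t len, room;

        assert(dst <= past_end);
        if (dst == past_end)
                return past_end;               /* already truncated earlier */

        len = strlen(src);
        room = (size_t)(past_end - dst) - 1;   /* reserve one byte for the NUL */
        if (len > room) {
                memcpy(dst, src, room);
                dst[room] = '\0';
                return past_end;               /* report truncation */
        }
        memcpy(dst, src, len + 1);             /* copies the NUL too */
        return dst + len;
}
```

Calls chain the same way as in main() above: pass the returned pointer as the next dst, and compare the final result against past_end to detect truncation.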


See the original report in the mailing list, where Richard Biener had some guess of what might be the reason:
<https://gcc.gnu.org/pipermail/gcc/2022-December/240230.html>
Comment 1 Andrew Pinski 2022-12-09 19:09:21 UTC
Created attachment 54055 [details]
Full testcase that actually compiles
Comment 2 Andrew Pinski 2022-12-09 19:13:23 UTC
Created attachment 54056 [details]
Reduced testcase
Comment 3 Alejandro Colomar 2022-12-09 19:18:03 UTC
Hi Andrew!

Just a few nitpicks:

-  In the first testcase you posted, the [] is missing the 0: [0].

-  In the reduced test case, you call the pointer to one past the end as 'end'.  That is misleading, since 'end' is commonly also used for pointers to the last byte in an array, normally the NUL byte in strings.  Using the term 'end' meaning one-past-the-end is likely to end up in off-by-one errors.  So much that I found a few of them for exactly that reason this week :)

This last point is why I like using array syntax, so I can clearly specify 'end[1]' and 'past_end[0]', and they are clearly different things.

Cheers,

Alex
Comment 4 Andrew Pinski 2022-12-09 19:22:41 UTC
(In reply to Alejandro Colomar from comment #3)
> -  In the reduced test case, you call the pointer to one past the end as
> 'end'.  That is misleading, since 'end' is commonly also used for pointers
> to the last byte in an array, normally the NUL byte in strings. 

In the C++ standard, the function end() returns one past the last element of an array. So I am not misusing the name end here. Just using it differently from you.
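
The two naming conventions discussed in comments 3 and 4 differ by exactly one element, which is where the off-by-one risk comes from. A small sketch in C, using hypothetical helper names that do not appear in either testcase:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers contrasting two ways of bounding a buffer. */

/* 'end' points AT the last element, so it may be dereferenced. */
size_t
count_to_end(const char *p, const char *end)
{
        assert(p <= end);
        return (size_t)(end - p) + 1;
}

/* 'past_end' points one PAST the last element (the convention of the
 * C++ end() function), so it must not be dereferenced. */
size_t
count_to_past_end(const char *p, const char *past_end)
{
        assert(p <= past_end);
        return (size_t)(past_end - p);
}
```

For a 10-element array a, both count_to_end(a, a + 9) and count_to_past_end(a, a + 10) give 10; mixing up the two pointers silently changes the result by one.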
Comment 5 Alejandro Colomar 2022-12-09 19:23:12 UTC
Interesting.  Thanks for clarifying :)
Comment 6 Richard Biener 2022-12-21 11:45:19 UTC
I read Martin's response on the mailing list as if special-casing T[0] would be OK and that this is simply missed right now.
Comment 7 Jakub Jelinek 2023-05-29 10:07:47 UTC
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
Comment 8 Daniel Lundin 2024-03-07 10:18:12 UTC
I don't believe char past_end[0] is valid C, because it is an invalid array declaration. Unlike [] or [*] that declares an array of incomplete type. 

Since it is a function parameter, it will of course later get adjusted to a pointer to the first element. But it still has to be a valid declaration to begin with. Similarly, char arr[][] is invalid because it remains an incomplete type after adjustment (see C17 6.7.6.4 §4).

gcc does allow [0] as an extension since that was commonly used for purposes of implementing the "struct hack" back in the days before flexible array members were standardized.

The conclusion ought to be that gcc should let [0] through if compiled in -std=gnu23 mode but not in -std=c23 and/or -pedantic.
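
For readers unfamiliar with the "struct hack" mentioned above, here is a minimal sketch of its standardized replacement, the C99 flexible array member. The type struct msg and the function msg_new() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration; not from the report. */
struct msg {
        size_t len;
        char   data[];   /* C99 flexible array member; the older "struct hack"
                            wrote data[0] here, which GCC accepts as an
                            extension */
};

/* Allocates the header and the string payload in one block.
 * Returns NULL if allocation fails; the caller frees the result. */
struct msg *
msg_new(const char *s)
{
        size_t len;
        struct msg *m;

        assert(s != NULL);
        len = strlen(s);
        m = malloc(sizeof(*m) + len + 1);
        if (m == NULL)
                return NULL;
        m->len = len;
        memcpy(m->data, s, len + 1);   /* copies the NUL too */
        return m;
}
```

This only illustrates where the [0] extension came from; it is a struct member, whereas the parameter in this bug is adjusted to a pointer.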
Comment 9 Alejandro Colomar 2024-03-07 10:53:46 UTC
Created attachment 57644 [details]
signature.asc

Hi Lundin!

On Thu, Mar 07, 2024 at 10:18:12AM +0000, daniel.lundin.mail at gmail dot com wrote:
> --- Comment #8 from Daniel Lundin <daniel.lundin.mail at gmail dot com> ---
> I don't believe char past_end[0] is valid C, because it is an invalid array
> declaration. Unlike [] or [*] that declares an array of incomplete type. 
> 
> Since it is a function parameter, it will of course later get adjusted to a
> pointer to the first element. But it still has to be a valid declaration to
> begin with. Similarly, char arr[][] is invalid because it remains an incomplete
> type after adjustment (see C17 6.7.6.4 §4).

Agree; ISO C seems to not allow this with their wording.  (I wish it did,
because it's just a matter of wording, not that they don't allow passing
a pointer to past the end).  But maybe the wording needed for allowing
this would have other undesirable consequences, so I'm happy as long as
GNU C would support this.

> gcc does allow [0] as an extension since that was commonly used for purposes of
> implementing the "struct hack" back in the days before flexible array members
> were standardized.
> 
> The conclusion ought to be that gcc should let [0] through if compiled in
> -std=gnu23 mode but not in -std=c23 and/or -pedantic.

And agree; if support for this is added, pedantic or ISO C modes should
complain about it.

Have a lovely day!
Alex
Comment 10 Richard Biener 2024-07-19 13:18:46 UTC
GCC 11 branch is being closed.