The test-case is reduced from acl: $ cat x.c #include <stdio.h> #include <stdlib.h> #include <string.h> struct __string_ext { char s_str[0]; }; struct string_obj_tag { struct __string_ext i; }; typedef struct string_obj_tag string_obj; static void writeto(char *text_p, ssize_t size) { fprintf (stderr, "Write to: %p, size=%d\n", text_p, size); strncpy(text_p, "sparta", size); } int main() { ssize_t size = 30; string_obj *string_obj_p = (string_obj *)malloc (sizeof(string_obj) + size); fprintf (stderr, "allocated: %d B starting at %p\n", size, string_obj_p->i.s_str); writeto(string_obj_p->i.s_str, size); fprintf (stderr, "result STR(%p)=%s\n", string_obj_p->i.s_str, string_obj_p->i.s_str); return 0; } $ gcc x.c -D_FORTIFY_SOURCE=2 -O2 && ./a.out In file included from /usr/include/string.h:535, from x.c:3: In function ‘strncpy’, inlined from ‘writeto’ at x.c:19:3, inlined from ‘main’ at x.c:28:3: /usr/include/bits/string_fortified.h:95:10: warning: ‘__builtin___strncpy_chk’ writing 30 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=] 95 | return __builtin___strncpy_chk (__dest, __src, __len, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 96 | __glibc_objsize (__dest)); | ~~~~~~~~~~~~~~~~~~~~~~~~~ allocated: 30 B starting at 0x4052a0 Write to: 0x4052a0, size=30 *** buffer overflow detected ***: terminated Aborted (core dumped) While clang is fine: $ clang x.c -D_FORTIFY_SOURCE=2 -O2 && ./a.out allocated: 30 B starting at 0x4052a0 Write to: 0x4052a0, size=30 result STR(0x4052a0)=sparta and ASAN,UBSAN as well: $ gcc-11 x.c -fsanitize=address,undefined && ./a.out allocated: 30 B starting at 0x603000000040 Write to: 0x603000000040, size=30 result STR(0x603000000040)=sparta I see the error happens also with older GCC compilers.
Likely related to PR101836?
Note the test-case is reduced from acl package (with -D_FORTIFY_SOURCE=3) that used to work with -D_FORTIFY_SOURCE=2. So maybe my reduction was too aggressive or should the current master support trailing arrays?
It's likely the same what was mentioned here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92815#c3
All right, so this one is likely an invalid code: cat x.c #include <stdio.h> #include <stdlib.h> #include <string.h> struct bad_struct { struct { char s_str[1]; } i; }; struct good_struct { char s_str[1]; }; ssize_t size = 30; struct bad_struct *bad; struct good_struct *good; int main() { good = (struct good_struct *)malloc (sizeof(struct good_struct) + size); char *str = good->s_str; strcpy (str, "sparta"); bad = (struct bad_struct *)malloc (sizeof(struct bad_struct) + size); char *str2 = bad->i.s_str; strcpy (str2, "sparta"); return 0; } It shows the difference in between wrapped struct and not wrapped one.
I'm not 100% sure if it's invalid code, but I was just about to write that it depends on what the pass ends up seeing. If earlier passes end up optimizing the code such that the objsz pass sees the malloc first (e.g. the reproducer in pr104961), it ends up with the malloc'd size, otherwise it ends up with the declared size. So if it was: struct bad_struct { struct g { char s_str[1]; } i; }; and struct g *i = &bad->i; strcpy (i->s_str, "sparta"); then i tends to get optimized as a MEM_REF of the malloc'd block, letting us see the extra space. This needs to be fixed, but then it's possibly a different bug from the one you're seeing in acl since this affects __bos too, not just __bdos. (I'm off in a couple of hours btw, returning on Tuesday so I may not get to it until then)
I've got something similar that can be seen in libqt5core library: QByteArray qt_readlink(const char *path) { #ifndef PATH_MAX // suitably large value that won't consume too much memory # define PATH_MAX 1024*1024 #endif QByteArray buf(256, Qt::Uninitialized); ssize_t len = ::readlink(path, buf.data(), buf.size()); ... where they use something like: struct QArrayData { ... void *data() { return reinterpret_cast<char *>(this) + offset; } Can't easily reproduce a small test can thought.
Created attachment 52679 [details] libacl/__acl_to_any_text.c with FS == 2
Created attachment 52680 [details] libacl/__acl_to_any_text.c with FS == 3
You should see the difference in between -D_FORTIFY_SOURCE=2 and -D_FORTIFY_SOURCE=3 in the attached pre-processed source files. $ gcc fs2.i -c -O2 -Werror $ gcc fs3.i -c -O2 -Werror In file included from /usr/include/string.h:535, from libacl/__acl_to_any_text.c:25: In function ‘strcpy’, inlined from ‘__acl_to_any_text’ at libacl/__acl_to_any_text.c:90:3: /usr/include/bits/string_fortified.h:79:10: error: ‘__builtin___strcpy_chk’ writing 1 or more bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] 79 | return __builtin___strcpy_chk (__dest, __src, __glibc_objsize (__dest)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Note the warning exactly corresponds to the call that aborts during run-time.
OK, I have a representative reproducer, which TBH is not too different from the one you posted, just that it succeeds with __builtin_object_size and fails with __builtin_dynamic_object_size: struct __string_ext { char s_str[0]; }; typedef struct { int o_prefix; struct __string_ext i; } string_obj; #define SUFFIX ".suffix" string_obj * __acl_to_any_text (unsigned long n) { unsigned long off = 0; unsigned long size = sizeof SUFFIX; string_obj *obj = __builtin_malloc (sizeof (string_obj) + size); if (n == 0) __builtin_unreachable (); while (n-- != 0) { if (off + 1 > size - sizeof SUFFIX) { size <<= 1; string_obj *tmp = __builtin_realloc (obj, sizeof (string_obj) + size); if (!tmp) __builtin_unreachable (); obj = tmp; } obj->i.s_str[off++] = 'A'; } char *t = obj->i.s_str + off; __strcpy_chk (t, SUFFIX, __builtin_dynamic_object_size (t, 1)); return obj; } int main () { string_obj *s = __acl_to_any_text (32); __builtin_printf ("%zu: %s\n", __builtin_strlen (s->i.s_str), s->i.s_str); return 0; } $ gcc/cc1 -g -o test.s -quiet -Wall -O3 fs3.c fs3.c: In function ‘__acl_to_any_text’: fs3.c:40:3: warning: ‘__builtin___memcpy_chk’ writing 8 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=] 40 | __strcpy_chk (t, SUFFIX, __builtin_dynamic_object_size (t, 1)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The only reason why __builtin_object_size fails is because of the non-constant OFF. If that is removed, __builtin_object_size also returns the declared size of s_str, i.e. 0. The check for a traditionally declared trailing array ()i.e. a[0] or a[1]) seems to be broken for nested structs like the above. Change that to s_str[] (the struct then needs another member above) and it works fine.
(In reply to Siddhesh Poyarekar from comment #10) > OK, I have a representative reproducer, which TBH is not too different from > the one you posted, just that it succeeds with __builtin_object_size and > fails with __builtin_dynamic_object_size: *crashes* with __builtin_dynamic_object_size and doesn't with __builtin_object_size. Sorry, I realized I used "fails" and "succeeds" in two opposite and confusing contexts there ;)
Why is this P1? Is it a regression? If it works with __bos and doesn't work with __bdos, then it isn't a regression and if it doesn't work with __bos when using constant offset, it is a bug in acl, it might be compatible just with -D_FORTIFY_SOURCE=1, but not with =2 or =3.
It's not really a regression AFAICT, it's only more visible with __bdos because non-constant offsets don't stop it. Also the problem is only with subobjects (hence limited to _FORTIFY_SOURCE > 1 for strcpy) where the block in addr_object_size that is supposed to deal with flex arrays at the end doesn't quite do its job with nested structs. The same reproducer tweaked a bit will crash even for __builtin_object_size: struct __string_ext { char s_str[0]; }; typedef struct { int o_prefix; struct __string_ext i; } string_obj; #define SUFFIX ".suffix" string_obj * __acl_to_any_text (unsigned long n) { unsigned long off = 0; unsigned long size = sizeof SUFFIX; string_obj *obj = __builtin_malloc (sizeof (string_obj) + size); if (n == 0) __builtin_unreachable (); while (n-- != 0) { if (off + 1 > size - sizeof SUFFIX) { size <<= 1; string_obj *tmp = __builtin_realloc (obj, sizeof (string_obj) + size); if (!tmp) __builtin_unreachable (); obj = tmp; } obj->i.s_str[off++] = 'A'; } char *t = obj->i.s_str; __strcpy_chk (t, SUFFIX, __builtin_object_size (t, 1)); return obj; } int main () { string_obj *s = __acl_to_any_text (32); __builtin_printf ("%zu: %s\n", __builtin_strlen (s->i.s_str), s->i.s_str); return 0; } $ gcc/cc1 -g -o test.s -quiet -Wall -O3 fs3.c fs3.c: In function ‘__acl_to_any_text’: fs3.c:40:3: warning: ‘__builtin___memcpy_chk’ writing 8 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=] 40 | __strcpy_chk (t, SUFFIX, __builtin_object_size (t, 1)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thus I'd say fix up acl instead.
(In reply to Jakub Jelinek from comment #14) > Thus I'd say fix up acl instead. OK, closing this as WONTFIX then. __bos/__bdos has limited support for zero sized arrays; they are not recognized as flex arrays when in nested structs. Fixing up the struct to one with a proper flex array (i.e. without a dimension size, which also will need another member preceding it) should make this work correctly. Something like: struct __string_ext { char pad; char s_str[]; };
I think I might have hit the same thing in qt_readlink: https://bugs.gentoo.org/847145. Martin, did you chase down the Qt issue you had?
(In reply to Sam James from comment #16) > I think I might have hit the same thing in qt_readlink: > https://bugs.gentoo.org/847145. Martin, did you chase down the Qt issue you > had? Yes, for Qt5, one needs to following patch: https://build.opensuse.org/package/view_file/KDE:Qt:5.15/libqt5-qtbase/mitigate-FORTIFY_SOURCE-3.patch?expand=1
Thanks. I reported the Qt issue upstream at https://bugreports.qt.io/browse/QTBUG-103782. I've hit the ACL issue independently in Gentoo and will forward that upstream too (https://bugs.gentoo.org/847280).
(In reply to Sam James from comment #18) > Thanks. I reported the Qt issue upstream at > https://bugreports.qt.io/browse/QTBUG-103782. Note the Qt6 code is fine (uses reinterpret_cast<uintptr_t>(this)) and it's what we backported to Qt5. So the current Qt code seems fine.