| Summary: | incorrect memory access in optimization with flexible array member | ||
|---|---|---|---|
| Product: | gcc | Reporter: | Pádraig Brady <P> |
| Component: | middle-end | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | UNCONFIRMED --- | ||
| Severity: | normal | CC: | egallager, fw, jsm-csl, pinskia, rguenth, sjames |
| Priority: | P3 | ||
| Version: | 5.1.1 | ||
| Target Milestone: | --- | ||
| See Also: | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109585 | ||
| Host: | Target: | ||
| Build: | Known to work: | ||
| Known to fail: | Last reconfirmed: | ||
| Bug Depends on: | |||
| Bug Blocks: | 69698 | ||
| Attachments: |
summary code (does not reproduce issue)
disassembly of problematic mem access disassembly of forced good mem access reproducer |
||
|
Description
Pádraig Brady
2015-06-25 01:25:07 UTC
Created attachment 35849 [details]
summary code (does not reproduce issue)
Created attachment 35850 [details]
disassembly of problematic mem access
Created attachment 35851 [details]
disassembly of forced good mem access
I should note that I worked around the issue by increasing the allocation for the structure on the heap up to a multiple of alignof(the_struct). See: http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=49078a78 (In reply to Pádraig Brady from comment #4) > I should note that I worked around the issue by increasing the allocation > for the structure on the heap up to a multiple of alignof(the_struct). See: > http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=49078a78 That is not just a workaround but a correct patch. The size of the memory allocated needs to be a multiply of the alignment of the struct to be valid. The reason why it works for you with your reduced testcase is because the alignment of the struct is 1. More to the point this might actually be a -fsanitize=address bug so reopening for that issue. By the way we know the alignment of the struct is 4 which means loading a 4 byte would be valid. Created attachment 35852 [details]
reproducer
Yes, consider
struct X { int n; char x[1]; };
which has sizeof(X) == 8 unless you use __attribute__((packed)) (in which case
alignment also gets dropped down to 1).
I'm not understanding completely TBH. Are flexible array members not special? Should the optimizations be restricted on access through the flexible array, because I presume most/all existing allocation code is not considering this alignment/padding issue. I certainly didn't notice any examples when looking into a workaround which I came up with independently. For my reference there are some notes RE GCC's divergence from C99 re padding and flexi arrays at: https://sites.google.com/site/embeddedmonologue/home/c-programming/data-alig On Thu, 25 Jun 2015, P at draigBrady dot com wrote: > I'm not understanding completely TBH. Are flexible array members not special? > Should the optimizations be restricted on access through the flexible array, > because I presume most/all existing allocation code is not considering this > alignment/padding issue. I certainly didn't notice any examples when looking > into a workaround which I came up with independently. For my reference there > are some notes RE GCC's divergence from C99 re padding and flexi arrays at: > https://sites.google.com/site/embeddedmonologue/home/c-programming/data-alig That page doesn't exist - I assume you mean: https://sites.google.com/site/embeddedmonologue/home/c-programming/data-alignment-structure-padding-and-flexible-array-member That page is over ten years out of date - it's quoting the wording in C99 as it was before it was corrected by TC2 (published 2004-11-15). You should look at the post-TC2 wording rather than the old wording with various defects in it. I think this is a GCC bug.
Object sizes do not have to be a multiple of the object alignment. Consider this example:
#include <stdalign.h>
#include <stdio.h>
#include <stdlib.h>
_Alignas (128) unsigned v0 = 128;
_Alignas (8) unsigned v1 = 8;
_Alignas (8) unsigned v2 = 8;
_Alignas (8) unsigned v3 = 8;
_Alignas (4) unsigned v4 = 4;
_Alignas (4) unsigned v5 = 4;
_Alignas (8) unsigned v6 = 8;
_Alignas (8) unsigned v7 = 8;
_Alignas (8) unsigned v8 = 8;
_Alignas (4) unsigned v9 = 4;
_Alignas (4) unsigned v10 = 4;
static int
cmp(const void *a, const void *b)
{
unsigned *const *a1 = a;
unsigned *const *b1 = b;
if (*a1 < *b1)
return -1;
if (*a1 > *b1)
return 1;
return 0;
}
int
main()
{
unsigned *p[] = {&v1, &v2, &v3, &v4, &v5, &v6, &v7, &v8, &v9, &v10};
qsort (p, 10, sizeof (p[0]), cmp);
for (int i = 0; i < 9; ++i) {
printf ("%d: alignment %u, size %zd\n",
i, *p[i], (p[i + 1] - p[i]) * sizeof (*p[i]));
}
}
With gcc-5.3.1-6.fc23.x86_64, it prints this for me:
0: alignment 4, size 4
1: alignment 4, size 4
2: alignment 8, size 8
3: alignment 8, size 8
4: alignment 8, size 4
5: alignment 4, size 4
6: alignment 4, size 8
7: alignment 8, size 8
8: alignment 8, size 8
So the assumption that objects are at least as large as their alignment appears to be refuted by code generated by GCC.
Here's a new test case. It prints for me (without Address Sanitizer):
count 2, align 4, minimum size 10, struct size 12, actual size 12
count 5, align 4, minimum size 13, struct size 16, actual size 20
count 0, align 8, minimum size 8, struct size 8, actual size 8
count 3, align 4, minimum size 11, struct size 12, actual size 24
* count 0, align 16, minimum size 8, struct size 16, actual size 8
count 8, align 4, minimum size 16, struct size 16, actual size 16
count 4, align 4, minimum size 12, struct size 12, actual size 12
count 2, align 4, minimum size 10, struct size 12, actual size 12
* count 2, align 8, minimum size 10, struct size 16, actual size 12
count 6, align 4, minimum size 14, struct size 16, actual size 16
count 7, align 4, minimum size 15, struct size 16, actual size 16
count 3, align 4, minimum size 11, struct size 12, actual size 12
count 3, align 4, minimum size 11, struct size 12, actual size 12
count 6, align 4, minimum size 14, struct size 16, actual size 44
* count 5, align 32, minimum size 13, struct size 32, actual size 16
count 8, align 4, minimum size 16, struct size 16, actual size 16
count 9, align 4, minimum size 17, struct size 20, actual size 20
count 7, align 4, minimum size 15, struct size 16, actual size 16
count 1, align 4, minimum size 9, struct size 12, actual size 12
* count 1, align 8, minimum size 9, struct size 16, actual size 12
I believe this shows that GCC has some bug in this area. Whether it's the over-reads (but over-reads can be fine here because they cannot trap, and GCC knows that they won't introduce observable data races), the object allocation in the .data section (again, could be harmless), or the Address Sanitizer report, I'm not sure.
#include <stdalign.h>
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
struct flexible
{
int count;
int align;
char bytes[];
};
#define ARGS_0
#define ARGS_1 1
#define ARGS_2 1, 2
#define ARGS_3 1, 2, 3
#define ARGS_4 1, 2, 3, 4
#define ARGS_5 1, 2, 3, 4, 5
#define ARGS_6 1, 2, 3, 4, 5, 6
#define ARGS_7 1, 2, 3, 4, 5, 6, 7
#define ARGS_8 1, 2, 3, 4, 5, 6, 7, 8
#define ARGS_9 1, 2, 3, 4, 5, 6, 7, 8, 9
#define DECL(name, count, align) \
_Alignas (align) struct flexible name = {count, align,{ ARGS_##count }}
DECL (v4, 4, 4);
DECL (v1, 1, 8);
DECL (v7, 1, 4);
DECL (v17, 7, 4);
DECL (v15, 9, 4);
DECL (v16, 8, 4);
DECL (v11, 5, 32);
DECL (v12, 6, 4);
DECL (v5, 3, 4);
DECL (v9, 3, 4);
DECL (v13, 7, 4);
DECL (v18, 6, 4);
DECL (v2, 2, 8);
DECL (v8, 2, 4);
DECL (v10, 4, 4);
DECL (v14, 8, 4);
DECL (v0, 0, 16);
DECL (v3, 3, 4);
DECL (v5a, 0, 8);
DECL (v19, 5, 4);
DECL (v6, 2, 4);
enum { count = 21 };
static int
cmp(const void *a, const void *b)
{
struct flexible *const *a1 = a;
struct flexible *const *b1 = b;
if (*a1 < *b1)
return -1;
if (*a1 > *b1)
return 1;
return 0;
}
int
main()
{
struct flexible *p[count]
= {&v0, &v1, &v2, &v3, &v4, &v5, &v5a, &v6, &v7, &v8, &v9, &v10,
&v11, &v12, &v13, &v14, &v15, &v16, &v17, &v18, &v19};
qsort (p, count, sizeof (p[0]), cmp);
for (int i = 0; i < count - 1; ++i)
{
size_t min_size = offsetof (struct flexible, bytes) + p[i]->count;
size_t align_mask = p[i]->align - 1;
/* Struct size is the size that a struct with the requested
length of the flexible array member would have. */
size_t struct_size = (min_size + align_mask) & ~align_mask;
/* The actual size is the offset between this and the next
object in the data section. (This can be an over-estimate if
other objects not listed above are placed between the listed
objects.) */
size_t actual_size = (p[i + 1] - p[i]) * sizeof (*p[i]);
/* The lines marked with * have an object whose struct size
exceeds the object size. If GCC assumes that objects always
have their struct size allocated, this leads to an
out-of-bounds acccess. */
printf ("%c count %d, align %d, minimum size %zu, struct size %zu, actual size %zu\n",
struct_size > actual_size ? '*' : ' ',
p[i]->count, p[i]->align, min_size, struct_size, actual_size);
}
}
(In reply to joseph@codesourcery.com from comment #10) > On Thu, 25 Jun 2015, P at draigBrady dot com wrote: > > > I'm not understanding completely TBH. Are flexible array members not special? > > Should the optimizations be restricted on access through the flexible array, > > because I presume most/all existing allocation code is not considering this > > alignment/padding issue. I certainly didn't notice any examples when looking > > into a workaround which I came up with independently. For my reference there > > are some notes RE GCC's divergence from C99 re padding and flexi arrays at: > > https://sites.google.com/site/embeddedmonologue/home/c-programming/data-alig > > That page doesn't exist - I assume you mean: > https://sites.google.com/site/embeddedmonologue/home/c-programming/data- > alignment-structure-padding-and-flexible-array-member > > That page is over ten years out of date - it's quoting the wording in C99 > as it was before it was corrected by TC2 (published 2004-11-15). You > should look at the post-TC2 wording rather than the old wording with > various defects in it. Could you quote the post-TC2 wording here for us so we don't have to go looking? That wording is long including several examples. You can see it in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf subclause 6.7.2.1 (C99 + TC1 + TC2 + TC3). |