Bug 28865 - Structures with a flexible arrray member have wrong .size
Summary: Structures with a flexible arrray member have wrong .size
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.2.0
: P3 normal
Target Milestone: 4.7.4
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
: 39383 56880 59719 59741 (view as bug list)
Depends on:
Blocks: 57180
  Show dependency treegraph
 
Reported: 2006-08-27 21:53 UTC by Marcin 'Qrczak' Kowalczyk
Modified: 2014-01-16 12:19 UTC (History)
10 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 3.0.4, 3.2.3, 3.3.3, 3.4.0, 4.0.0, 4.1.0, 4.2.0, 4.7.1, 4.8.0
Last reconfirmed: 2006-08-28 03:08:51


Attachments
Fix gcc's emission of assembler directives for a variable length structure (1.34 KB, patch)
2014-01-10 16:53 UTC, Nick Clifton
Details | Diff
Revised patch, with testcases (2.00 KB, patch)
2014-01-13 17:47 UTC, Nick Clifton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Marcin 'Qrczak' Kowalczyk 2006-08-27 21:53:24 UTC
For example this C:
struct {int x; int y[];} obj = {1, {2, 3}};

produces this assembly:

.globl obj
        .data
        .align 4
        .type   obj, @object
        .size   obj, 4
obj:
        .long   1
        .long   2
        .long   3

.size should include the flexible array component.
Comment 1 Marcin 'Qrczak' Kowalczyk 2006-08-27 22:15:01 UTC
A question: when this is fixed, what should be C-level sizeof obj?

I hope it would include the flexible array component. This would allow to detect in autoconf whether this bug is fixed, and would be consistent with toplevel arrays whose sizes are derived from their initializer.
Comment 2 Andrew Pinski 2006-08-28 03:08:51 UTC
Confirmed, not a regression.
Comment 3 Lauro Ramos Venancio 2007-12-03 23:42:43 UTC
I think I'm facing a related problem. GCC is emitting unaligned array elements. 

struct ccstm {
	int32_t val;
	const char descr[];
};

static const struct ccstm canon_d30custom[] = {
  { 1, "Long exposure noise reduction" },
  { 2, "Shutter/AE lock buttons" },
  { 3, "Mirror lockup" }
};

produces this assembly for arm-linux-gnueabi:

.align 2
canon_d30custom:
.word 1
.ascii "Long exposure noise reduction\000"
.word 2
.ascii "Shutter/AE lock buttons\000"
.word 3
.ascii "Mirror lockup\000"

Note that the second and the third array element are unaligned.
Comment 4 Andrew Pinski 2013-04-08 21:57:03 UTC
*** Bug 56880 has been marked as a duplicate of this bug. ***
Comment 5 Andrew Pinski 2013-04-08 21:57:28 UTC
*** Bug 39383 has been marked as a duplicate of this bug. ***
Comment 6 H.J. Lu 2013-04-08 22:39:02 UTC
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2009-04/msg01807.html
Comment 7 Fredrick 2013-04-09 18:10:34 UTC
HJ,

Thanks for pointing the patch.

The patch works. I tested it on x86-64. 
Could this patch be integrated into the mainline GCC?

Thanks,
Fredrick
Comment 8 Andreas Schwab 2013-04-20 09:54:32 UTC
This is causing wrong code for libnss_files.so (_nss_files_init from nss/files-init.c, only called by nscd) on powerpc due to the use of -fsection-anchors, since the anchor offsets are computed from the wrong size information.
Comment 9 Alan Modra 2013-05-04 14:34:39 UTC
From what I see on current mainline for a testcase based on glibc/nss/nss_files/files-init.c the var_decl size and the type size agree and are correct.  What causes a problem with -fsection-anchors is that the actual data emitted by output_constant() is wrong.

struct traced_file
{
  long pad;
  char fname[];
};

#define TF(id, filename, ...)					\
union								\
{								\
  struct traced_file file;					\
  char buf[sizeof (struct traced_file) + sizeof (filename)];	\
} id##_traced_file =						\
  {								\
    .file =							\
    {								\
      .fname = filename						\
    }								\
  }

TF (pwd, "/etc/passwd");
TF (grp, "/etc/group");

Gives me

	.file	"unioninit.c"
	.globl grp_traced_file
	.globl pwd_traced_file
	.section	".data"
	.align 2
.LANCB0:
	.org .LANCB0+0
	.type	grp_traced_file, @object
	.size	grp_traced_file, 16
grp_traced_file:
	.zero	4
	.string	"/etc/group"
	.zero	12
	.org .LANCB0+16
	.type	pwd_traced_file, @object
	.size	pwd_traced_file, 16
pwd_traced_file:
	.zero	4
	.string	"/etc/passwd"
	.zero	12
	.ident	"GCC: (GNU) 4.9.0 20130502 (experimental)"
	.section	.note.GNU-stack,"",@progbits

Those ".zero 12" lines shouldn't be there.
Comment 10 Alan Modra 2013-05-04 14:39:34 UTC
Incidentall, I expect the patch referred to in comment #6 will ICE with tree-checking on due to CONSTRUCTOR nodes not having the required fields for DECL_SIZE_UNIT.
Comment 11 Jakub Jelinek 2014-01-09 19:34:33 UTC
*** Bug 59719 has been marked as a duplicate of this bug. ***
Comment 12 Jakub Jelinek 2014-01-09 19:34:47 UTC
*** Bug 59741 has been marked as a duplicate of this bug. ***
Comment 13 Jakub Jelinek 2014-01-09 19:35:59 UTC
Runtime testcase that presumably fails on -fsection-anchors targets:
struct A { int a; char b[]; };
union B { struct A a; char b[sizeof (struct A) + 31]; };
union B b = { { 1, "123456789012345678901234567890" } };
union B c = { { 2, "123456789012345678901234567890" } };

__attribute__((noinline, noclone)) void
foo (int *x[2])
{
  x[0] = &b.a.a;
  x[1] = &c.a.a;
}

int
main ()
{
  int *x[2];
  foo (x);
  if (*x[0] != 1 || *x[1] != 2)
    __builtin_abort ();
  return 0;
}
Comment 14 Peter Bergner 2014-01-09 19:42:43 UTC
(In reply to Jakub Jelinek from comment #13)
> Runtime testcase that presumably fails on -fsection-anchors targets:
> struct A { int a; char b[]; };
> union B { struct A a; char b[sizeof (struct A) + 31]; };
> union B b = { { 1, "123456789012345678901234567890" } };
> union B c = { { 2, "123456789012345678901234567890" } };
...

I hit the abort on powerpc64-linux with a mainline from December (only mainline build handy at the moment) and we are a -fsection-anchors target.
Comment 15 Nick Clifton 2014-01-10 16:53:53 UTC
Created attachment 31802 [details]
Fix gcc's emission of assembler directives for a variable length structure

Hi Guys,

  I think that the problem is the output_constant function in varasm.c which is given a target size, but which can emit more bytes than expected if the expression is a variable length structure.

  This patch updates output_constant so that it returns the number of bytes that it really did emit, and it updates a couple of places where output_constant is used so that this corrected size is taken into account.

  Tested without regressions on an i686-pc-linux-gnu toolchain and an aarch64-elf toolchain.  Does anyone have any objections to my submitting the patch to gcc-patches ?

Cheers
  Nick
Comment 16 Alan Modra 2014-01-10 22:51:52 UTC
Hi Nick, that patch looks exactly like my first attempt to fix this bug.  I forget the details now but I'm fairly certain it wasn't a complete fix, which is why I didn't submit my patch..

Note that the underlying bug here doesn't just cause a wrong .size to be emitted, but wrongly sized initialization data.  On -fsection-anchors targets that results in seriously wrong code.  Variables following a too-large initializer are located at the wrong address..
See http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00246.html for a patch that makes the assembler complain loudly if gcc emits wrong code.
Comment 17 Alan Modra 2014-01-10 23:13:13 UTC
I believe Eric's comment http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00179.html points at the correct fix, but it's a bit messy.  You need to recursively descend both "decl" and "init" in code like c-decl.c:finish_decl, updating "decl" component sizes for any variable length arrays you find, rather than handling just a top level variable length array as we do currently.  Then repeat the exercise for other language versions of finish_decl..
Comment 18 Jakub Jelinek 2014-01-11 08:14:38 UTC
I disagree.  DECL_SIZE and .size are IMHO correct here at this point, and component sizes (FIELD_DECL) can't match, because we shouldn't be changing the type of the decl to some other type just because it got an initializer.  The bug is IMHO really just in miscounting the size of the padding of the initializer and thus emitting into the data/rodata etc. section more bytes of data from what .size/arrays/-fsection-anchors assumes.  Outside of arrays on non-fsection-anchors target this is just ugly (except where you e.g. put the structs into named sections and use it as a kind of array say through beginning/end of the named section), for -fsection-anchors it is of course a wrong-code (well, wrong-data) bug.
Comment 19 Alan Modra 2014-01-13 11:43:41 UTC
Jakub, you're correct.  I should have read the standard on flexible array members before poking at this bug last year.  Nick's patch is looking good to me.

ISO/IEC 9899:1999 is quite clear that the answer to comment #2 is that sizeof does *not* include the flexible array component (but it does include any padding needed to align the start of the flexible array).  I think it is reasonable to do the same for .size and not include the flexible array component there too.

The testcase (which is invalid code IMO) in comment #3 hits this assert in varasm.c:

  /* Advance to offset of this element.
     Note no alignment needed in an array, since that is guaranteed
     if each element has the proper size.  */
  if ((local->field != NULL_TREE || local->index != NULL_TREE)
      && fieldpos != local->total_bytes)
    {
      gcc_assert (fieldpos >= local->total_bytes);
Comment 20 Jakub Jelinek 2014-01-13 11:52:40 UTC
(In reply to Alan Modra from comment #19)
> Jakub, you're correct.  I should have read the standard on flexible array
> members before poking at this bug last year.  Nick's patch is looking good
> to me.
> 
> ISO/IEC 9899:1999 is quite clear that the answer to comment #2 is that
> sizeof does *not* include the flexible array component (but it does include
> any padding needed to align the start of the flexible array).  I think it is
> reasonable to do the same for .size and not include the flexible array
> component there too.

Well, sizeof is clear, but .size doesn't need to be the same thing, IMHO .size really should be DECL_SIZE of the decl and we do adjust that for the flexible array member in there.  If .size would be just e.g. TYPE_SIZE, then there could be e.g. problems with copy relocations.
> 
> The testcase (which is invalid code IMO) in comment #3 hits this assert in
> varasm.c:

The #c3 is invalid, I agree, how can you have an array if each array entry has a different size?  I mean, the C standard allows flexible array members only at the toplevel and not initialized, as GNU extension we allow it to be initialized and allow it even in some cases where it is not toplevel, but where it is still reasonable (e.g. the case where it is at the end of struct which is inside a union is reasonable, so would be placing it at the end of another structure, but placing it in an array is IMHO something that should be rejected by the FE).
CCing Joseph on it.
Comment 21 joseph@codesourcery.com 2014-01-13 16:03:18 UTC
The reason for laxity about flexible array member constraints is existing 
code violating them, as in the glibc header code quoted in 
<http://gcc.gnu.org/ml/gcc-patches/2002-08/msg01149.html>.  I don't know 
whether there is existing such code for the array case or not.
Comment 22 Jakub Jelinek 2014-01-13 16:46:54 UTC
But the glibc headers case you're mentioning wasn't initializing the flexible array members, right?  (Or even initialization with {} initializer is fine I guess).  I mean, while C doesn't allow it, if you don't initialize the flexible array member followed by something else, it should still work fine as if it was a zero-sized array.  But even in the struct A { struct B { int a; char b[]; }; int c; }; case, I'd say we should error when trying to initialize b to something non-empty, because we shouldn't be changing the types (thus offsets in the type fields, type sizes etc.) based on the initializer, only DECL_SIZE can.
So IMHO we should accept:
struct A { int a; char b[]; };
struct B { struct A a; int c; } b;
struct A p[24];
struct B c = { { 5, {} }, 6 };
struct A q[2] = { { 5, {} }, { 6, {} } };
but reject:
struct B d = { { 1, { 2 } }, 3 };
struct B e = { { 2, "abc" }, 4 };
struct A r[2] = { { 5, "a" }, { 6, "b" } };
Comment 23 Nick Clifton 2014-01-13 17:47:09 UTC
Created attachment 31823 [details]
Revised patch, with testcases

Hi Guys,

I have uploaded a revised patch with two changes:

  1. I have removed the assertion from
     output_constructor_regular_field() and changed the code so that
     extra zeros are emitted only if fieldpos is more than the current
     byte total.  I made this change because I feel that we should avoid
     ICEs even on invalid code.

  2. I have added two tests to the testsuite, based on comments 3 and
     13.

  Still no regressions with an i686-pc-linux-gnu toolchain and an
  aarch64-elf toolchain.
  
  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2014-01-13  Nick Clifton  <nickc@redhat.com>

	PR middle-end/28865
	* varasm.c (output_consant): Return the number of bytes actually
	emitted.
	(output_constructor_array_range): Update the field size with the
	number of bytes emitted by output_constant.
	(output_constructor_regular_field): Likewise.  Also do not
	complain if the total number of bytes emitted is now greater
	than the expected fieldpos.
	* output.h (output_constant): Update prototype and descriptive
	comment.

gcc/testsuite/ChangeLog
2014-01-13  Nick Clifton  <nickc@redhat.com>

	PR middle-end/28865
	* gcc.c-torture/compile/pr28865.c: New.
	* gcc.c-torture/execute/pr28865.c: New.
Comment 24 Alan Modra 2014-01-14 11:58:00 UTC
Nick's latest patch passes bootstrap and regression testing powerpc64-linux.
Comment 25 joseph@codesourcery.com 2014-01-14 19:51:21 UTC
On Mon, 13 Jan 2014, jakub at gcc dot gnu.org wrote:

> But the glibc headers case you're mentioning wasn't initializing the flexible
> array members, right?  (Or even initialization with {} initializer is fine I

I don't have details of exactly what uses of flexible array members they 
were making beyond those permitted by ISO C.  I guess the general point is 
to be careful about disallowing such uses because it might break existing 
code (so allowing plenty of time before a release for distribution 
rebuilds to catch any problems with such a change, for example).
Comment 26 Nick Clifton 2014-01-16 12:18:20 UTC
Author: nickc
Date: Thu Jan 16 12:17:48 2014
New Revision: 206661

URL: http://gcc.gnu.org/viewcvs?rev=206661&root=gcc&view=rev
Log:
PR middle-end/28865

	* varasm.c (output_constant): Return the number of bytes actually
	emitted.
	(output_constructor_array_range): Update the field size with the
	number of bytes emitted by output_constant.
	(output_constructor_regular_field): Likewise.  Also do not
 	complain if the total number of bytes emitted is now greater
	than the expected fieldpos.
	* output.h (output_constant): Update prototype and descriptive
	comment.

	* gcc.c-torture/compile/pr28865.c: New.
	* gcc.c-torture/execute/pr28865.c: New.

Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr28865.c
    trunk/gcc/testsuite/gcc.c-torture/execute/pr28865.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/output.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/varasm.c
Comment 27 Nick Clifton 2014-01-16 12:19:39 UTC
The patch has now been approved and checked in:

http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00812.html