Bug 99143 - Bad section alignment on AArch64
Summary: Bad section alignment on AArch64
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 10.2.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Depends on:
Reported: 2021-02-17 23:14 UTC by Will
Modified: 2021-03-01 17:05 UTC

See Also:
Target: aarch64-rtems
Known to work:
Known to fail:
Last reconfirmed: 2021-02-17 00:00:00


Description Will 2021-02-17 23:14:04 UTC
aarch64-rtems6-gcc -v output:
Using built-in specs.
Target: aarch64-rtems6
Configured with: ../gnu-mirror-gcc-949e0ad/configure --prefix=/home/opticron/rtems-development/tools --bindir=/home/opticron/rtems-development/tools/bin --exec_prefix=/home/opticron/rtems-development/tools --includedir=/home/opticron/rtems-development/tools/include --libdir=/home/opticron/rtems-development/tools/lib --libexecdir=/home/opticron/rtems-development/tools/libexec --mandir=/home/opticron/rtems-development/tools/share/man --infodir=/home/opticron/rtems-development/tools/share/info --datadir=/home/opticron/rtems-development/tools/share --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=aarch64-rtems6 --disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --verbose --with-newlib --disable-nls --without-included-gettext --disable-win32-registry --enable-version-specific-runtime-libs --disable-lto --enable-newlib-io-c99-formats --enable-newlib-iconv --enable-newlib-iconv-encodings=big5,cp775,cp850,cp852,cp855,cp866,euc_jp,euc_kr,euc_tw,iso_8859_1,iso_8859_10,iso_8859_11,iso_8859_13,iso_8859_14,iso_8859_15,iso_8859_2,iso_8859_3,iso_8859_4,iso_8859_5,iso_8859_6,iso_8859_7,iso_8859_8,iso_8859_9,iso_ir_111,koi8_r,koi8_ru,koi8_u,koi8_uni,ucs_2,ucs_2_internal,ucs_2be,ucs_2le,ucs_4,ucs_4_internal,ucs_4be,ucs_4le,us_ascii,utf_16,utf_16be,utf_16le,utf_8,win_1250,win_1251,win_1252,win_1253,win_1254,win_1255,win_1256,win_1257,win_1258 --enable-threads --disable-plugin --enable-libgomp --enable-languages=c,c++
Thread model: rtems
Supported LTO compression algorithms: zlib
gcc version 10.2.1 20200918 (RTEMS 6, RSB f5fc2bfabbd18f31901ffa6fc0e5b6b47874797c-modified, Newlib 749cbcc) (GCC)

Source file test-align.c:
char i;
char j[1];
char z[0];
unsigned long ai = _Alignof(i);
unsigned long aj = _Alignof(j);
unsigned long az = _Alignof(z);

Command line:
aarch64-rtems6-gcc -O2 -S -o - test-align.c -mabi=ilp32 -fdata-sections

Compiler output:
        .arch armv8-a
        .file   "test-align.c"
        .global az
        .global aj
        .global ai
        .global z
        .global j
        .global i
        .section        .bss.i,"aw",@nobits
        .type   i, %object
        .size   i, 1
        .zero   1
        .section        .bss.j,"aw",@nobits
        .align  3
        .type   j, %object
        .size   j, 1
        .zero   1
        .section        .bss.z,"aw",@nobits
        .align  3
        .type   z, %object
        .size   z, 0
        .section        .data.ai,"aw"
        .align  2
        .type   ai, %object
        .size   ai, 4
        .word   1
        .section        .data.aj,"aw"
        .align  2
        .type   aj, %object
        .size   aj, 4
        .word   1
        .section        .data.az,"aw"
        .align  2
        .type   az, %object
        .size   az, 4
        .word   1
        .ident  "GCC: (GNU) 10.2.1 20200918 (RTEMS 6, RSB f5fc2bfabbd18f31901ffa6fc0e5b6b47874797c-modified, Newlib 749cbcc)"

test-align.i from --save-temps:
# 1 "test-align.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "test-align.c"
char i;
char j[1];
char z[0];
unsigned long ai = _Alignof(i);
unsigned long aj = _Alignof(j);
unsigned long az = _Alignof(z);

When generating data sections for i, j, z for AArch64/ILP32 (also applies to LP64), GCC does not align data sections to anything less than 8 bytes, even when _Alignof() claims that the alignment of all three data elements should be 1 byte.
Comment 1 Andrew Pinski 2021-02-17 23:43:00 UTC
Are you complaining that the objects j and z are aligned to 8 bytes in the object file? Or that aj and az have the value 1?
Both are valid things to do.

This is the code in GCC which does exactly that:

/* Align definitions of arrays, unions and structures so that
   initializations and copies can be made more efficient.  This is not
   ABI-changing, so it only affects places where we can see the
   definition.  Increasing the alignment tends to introduce padding,
   so don't do this when optimizing for size/conserving stack space.  */
#define AARCH64_EXPAND_ALIGNMENT(COND, EXP, ALIGN)                      \
  (((COND) && ((ALIGN) < BITS_PER_WORD)                                 \
    && (TREE_CODE (EXP) == ARRAY_TYPE                                   \
        || TREE_CODE (EXP) == UNION_TYPE                                \
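The rule described by that macro can be modelled outside GCC. The sketch below is only an illustration under stated assumptions: the enum, the function name, and MODEL_BITS_PER_WORD (taken to be 64 on AArch64) are hypothetical stand-ins rather than GCC internals, and the structure case is inferred from the macro's comment since the quoted definition is cut off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tree codes the macro tests. */
enum type_kind { KIND_SCALAR, KIND_ARRAY, KIND_UNION, KIND_STRUCT };

/* Assumption: the word size is 64 bits on AArch64. */
#define MODEL_BITS_PER_WORD 64u

/* Model of the expansion rule: when the condition holds (for example,
   not optimizing for size) and the object is an array, union or
   structure aligned to less than a word, raise its alignment to a
   full word.  Everything else keeps the alignment it already had.  */
static unsigned
model_expand_alignment (bool cond, enum type_kind kind, unsigned align_bits)
{
  bool is_aggregate = kind == KIND_ARRAY
                      || kind == KIND_UNION
                      || kind == KIND_STRUCT;

  if (cond && align_bits < MODEL_BITS_PER_WORD && is_aggregate)
    return MODEL_BITS_PER_WORD;
  return align_bits;
}
```

Under this model, a plain char keeps its 8-bit alignment while a char array is raised to 64 bits (8 bytes), which matches the missing .align for i and the .align 3 for j and z in the compiler output of the description.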
Comment 2 Will 2021-02-18 14:16:03 UTC
The issue is that these sections are not aligned down to the size of the type, which leaves extra space in the linker sets. The test case above may have been a bad reduction. Below is a test case that focuses on 4-byte alignment, which is the primary issue on ILP32. When compiling with ARM GCC instead of AArch64 GCC, the int section is aligned to 4 bytes instead of 8.
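To illustrate the extra space, here is a rough sketch of the offset arithmetic. It assumes a simplified layout that is not taken from the report: the begin marker, the items, and the end marker are placed back to back, each rounded up to its own alignment. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a
   power of two). */
static size_t
align_up (size_t offset, size_t align)
{
  return (offset + align - 1) & ~(align - 1);
}

/* Distance in bytes between the begin and end markers of a set that
   holds item_count items, under the simplified layout assumed above. */
static size_t
linker_set_span (size_t marker_align, size_t item_align,
                 size_t item_size, size_t item_count)
{
  size_t begin = align_up (0, marker_align);
  size_t cursor = begin;

  for (size_t n = 0; n < item_count; ++n)
    {
      cursor = align_up (cursor, item_align);
      cursor += item_size;
    }

  /* The end marker is rounded up to its own alignment. */
  size_t end = align_up (cursor, marker_align);
  return end - begin;
}
```

With one 4-byte item and 8-byte-aligned markers the span is 8 bytes, so code that walks the set in 4-byte steps would see one slot of padding. With 4-byte-aligned markers the span is exactly 4 bytes.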

An example closer to the real code:
#define RTEMS_USED __attribute__(( __used__ ))

#define RTEMS_SECTION( _section ) __attribute__(( __section__( _section ) ))

#define RTEMS_LINKER_SET_BEGIN( set ) \
   _Linker_set_##set##_begin

#define RTEMS_LINKER_SET_END( set ) \
   _Linker_set_##set##_end
#define RTEMS_LINKER_ROSET( set, type ) \
   type const RTEMS_LINKER_SET_BEGIN( set )[ 0 ] \
   RTEMS_SECTION( ".rtemsroset." #set ".begin" ) RTEMS_USED; \
   type const RTEMS_LINKER_SET_END( set )[ 0 ] \
   RTEMS_SECTION( ".rtemsroset." #set ".end" ) RTEMS_USED

RTEMS_LINKER_ROSET( c, char );
RTEMS_LINKER_ROSET( i, int );
RTEMS_LINKER_ROSET( ll, long long );

# 1 "test-linkersets.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "test-linkersets.c"
# 17 "test-linkersets.c"
char const _Linker_set_c_begin[ 0 ] __attribute__(( __section__( ".rtemsroset." "c" ".begin" ) )) __attribute__(( __used__ )); char const _Linker_set_c_end[ 0 ] __attribute__(( __section__( ".rtemsroset." "c" ".end" ) )) __attribute__(( __used__ ));
int const _Linker_set_i_begin[ 0 ] __attribute__(( __section__( ".rtemsroset." "i" ".begin" ) )) __attribute__(( __used__ )); int const _Linker_set_i_end[ 0 ] __attribute__(( __section__( ".rtemsroset." "i" ".end" ) )) __attribute__(( __used__ ));
long long const _Linker_set_ll_begin[ 0 ] __attribute__(( __section__( ".rtemsroset." "ll" ".begin" ) )) __attribute__(( __used__ )); long long const _Linker_set_ll_end[ 0 ] __attribute__(( __section__( ".rtemsroset." "ll" ".end" ) )) __attribute__(( __used__ ));

aarch64-rtems6-gcc -O2 -S -mabi=ilp32 -o - test-linkersets.c
        .arch armv8-a
        .file   "test-linkersets.c"
        .global _Linker_set_ll_end
        .global _Linker_set_ll_begin
        .global _Linker_set_i_end
        .global _Linker_set_i_begin
        .global _Linker_set_c_end
        .global _Linker_set_c_begin
        .section        .rtemsroset.c.begin,"a"
        .align  3
        .type   _Linker_set_c_begin, %object
        .size   _Linker_set_c_begin, 0
        .section        .rtemsroset.c.end,"a"
        .align  3
        .type   _Linker_set_c_end, %object
        .size   _Linker_set_c_end, 0
        .section        .rtemsroset.i.begin,"a"
        .align  3
        .type   _Linker_set_i_begin, %object
        .size   _Linker_set_i_begin, 0
        .section        .rtemsroset.i.end,"a"
        .align  3
        .type   _Linker_set_i_end, %object
        .size   _Linker_set_i_end, 0
        .section        .rtemsroset.ll.begin,"a"
        .align  3
        .type   _Linker_set_ll_begin, %object
        .size   _Linker_set_ll_begin, 0
        .section        .rtemsroset.ll.end,"a"
        .align  3
        .type   _Linker_set_ll_end, %object
        .size   _Linker_set_ll_end, 0
        .ident  "GCC: (GNU) 10.2.1 20200918 (RTEMS 6, RSB f5fc2bfabbd18f31901ffa6fc0e5b6b47874797c-modified, Newlib 749cbcc)"
Comment 3 Andreas Schwab 2021-02-18 14:46:45 UTC
If you want to save space you should use -Os, not -O2.
Comment 4 Will 2021-02-18 15:09:46 UTC
That's great to know and I may use it as a stopgap/workaround, but it's not so much about saving space as about preserving the packing behavior of sections, which seems to work as expected on other architectures (even 64-bit ones).
Comment 5 Andreas Schwab 2021-02-18 15:24:08 UTC
That's still sacrificing speed for space.
Comment 6 Jakub Jelinek 2021-02-18 15:28:25 UTC
Those assumptions are just wrong.
As e.g. the kernel people have been told, the only reliable way to ensure that no alignment increase for speed happens is to use __attribute__((aligned (...))) on the decls that should not be aligned more than stated (you can use alignof (...) in the aligned attribute argument).
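A minimal sketch of that suggestion applied to a zero-length marker like the ones in the linker-set test case could look as follows. The macro and variable names are hypothetical, the section attribute is left out to keep the example portable, and zero-length arrays are a GNU extension. Whether the emitted section alignment actually drops to 4 bytes on AArch64 was not verified here.

```c
#include <assert.h>

/* Hypothetical helper: request exactly the natural alignment of the
   element type, so the declaration carries a user-specified alignment. */
#define EXAMPLE_NATURAL_ALIGN( type ) \
   __attribute__(( __aligned__( _Alignof( type ) ) ))

/* Zero-length marker in the style of the linker-set test case. */
int const example_set_begin[ 0 ]
   EXAMPLE_NATURAL_ALIGN( int ) __attribute__(( __used__ ));

/* The declared alignment of the marker equals that of its element type. */
static_assert( __alignof__( example_set_begin ) == _Alignof( int ),
               "marker keeps the natural alignment of int" );

/* Report the declared alignment of the marker in bytes. */
unsigned long example_marker_alignment( void )
{
   return __alignof__( example_set_begin );
}
```

In the RTEMS_LINKER_ROSET macro from the earlier comment, the same attribute would presumably be added next to RTEMS_SECTION and RTEMS_USED on both the begin and end declarations.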
Comment 8 Richard Earnshaw 2021-03-01 17:05:49 UTC
There is nothing in C (or other languages for that matter) that will guarantee that independent objects will be packed without space around them.  

So the premise of this bug report is simply invalid.