GCC generates an XMM instruction with a memory operand not 16-byte aligned as it should be (i think), this generates a segmentation fault when run. Input file: (test.c) --------------------------------------------- int iarr[64]; int iint = 0; int main() { int i; for(i=0;i<64;i++) { iarr[i] = -2; } return 0; } --------------------------------------------- Output of gcc -v: Using built-in specs. Target: i686-pc-cygwin Configured with: ../source-4.3.1/configure --prefix=/usr/src/gcc/prefix-4.3.1 --enable-languages=c,c++ --disable-nls --without-included-gettext --enable-version-specific-runtime-libs --enable-threads=posix --disable-win32-registry Thread model: posix gcc version 4.3.1 (GCC) Command used to compile: /usr/src/gcc/prefix-4.3.1/bin/gcc test.c -o test.exe -march=core2 -O3 CPU: Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz Offending instruction: 'movdqa %xmm0,0x403024'
Please look at the asm dump (add -S to compile flags) for following lines: .globl iint .bss .align 4 .type iint, @object .size iint, 4 iint: .zero 4 .comm iarr,256,32 <<<< this one alternatively, you can use -fno-common: .globl iarr .align 32 <<<< alignment .type iarr, @object .size iarr, 256 iarr: .zero 256 In above dump (linux target) iarr is aligned to 32bytes. Does cygwin aligns .comm variables to multiplies of 16 bytes?
with -S added: .globl _iint .bss .align 4 _iint: .space 4 .comm _iarr, 256 # 256 .section .rdata,"dr" .align 16 with -S -fno-common added: .globl _iint .bss .align 4 _iint: .space 4 .globl _iarr .align 32 _iarr: .space 256 .section .rdata,"dr" .align 16 Note: with just -fno-common added the resulting test.exe runs without segfaulting
Subject: Re: [cygwin] Invalid alignment for SSE store to .comm data generated with -O3 The the 3 argument version of .comm is only supported by the ELF assembler so it would be rejected by the PE assembler if gcc were to emit it.
Subject: Re: [cygwin] Invalid alignment for SSE store to .comm data generated with -O3 Also, this is not Cygwin-specific as far as I can tell, more like PE-specific since it affects MinGW as well.
Confirmed and added maintainer CC.
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 It seems to me the issue is that prior to 2007-11-05[1], the PE assembler could not set section alignment flags correctly so .bss was simply hardcoded to 2**4 alignment which meant that things happened to work out. But now with the current assembler .bss gets the default 2**2 because there's no way to communicate the required alignment of common symbols in the PECOFF object format that I can tell. So one workaround would be that when emitting .comm to also emit an explicit alignment of .bss so that the assembler sees it and sets the section flags, e.g. .comm _iarr, 256 # 256 .bss .align 16 I tried the above and it seems to do the right thing with iarr ending up 16-aligned. But it seems like a dirty way of solving the problem, though having spurious alignment faults is not exactly pretty either. Another route would be to set the .bss minimum back to 2**4 again. [1] http://sourceware.org/ml/binutils/2007-11/msg00027.html
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 > Another route would be to set the .bss minimum back to 2**4 again. Actually that's not really great either because it doesn't do anything to address the more general form of the problem, e.g. int foo[16] __attribute__((aligned (32))); would still end up at the mercy of whatever the .bss default is, i.e. that alignment requirement isn't recorded anywhere.
Created attachment 16303 [details] Implement alignment for non-local commons
Hi Brian, Please could you try out the uploaded patch which is an implementation of your idea to add an extra alignment directive when emitting commons. It seems to work for the test case you gave, but I have not yet run it through a full rebuild and retest cycle... Cheers Nick
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 One thing I was unsure about is this method switches to the .bss section without switching back to .text (or whatever) afterwards. As far as I can tell the common symbols are always emitted at the end of the file after everything else so this should be ok, but is there any chance of this function being called from anywhere else? (It's a shame the PE assembler doesn't have anything like .pushsection/.popsection.)
I tried the attached patch, but gcc failed to build for me on cygwin with it.
same for me on mingw, bootstrap fails with patch. I get: /e/mingw_build/build-gcc/./gcc/xgcc -B/e/mingw_build/build-gcc/./gcc/ -L/e/mingw_build/build-gcc/mingw32/winsup/mingw -L/e/mingw_build/build -gcc/mingw32/winsup/w32api/lib -isystem /e/mingw_build/gcc-4.4-20080926-patched-mingw/winsup/mingw/include -isystem /e/mingw_build/gcc-4.4-2 0080926-patched-mingw/winsup/w32api/include -B/mingw/mingw32/bin/ -B/mingw/mingw32/lib/ -isystem /mingw/mingw32/include -isystem /mingw/ming w32/sys-include -O2 -D__USE_MINGW_ACCESS -O2 -I../../../mingw_build/gcc-4.4-20080926-patched-mingw/gcc/../winsup/w32api/include -O2 -D__USE_ MINGW_ACCESS -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wcast-qual -Wold-style-definition -isystem ./inc lude -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../.././gcc -I../../../../mingw_build/gcc-4.4-20080926-patch ed-mingw/libgcc -I../../../../mingw_build/gcc-4.4-20080926-patched-mingw/libgcc/. -I../../../../mingw_build/gcc-4.4-20080926-patched-mingw/l ibgcc/../gcc -I../../../../mingw_build/gcc-4.4-20080926-patched-mingw/libgcc/../include -DHAVE_CC_TLS -o _gcov.o -MT _gcov.o -MD -MP -MF _g cov.dep -DL_gcov -c ../../../../mingw_build/gcc-4.4-20080926-patched-mingw/libgcc/../gcc/libgcov.c C:/ACCOUN~1/Seb/LOCALS~1/Tempcc06q9XS.s: Assembler messages: C:/ACCOUN~1/Seb/LOCALS~1/Tempcc06q9XS.s:5782: Error: can't resolve `.bss' {.bss section} - `Ltext0' {.text section}
Created attachment 16425 [details] Revised patch with correct section switching
Hi Guys, I am not able to reproduce the build problems that were reported with the first version of my patch, but then again I do not have a native cygwin build system or a system in which I can bootstrap mingw32. I think that Brian may well be right however in that the patch is behaving badly when it manually switches into the .bss section. I have uploaded a revised version of the patch which uses the correct gcc function to perform the section switch, so please can anyone who is interested give this new patch a run. Cheers Nick
I also got the error on the first patch, gcc 4.4 from svn, binutils 2.18.91.20080917 I still have problems with 2.19 binutils snapshots (unable to correctly create and link dll) unfortunately the current gcc svn does not compile for mingw because of: build/i686-pc-mingw32/libstdc++-v3/include/bits/basic_string.h:2689: error: no matching function for call to '__to_xstring(int (*)(wchar_t*, const wchar_t*, char*), const int&, const wchar_t [4], long double&)' I may need to fill another bug There is no such code applicable to 4.2 branch :( Please take a look at the attached patch, I've picked it from a mailing list, it makes possible align with local variables (static) I don't known if your patches makes it unneeded
Created attachment 16426 [details] lcomm + alignment
with both patches I can achieve align 16 align > 16 on globals still fails Local Aligned 16: 0 Local Aligned 32: 0 Global Aligned 16: 0 Global Aligned 32: 16 the program is: #include <stdio.h> #include <stdlib.h> #include <inttypes.h> static int local_16[8] __attribute__ ((aligned (16))); static int local_32[8] __attribute__ ((aligned (32))); int global_16[8] __attribute__ ((aligned (16))); int global_32[8] __attribute__ ((aligned (32))); int main(void) { printf("Local Aligned 16: %d\n", (intptr_t) local_16 % 16); printf("Local Aligned 32: %d\n", (intptr_t) local_32 % 32); printf("Global Aligned 16: %d\n", (intptr_t) global_16 % 16); printf("Global Aligned 32: %d\n", (intptr_t) global_32 % 32); return 0; } did I miss something? (perhaps 16 byte alignment it's enough for sse)
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 The __to_xstring error is PR37522. You should still be able to bootstrap with --enable-languages=c for the purposes of testing this patch, though. Your testcase that uses printf with addr % 16 might not be showing you what you think it is because the compiler is smart enough to optimize away that computation at compile time since it knows the variable was declared with the given alignment, even though the actual address assigned by the linker mightn't be aligned properly. In my testing I had to use volatile to prevent that compile-time optimization. I also forgot to mention that I don't have sse2 capable hardware so I have no way of testing any of this other than by hand inspection.
Created attachment 16427 [details] Nick's aligned common testcase
align testcase looks ok, but anyway I'm mainly interested to have code aligned to 16. volatile around variables is not enough in my test program. Nick's testcase is ok even on (local-only align) patched gcc 4.2 I've solved to_xstring problem by using vsnwprintf instead, but current trunk produces buggy ffmpeg executable so many tests are crashing aligned data looks ok, at least the variables that were unaligned when it was crashing I'm still not so sure about the optimization and the alignment, this code: #include <stdio.h> #include <stdlib.h> #include <inttypes.h> static volatile int local_16[8] __attribute__ ((aligned (16))); static volatile int local_32[8] __attribute__ ((aligned (32))); int volatile global_16[8] __attribute__ ((aligned (16))); int volatile global_32[8] __attribute__ ((aligned (32))); int main(void) { volatile double diff16, diff32; printf("Local Aligned 16: %d\n", (intptr_t) local_16 % 16); printf("Local Aligned 32: %d\n", (intptr_t) local_32 % 32); printf("Global Aligned 16: %d\n", (intptr_t) global_16 % 16); printf("Global Aligned 32: %d\n", (intptr_t) global_32 % 32); diff16 = (intptr_t) global_16 / 16.0f; diff32 = (intptr_t) global_32 / 32.0f; printf("16 -> %f - 32 -> %f\n", diff16, diff32); return 0; } still outputs: Local Aligned 16: 0 Local Aligned 32: 0 Global Aligned 16: 0 Global Aligned 32: 16 16 -> 263177.000000 - 32 -> 131587.500000 regardless of the optimizations options
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 This is an example of what I'm talking about: the bar() function is optimized away to simply "return 0" because the compiler assumes the alignment is correct without having to actually emit the code to check it: $ echo 'char foo[1] __attribute__((aligned(16))); int bar() { if((int)foo % 16) return 1; else return 0; }' | i686-pc-mingw32-gcc -x c - -S -o - -fomit-frame-pointer .file "" .text .globl _bar .def _bar; .scl 2; .type 32; .endef _bar: movl $0, %eax ret .comm _foo, 16 # 1
still no success while compiling ffmpeg :( Program received signal SIGSEGV, Segmentation fault. 0x0074fe94 in ff_fft_calc_3dn () (gdb) bt #0 0x0074fe94 in ff_fft_calc_3dn () #1 0x007506f5 in ff_fft_calc_3dn () #2 0x00750755 in ff_fft_calc_3dn () #3 0x00750a55 in ff_fft_dispatch_sse () #4 0x00000400 in ?? () #5 0x0074f718 in ff_fft_calc_sse (s=0x5b14600, z=0x5b0c5f0) at libavcodec/i386/fft_sse.c:35 #6 0x00748080 in ff_mdct_calc (s=0x5b145f0, out=0x5b0c5f0, input=0x5b105f0) at libavcodec/dsputil.h:673 #7 0x0064731d in encode_superframe (avctx=0x5ac4010, buf=0x5ba0030 "", buf_size=524288, data=0x5b5cca0) at libavcodec/wmaenc.c:92 #8 0x00495b16 in avcodec_encode_audio (avctx=0x5ac4010, buf=0x5ba0030 "", buf_size=524288, samples=0x750a50) at libavcodec/utils.c:865 #9 0x00406236 in output_packet (ist=0x5ac4390, ist_index=0, ost_table=0x3fbfb0, nb_ostreams=1, pkt=0x22fed8) at ffmpeg.c:672 #10 0x004095d1 in av_encode (output_files=0xb6c000, nb_output_files=1, input_files=0xb6b280, nb_input_files=1, stream_maps=0xb6c060, nb_stream_maps=0) at ffmpeg.c:2129 #11 0x00409878 in main (argc=Cannot access memory at address 0xa ) at ffmpeg.c:3897 (gdb) p ff_cos_16 $1 = {1, 0.923879504, 0.707106769, 0.382683426, 6.12303177e-017, 0.382683426, 0.707106769, 0.923879504} (gdb) p &ff_cos_16 $2 = (FFTSample (*)[8]) 0xe55c94 asm code: .lcomm _ff_cos_16,32,16 lcomm? nm output: $ nm ffmpeg_g.exe |grep ff_cos_16 00e55c94 B _ff_cos_16 00e185d4 B _ff_cos_16384 not aligned :( gcc version 4.3.3 20080929 (prerelease) (Sherpya) GNU assembler version 2.18.91 (i686-pc-mingw32) using BFD version (GNUBinutils) 2.18.91.20080917
a printf in the code for ff_cos_16 causes the compiler to align the var, but at this point it crashes in another place using sse code
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 > a printf in the code for ff_cos_16 causes the compiler to align the var, > but at this point it crashes in another place using sse code So, does this mean that the align patch is working, but that there is some other bug that is preventing some sse code from working ? If so then I can formally submit the align patch for approval. Cheers Nick
(In reply to comment #24) > Subject: Re: [cygming] Invalid alignment for SSE store > to .comm data generated with -O3 > > > a printf in the code for ff_cos_16 causes the compiler to align the var, > > but at this point it crashes in another place using sse code > > So, does this mean that the align patch is working, but that there is > some other bug that is preventing some sse code from working ? If so > then I can formally submit the align patch for approval. > > Cheers > Nick > unfortunately not, with a printf the var is aligned without a printf is not aligned (someone changes the behavior of the patch), and thus it will crash the patch seems to make the right job but some other tool in the chain interferes with it
(In reply to comment #14) > Hi Guys, > > I am not able to reproduce the build problems that were reported with the > first version of my patch, but then again I do not have a native cygwin build > system or a system in which I can bootstrap mingw32. I think that Brian may > well be right however in that the patch is behaving badly when it manually > switches into the .bss section. I have uploaded a revised version of the patch > which uses the correct gcc function to perform the section switch, so please > can anyone who is interested give this new patch a run. This latter patch, applied to R139853 [*] (just before the big IRA merge) does not cause the build problems of the prior patch but does not solve the problem either. With this patch, using your _align.c test case which I uploaded in previous comment: gcc -DHAVE_ALIGNED_COMMON _align.c && a.exe fails. The assembler code, specifically the .balign 32 directive applied to the _common object looks like it should do the right thing, but doesn't (: .file "_align.c" .bss .balign 1 .comm _a, 1 .balign 32 .comm _common, 64 .globl _b .section .bss,"w" _b: .space 1 .globl _bss .align 32 _bss: .space 64 .globl _c .data _c: .byte 1 .globl _data .align 32 _data: .long 1 .space 4 .long 2 .space 4 .long 3 .space 12 .long 4 .space 28 .text .globl _test .def _test; .scl 2; .type 32; .endef _test: pushl %ebp movl %esp, %ebp subl $8, %esp movl 8(%ebp), %eax addl $8, %eax andl $7, %eax testl %eax, %eax jne L2 movl 8(%ebp), %eax addl $16, %eax andl $15, %eax testl %eax, %eax jne L2 movl 8(%ebp), %eax addl $32, %eax andl $31, %eax testl %eax, %eax je L4 L2: call _abort L4: leave ret .def ___main; .scl 2; .type 32; .endef .globl _main .def _main; .scl 2; .type 32; .endef _main: pushl %ebp movl %esp, %ebp andl $-16, %esp subl $16, %esp call ___main movl $_common, (%esp) call _test movl $_bss, (%esp) call _test movl $_data, (%esp) call _test movl $0, (%esp) call _exit .def _abort; .scl 2; .type 32; .endef .def _exit; .scl 2; .type 32; .endef [*] Since the IRA merge, mingw32 has been broken, failing to compile libstdc++ and raising many (>500 in gcc.dg) new testcase errors.
(In reply to comment #13) > Created an attachment (id=16425) [edit] > Revised patch with correct section switching > That patch causes other problems This test case: /* t1.c */ int i[0]; int k; void testi (void) { i[0] = 0; } void testk (void) { k = 0; } int main (void) { return 0; } now fails with t1.c:(.text+0x5): undefined reference to `_i' The assembler code is .file "t1.c" .text .p2align 4,,15 .globl _testi .def _testi; .scl 2; .type 32; .endef _testi: pushl %ebp movl %esp, %ebp movl $0, _i popl %ebp ret .p2align 4,,15 .globl _testk .def _testk; .scl 2; .type 32; .endef _testk: pushl %ebp movl %esp, %ebp movl $0, _k popl %ebp ret .def ___main; .scl 2; .type 32; .endef .p2align 4,,15 .globl _main .def _main; .scl 2; .type 32; .endef _main: pushl %ebp movl %esp, %ebp andl $-16, %esp call ___main xorl %eax, %eax movl %ebp, %esp popl %ebp ret .bss .balign 4 .comm _i, 0 <<<<< size of i[0] .balign 4 .comm _k, 4 I suspect we need to add this bit, or similar, back in - rounded = size ? size : 1; - rounded += (BIGGEST_ALIGNMENT / BITS_PER_UNIT) - 1; - rounded = (rounded / (BIGGEST_ALIGNMENT / BITS_PER_UNIT) - * (BIGGEST_ALIGNMENT / BITS_PER_UNIT)); and output rounded rather than size. Testing now.
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 Hi Danny, > This test case: > t1.c:(.text+0x5): undefined reference to `_i' Hmm, I cannot reproduce this, however... > .comm _i, 0 <<<<< size of i[0] I did drop the code to handle (size == 0) because I was not sure why it was there. > I suspect we need to add this bit, or similar, back in > - rounded = size ? size : 1; How about this third revision of the patch ? Cheers Nick
Created attachment 16458 [details] Revised patch which handles (size == 0)
the patch looks ok but unfortunately does not always solves the problem, something in the chain misalignes the symbol This does not happen always but in some circumstances :(
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 > the patch looks ok but unfortunately does not always solves the problem, > something in the chain misalignes the symbol > This does not happen always but in some circumstances :( Possibly a linker bug. Could you investigate some more ? For example what alignment does the .bss section have in the object file and in the executable ? Cheers Nick
this archive: http://people.netfarm.it/~sherpya/gcc/info.7z contains ffmpeg_g.exe - non stripped final executable fft.c/o/s - source, object file and asm generated related vars are: ff_cos_16, ff_cos_32 etc $ nm ffmpeg_g.exe |grep ff_cos_16 00e62d14 B _ff_cos_16 00e25654 B _ff_cos_16384 $ nm libavcodec/fft.o |grep ff_cos_16 00000020 C _ff_cos_16 00008000 C _ff_cos_16384 gcc version 4.3.3 20081004 (prerelease) (Sherpya) (plain svn with the patch) GNU ld (GNU Binutils) 2.18.91.20080917 I can also provide all my build env as 7z Regards
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 Hi, > http://people.netfarm.it/~sherpya/gcc/info.7z Just a quick check: If you build with "-fno-common" does the executable then work ? Cheers Nick
$ nm ffmpeg_g.exe |grep ff_cos_16 00dd84e0 B _ff_cos_16 00de04c0 B _ff_cos_16384 except snow and svq1 tests, crashing because of bugs in tree opts on win32 sse code is working fine
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 Hi sherpya, > ------- Comment #34 from sherpya at netfarm dot it 2008-10-06 14:13 ------- > $ nm ffmpeg_g.exe |grep ff_cos_16 > 00dd84e0 B _ff_cos_16 > 00de04c0 B _ff_cos_16384 > > except snow and svq1 tests, crashing because of bugs in tree opts on win32 > sse code is working fine As I suspected. The PE/COFF file format does not provide for specifying the alignment of commons. Hmm, I wonder if gcc should complain if it finds aligned commons with COFF backends or if we should try to generate a GNU extension to the COFF format. Cheers Nick
so how with -fno-common can make aligned work?
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 Hi, > so how with -fno-common can make aligned work? Hang on - I thought that you had said that when ffmpeg_g.exe was built with -fno-common that it worked, modulo some (completely separate) tree optimization bugs. Is that not right ?
yes alignment works, and even my test align program with 4.2 without patches gives correct alignment to local and global symbols Local Aligned 16: 0 Local Aligned 32: 0 Global Aligned 16: 0 Global Aligned 32: 0 16 -> 265986.000000 - 32 -> 132994.000000
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 > yes alignment works, and even my test align program with 4.2 without patches > gives correct alignment to local and global symbols OK, so when you said: "so how with -fno-common can make aligned work?" did you really mean: "so how without -fno-common can make aligned work?" In which case the answer is - I do not know. The problem is that the PE/COFF file format does not support aligned commons. We could try to create an extension to the format to support them, but that would be non-standard. Another possibility would be to turn any common symbol with an alignment attribute into a weak symbol instead. This would work (I think, I have not tried it), provided that there are no other definitions of the same symbol with a different size. Which would possibly cause problems with FORTRAN programs but it is unlikely to be an issue with C/C++ programs. Another possibility is to modify the linker so that when it is placing common symbols into the .bss section of the output executable every symbol is aligned to the maximum alignment specified for any of the .bss sections of any input object file. This would probably waste huge amounts of space in the .bss section (for large programs anyway) but it ought to work. Cheers Nick
I mean that with -fno-common alignment works, even with non patched 4.2, my question is due to the fact that it's not so clear for me what no-common does and adding -fno-common what are side effects? do using something like dummy or nops in bss section does something wrong? at this point a forced alignment to 16 wouldn't be such a problem for the space wasted, at least this can avoid problems with sse code (16 bytes) and 3dnow (8 bytes) gcc may need to disable sse code on mingw/cygwin or at least avoid implicit vectorizations
(In reply to comment #35) > As I suspected. The PE/COFF file format does not provide for specifying > the alignment of commons. > > Hmm, I wonder if gcc should complain if it finds aligned commons with > COFF backends or if we should try to generate a GNU extension to the > COFF format. > Aligned commons are not part of the PE COFF spec and AFAICT neither MASM nor NASM provide a way to specify aligned commons. I am afraid that a GNU extension will cause object incompatibility, so it would it need to be a non-default option. We already have -fno-common __attribute__((no_common)) #pragma gcc optimize (no_common) G++ already puts commons in .bss Perhaps a new option -fno_large_common that applies -fcommon only to objects with align <= 4 bytes? A warning would be useful in any case. Danny
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 When you are comparing gcc 4.2 to current trunk, are you keeping the linker (binutils) version the same? As mentioned in comment 6, the linker behavior changed recently which results in a different default alignment of the .bss segment which could explain the difference. IMHO, we should just have gcc automatically enable -fno-common on PE if SSE is enabled. That's what the MS tools do, AFAICT.
binutils 2.18.91.20080917 on both there are changes for the local alignment in the gas code but gcc does not use them without my attached gcc_align_fix.diff patch (not sure 100%) newer binutils are not working well on mingw
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 sherpya at netfarm dot it wrote: > I mean that with -fno-common alignment works, even with non patched 4.2, my > question is due to the fact that it's not so clear for me what no-common does -fno-common stops uninitialized variables declared in C and C++ programs from being treated in the same way as common variables declared in FORTRAN programs. > and adding -fno-common what are side effects? Essentially there are two side effects: The first is that you will get a link time error if you declare the same uninitialized variable twice in two different source files, without using the 'extern' keyword on one of them. eg: % cat foo.c int a; % cat bar.c int a; int main (void) { return 0; } % gcc foo.c bar.c % gcc -fno-common foo.c bar.c multiple definition of `a' This is often a problem with badly written header files which declare variables without using 'extern'. eg: % cat header.h int a; % cat foo.c #include "header.h" int a; % cat bar.c #include "header.h" int main (void) { return 0; } % gcc -fno-common foo.c bar.c multiple definition of `a' The other side-effect, and the one that is more interesting for our purposes, is that it forces uninitialised variables to be placed into the .bss section. This is important because symbols in the PE/COFF file format do not have an alignment attribute of their own. Instead the alignment is inherited by the containing section, with the maximum alignment of any symbol inside a section being taken as the section's alignment as a whole. Symbols are placed inside the section on suitably aligned boundaries, so that providing that the section itself is placed on an alignment boundary everything will work. eg: % cat foo.c int normal_align; int aligned_16 __attribute__((aligned(16))); % gcc -fno-common -c foo.c % objdump --syms foo.o [ 8](sec 3)(fl 0x00)(ty 0)(scl 2)(nx 0) 0x00000000 _normal_align [ 9](sec 3)(fl 0x00)(ty 0)(scl 2)(nx 0) 0x00000010 _aligned_16 Note how the 'aligned_16' variable starts at an offset of 00000010 from the start of section 3, whereas 'normal_align' starts at an offset of 00000000. Ie there is a gap of 12 bytes from offset 00000004 to 0000000f. % objdump -h foo.o Idx Name Size VMA LMA File off Algn 2 .bss 00000020 00000000 00000000 00000000 2**4 Note that the .bss section has been given an alignment of 2^4. This is because it contains 'aligned_16'. If that variable had not been declared then the .bss section would have been given its default alignment of 2^2. Also note that section numbering differs between the two uses of objdump. Ie "(sec 3)" in the "objdump --syms" output refers to the third declared section which is the section with an index of 2 in the "objdump -h" output. % cat bar.c int a; % gcc -fno-common bar.c foo.o % nm a.exe 00402000 B _a 00402020 B _aligned_16 00402010 B _normal_align So after linking 'aligned_16' still has a 16-bit alignment because of the 2^4 alignment of the .bss section in the foo.o object file. The reason that all of this is important is that when common variables are stored in a PE/COFF object file they are not assigned to any section. Since only sections, not symbols, have an alignment attribute in PE/COFF object files, any alignment requirements of common symbols are lost. This is what has been causing the problems that you have experienced. eg: % cat foo.c int normal_align; int aligned_16 __attribute__((aligned(16))); % gcc -c foo.c % objdump --syms foo.o [ 8](sec 0)(fl 0x00)(ty 0)(scl 2)(nx 0) 0x00000004 _normal_align [ 9](sec 0)(fl 0x00)(ty 0)(scl 2)(nx 0) 0x00000004 _aligned_16 Note how the variables are assigned to "(sec 0)" which does not exist and that there is no field or flag specifying the alignment for either of them. You may ask why common variables are not assigned to the .bss section. The reason is that if there are multiple declarations of the same variable and all but one of which are common, then the non-common declaration takes precedence. eg: % cat foo.c int a; % cat bar.c int a = 1; % gcc foo.c bar.c % nm a.exe 00402000 D _a Ie the 'a' variable has been placed in the .data section and not the .bss section, despite the fact that it was declared uninitialised in foo.c. So common variables are not assigned to a section until the final link takes places. If there are no non-common definitions of a variable to specify where they should be placed then they are assigned to the .bss section, but by then it is too late - the alignment requirements of the symbol have been lost. Cheers Nick
Created attachment 16475 [details] Enable -fno-common with -msse for Cygwin/Mingw
Hi Brian, > IMHO, we should just have gcc automatically enable -fno-common on PE if > SSE is enabled. That's what the MS tools do, AFAICT. Something like the newly uploaded patch ? Cheers Nick
ffmpeg uses aligned vars inside an object from an external nasm/yasm compiled module, so it's very unlikely gcc can be aware of this, the patch should solve "implicit" sse code generation, but I think there are no solution for external references. Probably the best way is to pass -fno-common when there is sse code in the project I'll test your patch for the first post of the bugreport, and I'll test also ffmpeg but I'm almost sure that this TARGET_SSE is not true in this specific case with -fno-common I had no problems with make test (when using sse code) all of the stuff in this bugreport is not really definitive, we have a patch for local and for global using commons but there is no way to propagate this data in pe/coff object why not adding an extensions for intermediate coffs? there isn't space anywhere in coff files? yes one can use -fno-common or the fix to avoid crashes when gcc vectorize but it does not cover all cases
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 sherpya at netfarm dot it wrote: > I'll test your patch for the first post of the bugreport, and I'll test also > ffmpeg but I'm almost sure that this TARGET_SSE is not true in this specific > case You said you used -march=core2 to compile. This implicitly includes -msse2 (and -msse) so yes it will be set. > yes one can use -fno-common or the fix to avoid crashes when gcc vectorize > but it does not cover all cases What cases would it not cover? When placing items directly in bss or data the required alignment can be unambiguously conveyed to the assembler so I don't see how it would get lost.
not exactly, Simon Sasburg compiled with -march=core2 I'm not explicitly telling to gcc to compile sse code, arch is i686 and opt is -O2 so there is no sse code generated by gcc, ffmpeg declares aligned vars in fft.c and uses them in asm sse code, file fft_mmx.asm that is generated by yasm so gcc is unable to known that those vars will be used by sse code, this case will not covered by the last patch gcc should emit a warning when compiling align declared vars on win32 version 3.x had a similar warning but only if using align > 16 with current state of compiler/toolchain even 16-bytes aligned is not possible without explicit -fno-common compile switch
Subject: Re: [cygming] Invalid alignment for SSE store to .comm data generated with -O3 Oh, I see what you mean now. Yeah, predicating it on just TARGET_SSE isn't sufficient. I'm starting to think the idea of a PE/COFF-specific convention to record the alignment isn't such a crazy idea. It could just be a section with a specific name e.g. .gnu.note.aligncomm. The PE assembler could then accept the 3-argument form of .comm and store the info there, and the linker would be able to consume it. When linking with a non-GNU linker the section would just be silently ignored and the alignment info discarded, which isn't any worse than the current situation where it's also discarded, so you'd just be back to requiring -fno-common. Of course that's a much more ambitious change.
*** Bug 39039 has been marked as a duplicate of this bug. ***
*** Bug 39194 has been marked as a duplicate of this bug. ***
This thread is alos relevant. http://gcc.gnu.org/ml/gcc/2009-04/msg00462.html Adding Dave Korn to cc:
I've started work on the binutils support for this. Work-in-progress patch at http://sourceware.org/ml/binutils/2009-05/msg00228.html Once that's complete, I could deal with the GCC end too. What should we do about backward-compatibility? If we attempt to use the new features with the old toolchain, it won't work, and the linker will issue a bunch of noisy warnings about the .drectve statements it doesn't understand. Should use of the new feature depend on a -m flag, or an assembler/linker version check of some sort? Or should we just go ahead and let users of old toolchains get a bunch of warnings? On the same lines, should we still continue to pad all COMMON symbols to a round multiple of BIGGEST_ALIGNMENT, or should we get rid of that when we're using the new feature?
(In reply to comment #54) > I've started work on the binutils support for this. Work-in-progress patch at > http://sourceware.org/ml/binutils/2009-05/msg00228.html > > Once that's complete, I could deal with the GCC end too. > > What should we do about backward-compatibility? If we attempt to use the new > features with the old toolchain, it won't work, and the linker will issue a > bunch of noisy warnings about the .drectve statements it doesn't understand. > > Should use of the new feature depend on a -m flag, or an assembler/linker > version check of some sort? Or should we just go ahead and let users of old > toolchains get a bunch of warnings? On the same lines, should we still > continue to pad all COMMON symbols to a round multiple of BIGGEST_ALIGNMENT, or > should we get rid of that when we're using the new feature? Perhaps you could activate -mno-common if the new feature you are implementing in your binutils patch is not detected by configure?
Created attachment 17895 [details] Target option and autoconf test to enable aligned common support. Currently putting the attached through a bootstrap and test cycle. This adds a configure time GAS feature check for the 3-operand form aligned .comm pseudo-op, and a target-specific command-line switch -mpe-aligned-commons which is enabled or disabled by default depending on the status returned from the configure check. It doesn't yet address disabling use of commons when no alignment support is available. That can be done as a follow-on patch.
Bah. In case anyone else was about to point this out to me, +gcc_GAS_CHECK_FEATURE([.comm with alignment], gcc_cv_as_comm_has_align, + [2,19,52],, + [.comm foo,1,32],, +[AC_DEFINE(HAVE_GAS_ALIGNED_COMM, 1, + [Define if your assembler supports specifying the alignment + of objects allocated using the GAS .comm command.])]) ... that won't work with ... +mpe-aligned-commons +Target Var(use_pe_aligned_common) Init(HAVE_GAS_ALIGNED_COMM) ... that, because it's undefined, not zero, when the feature isn't detected. I'll have to respin it.
Created attachment 17906 [details] Revised patch Revised version of the patch that defines the autoconf feature test macro to 0/1 rather than to undefined/defined, so it can be used to init the target option variable correctly even when gas during configure does not support 3-option .comm format. Nearly completed bootstrap now, will post to the -patches list (minus the generated file diffs) later when testing underway.
Created attachment 17909 [details] D'oh. Quick respin. Huh. Alignment is passed to the backend as number of bits, but of course the assembler wants it as the power of 2 of the number of bytes to align it to. Fortunately I spotted this in manual testing before kicking off the 48-hr testrun, so I'm restrapping the patch with one tiny change to winnt.c before I do.
Subject: Bug 37216 Author: davek Date: Thu May 28 10:48:35 2009 New Revision: 147950 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147950 Log: gcc/ChangeLog: 2009-05-28 Dave Korn <dave.korn.cygwin@gmail.com> PR target/37216 * configure.ac (HAVE_GAS_ALIGNED_COMM): Add autoconf test and macro definition for support of three-operand format aligned .comm directive in assembler on cygwin/pe/mingw target OS. * configure: Regenerate. * config.in: Regenerate. * config/i386/winnt.c (i386_pe_asm_output_aligned_decl_common): Use aligned form of .comm directive if -mpe-aligned-commons is in effect. * config/i386/cygming.opt (-mpe-aligned-commons): Add new option. * doc/invoke.texi (-mpe-aligned-commons): Document new target option. * doc/tm.texi (ASM_OUTPUT_COMMON): Document zero size commons. gcc/testsuite/ChangeLog: 2009-05-28 Dave Korn <dave.korn.cygwin@gmail.com> Uros Bizjak <ubizjak@gmail.com> Danny Smith <dansmister@gmail.com> PR target/37216 * lib/target-supports.exp (check_effective_target_pe_aligned_commons): New function. * gcc.target/i386/pr37216.c: New test source file. * gcc.dg/compat/struct-layout-1_generate.c (dg_options[]): No longer use -fno-common for testing Cygwin and MinGW targets. Added: trunk/gcc/testsuite/gcc.target/i386/pr37216.c Modified: trunk/gcc/ChangeLog trunk/gcc/config.in trunk/gcc/config/i386/cygming.opt trunk/gcc/config/i386/winnt.c trunk/gcc/configure trunk/gcc/configure.ac trunk/gcc/doc/invoke.texi trunk/gcc/doc/tm.texi trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_generate.c trunk/gcc/testsuite/lib/target-supports.exp
Fixed for 4.5.0.
*** Bug 40333 has been marked as a duplicate of this bug. ***
GCC still generates a segfaulting executable when used with the testcase in the report, most likely because my assembler doesn't support the 3-argument .comm directive. When using the '-mpe-aligned-commons' i get an error about this: > Error: junk at end of line, first unrecognized character is `,'
(In reply to comment #63) > GCC still generates a segfaulting executable when used with the testcase in the > report, most likely because my assembler doesn't support the 3-argument .comm > directive. > > When using the '-mpe-aligned-commons' i get an error about this: > > Error: junk at end of line, first unrecognized character is `,' > Of course. What were you hoping would happen?
Indeed, i should have expected this, and after rereading the comments here you even mentioned this problem already. Sorry for the noise.
(In reply to comment #65) > Indeed, i should have expected this, and after rereading the comments here you > even mentioned this problem already. Sorry for the noise. > That's OK. GCC attempts to detect at configure time if the installed toolchain supports the feature, and will configure itself to generate or not the 3-operand form according to the results of that test. The command-line option is there if you ever need to override it, but that'll only be the case if you're mixing-and-matching toolsets or doing some other kind of complex trickery that involves using GCC with a different binutils from that it was configured for.
Will the fix for this bug be backported to the 4.4 branch?
(In reply to comment #67) > Will the fix for this bug be backported to the 4.4 branch? > I wasn't planning to do so myself, in the near future at any rate. All my personal time is going into getting everything fixed for 4.5.0. Maybe after that's working well I will have time to backport it. Of course, anyone else who has the time available is free to do that themselves and submit it to the -patches list.
*** Bug 39369 has been marked as a duplicate of this bug. ***
*** Bug 41950 has been marked as a duplicate of this bug. ***
*** Bug 45036 has been marked as a duplicate of this bug. ***