Bug 44129

Summary: Building linux kernel with gcc-4.5.0 and CONFIG_CC_OPTIMIZE_FOR_SIZE segfaults
Product: gcc Reporter: Bruce Dubbs <bdubbs>
Component: target Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED INVALID    
Severity: normal CC: andi-gcc, gcc-bugs, hpa, kevin.bowling, kjwinchester, mike, rhill, Thomas8675309
Priority: P3    
Version: 4.5.0   
Target Milestone: ---   
Host: x86_64-unknown-linux-gnu Target: x86_64-unknown-linux-gnu
Build: x86_64-unknown-linux-gnu Known to work:
Known to fail: Last reconfirmed:
Attachments: Linux kernel configuration that fails with gcc-4.5

Description Bruce Dubbs 2010-05-14 02:39:02 UTC
I believe there is an optimization bug in gcc-4.5.0.  When building with gcc-4.5.0 and setting the linux kernel flag CONFIG_CC_OPTIMIZE_FOR_SIZE, the kernel indicates a segfault upon boot.  

I tested with both the normal sysvinit and bash-static; in both cases the fault is reported at the identical memory address, with the error "kernel panic - not syncing: Attempted to kill init!"

The error indications are almost identical to the post at:

http://www.gossamer-threads.com/lists/linux/kernel/1210031 

Clearing the optimization flag boots normally.  Using gcc-4.4.3 does not show the problem.

Tested on several kernels:  2.6.32.8, 2.6.33.4, 2.6.34-rc7.
Comment 1 Richard Biener 2010-05-14 12:26:44 UTC
Waiting for a testcase.  And for the reporter to try the tip of the 4.5 branch.
Comment 2 Bruce Dubbs 2010-05-14 21:27:31 UTC
OK, these are my procedures:

svn co svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch gcc45-svn
(revision 159398)

cd gcc45-svn
sed -i 's/install_to_$(INSTALL_DEST) //' libiberty/Makefile.in 
sed -i 's@\./fixinc\.sh@-c true@' gcc/Makefile.in
mkdir ../gcc-build 
cd    ../gcc-build 

../gcc45-svn/configure \
    --prefix=/usr \
    --libexecdir=/usr/lib \
    --enable-shared \
    --enable-threads=posix \
    --enable-__cxa_atexit \
    --disable-multilib \
    --enable-bootstrap \
    --enable-clocale=gnu \
    --enable-languages=c,c++

make bootstrap 
make -k check  

../gcc45-svn/contrib/test_summary

Native configuration is x86_64-unknown-linux-gnu

                === g++ tests ===


Running target unix

                === g++ Summary ===

# of expected passes            21906
# of expected failures          149
# of unsupported tests          269

                === gcc tests ===


Running target unix
FAIL: gcc.c-torture/compile/limits-exprparen.c  -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/limits-exprparen.c  -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/limits-exprparen.c  -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/limits-exprparen.c  -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.c-torture/compile/limits-exprparen.c  -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/limits-exprparen.c  -Os  (test for excess errors)

                === gcc Summary ===

# of expected passes            61141
# of unexpected failures        6
# of expected failures          168
# of unsupported tests          826

Running target unix

                === libgomp Summary ===

# of expected passes            1029

                === libmudflap tests ===


Running target unix
FAIL: libmudflap.c/fail31-frag.c (-O3) output pattern test
FAIL: libmudflap.c/pass45-frag.c (-O3) execution test
FAIL: libmudflap.c/pass45-frag.c (-O3) output pattern test
FAIL: libmudflap.c/pass45-frag.c (-O3) execution test
FAIL: libmudflap.c/pass45-frag.c (-O3) output pattern test
FAIL: libmudflap.c++/pass41-frag.cxx execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
FAIL: libmudflap.c++/pass41-frag.cxx ( -O) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test

                === libmudflap Summary ===

# of expected passes            1884
# of unexpected failures        10
                === libstdc++ tests ===


Running target unix

                === libstdc++ Summary ===

# of expected passes            7065
# of expected failures          95
# of unsupported tests          339

make install

gcc --version
gcc (GCC) 4.5.1 20100514 (prerelease)

cd /sources/linux-2.6.33.4

make menuconfig
# KBUILD_CFLAGS   += -Os

make
make modules_install

cp arch/x86/boot/bzImage /boot/linux-test

reboot

Hand Copied:

init[1] segfault at ffffffff810099bd ip ffffffff810088bd sp
--------- error 15
kernel panic - not syncing: Attempted to kill init!
Pid: 1,comm: init not tainted 2.6.33.4-lfs66
 Call Trace:
 [<ffffffff81------>] panic
 [<ffffffff81------>] ? get_current_tty
 [<ffffffff81------>] do_exit
 [<ffffffff81------>] do_group_exit
 [<ffffffff81------>] get_signal_to_deliver
 [<ffffffff81------>] do_signal
 [<ffffffff81------>] ? printk
 [<ffffffff81------>] ? rdtsc_barrier
 [<ffffffff81------>] ? printk
 [<ffffffff81------>] ? __bad_area_nosemaphore
 [<ffffffff81------>] ? rdtsc_barrier
 [<ffffffff81------>] do_notify_resume
 [<ffffffff81------>] retint_signal
 [<ffffffff81------>] ? rdtsc_barrier
---------

Changing back to not optimize for size results in a bootable kernel.

Also, every package on the boot partition was built with gcc-4.5.0.  Evidently none of the other core packages tries to optimize for size.


 

Comment 3 H.J. Lu 2010-05-14 22:24:43 UTC
There are some known issues with gcc 4.5.0 and
Linux kernel. Please try gcc 4.5.1 snapshot from

ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20100513/

or the mirror sites.
Comment 4 H.J. Lu 2010-05-14 22:27:25 UTC
(In reply to comment #3)
> There are some known issues with gcc 4.5.0 and
> Linux kernel. Please try gcc 4.5.1 snapshot from
> 
> ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20100513/
> 
> or the mirror sites.
> 

I saw you have tried revision 159398. I guess you
need to find which file is miscompiled.

BTW, you aren't using gold, are you?
Comment 5 Bruce Dubbs 2010-05-14 22:50:44 UTC
(In reply to comment #4)

> I saw you have tried revision 159398. I guess you
> need to find which file is miscompiled.

I have no idea how to do that for the kernel.
 
> BTW, you aren't using gold, are you?

I don't know what gold is either.

I'm willing to help find the problem, but I need some hints on how to proceed.
Comment 6 H.J. Lu 2010-05-15 01:13:06 UTC
(In reply to comment #5)
> (In reply to comment #4)
> 
> > I saw you have tried revision 159398. I guess you
> > need to find which file is miscompiled.
> 
> I have no idea how to do that for the kernel.

You build 2 kernel trees, one with gcc 4.5.1 and one
with gcc 4.4.4. You copy binaries from one tree to
another and rebuild kernel one file at a time until
you find the single file which was miscompiled.
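The search described above can be shortened by bisecting instead of swapping one file at a time. The following is only a sketch under assumptions that are not from this report: the object files are kept in a fixed order, exactly one of them is miscompiled, and a hypothetical callback reports whether a kernel boots when the first n objects come from the suspect tree and the rest from the known-good tree. `find_miscompiled`, `boots_fn` and `demo_boots` are illustrative names, and `demo_boots` merely stands in for a real relink-and-reboot cycle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: nonzero if the kernel boots when the first
 * n_from_new objects are taken from the suspect (new compiler) tree
 * and all remaining objects from the known-good tree. */
typedef int (*boots_fn)(size_t n_from_new, void *ctx);

/* Bisect for the single miscompiled object.
 * Assumes boots(0) succeeds, boots(count) fails, and one culprit exists.
 * Returns the index of the object whose swap turns a good kernel bad. */
size_t find_miscompiled(size_t count, boots_fn boots, void *ctx)
{
    assert(count > 0 && boots != NULL);
    size_t lo = 0;      /* first lo objects from the new tree: known to boot */
    size_t hi = count;  /* first hi objects from the new tree: known to fail */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (boots(mid, ctx))
            lo = mid;
        else
            hi = mid;
    }
    return hi - 1;
}

/* Stand-in for a real build-and-reboot test: ctx points at the index of
 * the pretend culprit, and the "kernel" boots only while that object
 * still comes from the known-good tree. */
int demo_boots(size_t n_from_new, void *ctx)
{
    size_t culprit = *(size_t *)ctx;
    return n_from_new <= culprit;
}
```

With a few hundred object files this needs on the order of ten rebuild-and-reboot cycles rather than one per file, at the cost of assuming a single culprit.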

> 
> > BTW, you aren't using gold, are you?
> 
> I don't know what gold is either.
> 

Please show the output of

# ld -V

Comment 7 Bruce Dubbs 2010-05-15 04:06:18 UTC
(In reply to comment #6)

> You build 2 kernel trees, one with gcc 4.5.1 and one
> with gcc 4.4.4. You copy binaries from one tree to
> another and rebuild kernel one file at a time until
> you find the single file which was miscompiled.

OK.  That will take some time.  I'll get back to you when I find the problem file.
 
> > > BTW, you aren't using gold, are you?
> > 
> > I don't know what gold is either.
 
> Please show the output of
> # ld -V

 GNU ld (GNU Binutils) 2.20.1.20100303
  Supported emulations:
   elf_x86_64
   elf_i386
   i386linux
   elf_l1om
Comment 8 Bruce Dubbs 2010-05-16 05:55:11 UTC
Created attachment 20671 [details]
Linux kernel configuration that fails with gcc-4.5
Comment 9 Bruce Dubbs 2010-05-16 06:38:38 UTC
I have traced the problem file for this bug to the kernel file

arch/x86/kernel/tsc.c

I have two source trees for the 2.6.33.4 kernel: one compiled with gcc-4.4.1, which works, and one compiled with gcc version 4.5.1 20100514 (prerelease), which fails.  I have attached the config file that generates the error.

I traced the problem to the above file, but don't know enough about the kernel and gcc internals to either parse it down much further or fix the problem.  When I copy arch/x86/kernel/tsc.o from the 4.4.1 build to the 4.5.1 kernel tree and rebuild, the system (relatively generic Dell x86_64) boots properly.

The file build command is:

gcc -Wp,-MD,arch/x86/kernel/.tsc.o.d  -nostdinc -isystem /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/include -I/sources/linux-2.6.33.4-gcc45/arch/x86/include -Iinclude  -include include/generated/autoconf.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -Os -m64 -march=core2 -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=2048 -fno-stack-protector -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fno-dwarf2-cfi-asm -fconserve-stack -fno-stack-protector   -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(tsc)"  -D"KBUILD_MODNAME=KBUILD_STR(tsc)"  -c -o arch/x86/kernel/tsc.o arch/x86/kernel/tsc.c

When I tried removing the -Os option, it did not fix the problem.

I'm guessing the problem is around lines 724-761 because of the ifdef CONFIG_X86_64 and the use of rdtsc_barrier() in the panic info above.

I'm really at the limits of what I know how to do here.  I think I rebuilt/rebooted a hundred times to narrow things down so far.  Let me know if I can do anything else.

Many Bothans died to bring us this information.
Comment 10 Kevin Bowling 2010-05-19 08:33:22 UTC
Seeing Bruce's symptoms here, no relation to -Os vs -O2.  gcc-4.5.0 on Gentoo ~amd64.  
Comment 11 Bruce Dubbs 2010-05-24 06:32:36 UTC
Updated to gcc (GCC) 4.5.1 20100524 (prerelease) but still have the problem.

There is something about -Os that triggers the kernel panic in arch/x86/kernel/tsc.c

I tried to disable all -O2 options after -Os and the kernel still fails.

gcc -Wp,-MD,arch/x86/kernel/.tsc.o.d  -nostdinc -isystem \
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/include \
-I/sources/linux-2.6.33.4-gcc45/arch/x86/include -Iinclude  -include \
include/generated/autoconf.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes \
-Wno-trigraphs -fno-strict-aliasing -fno-common \
-Werror-implicit-function-declaration -Wno-format-security \
-fno-delete-null-pointer-checks -Os \
-fno-thread-jumps \
-fno-caller-saves \
-fno-crossjumping \
-fno-cse-follow-jumps  \
-fno-cse-skip-blocks \
-fno-delete-null-pointer-checks \
-fno-expensive-optimizations \
-fno-gcse  \
-fno-gcse-lm  \
-fno-inline-small-functions \
-fno-indirect-inlining \
-fno-ipa-sra \
-fno-optimize-sibling-calls \
-fno-peephole2 \
-fno-regmove \
-fno-rerun-cse-after-loop  \
-fno-sched-interblock  \
-fno-sched-spec \
-fno-schedule-insns  \
-fno-schedule-insns2 \
-fno-strict-overflow \
-fno-tree-switch-conversion \
-fno-tree-pre \
-fno-tree-vrp \
-fno-align-functions  \
-fno-align-jumps \
-fno-align-loops  \
-fno-align-labels \
-fno-reorder-blocks  \
-fno-strict-aliasing \
-fno-reorder-blocks \
-m64 -march=core2 -mno-red-zone \
-mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 \
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare \
-fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow \
-Wframe-larger-than=2048 -fno-stack-protector -fno-omit-frame-pointer \
-fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign \
-fno-strict-overflow -fno-dwarf2-cfi-asm -fconserve-stack -fno-stack-protector \
 -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(tsc)" \
 -D"KBUILD_MODNAME=KBUILD_STR(tsc)"  -c -o arch/x86/kernel/tsc.o \
 arch/x86/kernel/tsc.c

Changing -Os to -O2 does *not* fail, even with the -fno options in the single column above removed.
Comment 12 H. Peter Anvin 2010-05-27 01:23:48 UTC
I'm assuming this is current Linus git (post 2.6.34).

For the current merge window we merged a single instance of using the new "asm goto" feature when compiling on gcc 4.5+; this is in fact exactly in the TSC code, in the form of the new construct static_cpu_has() defined in arch/x86/include/asm/cpufeature.h.

It's somewhat curious what is happening with the -Os build there.  All the relevant subfunctions are annotated must_inline.

Comment 13 itosre 2010-05-30 15:09:06 UTC
(In reply to comment #12)
> I'm assuming this is current Linus git (post 2.6.34).

I'm guessing from  

>Tested on several kernels:  2.6.32.8, 2.6.33.4, 2.6.34-rc7

That this isn't the case. Strangely enough, it works for me with 2.6.33* but not with 2.6.34 and 2.6.34-git*.

Even without Os (in which case 2.6.34 boots), Make fails to xmalloc on a 2.6.34+ kernel and segfaults, perhaps from compiling libc with 4.5.0? Perhaps another bug entirely.
Comment 14 Bruce Dubbs 2010-05-31 00:41:43 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > I'm assuming this is current Linus git (post 2.6.34).
> 
> I'm guessing from  
> 
> >Tested on several kernels:  2.6.32.8, 2.6.33.4, 2.6.34-rc7
> 
> That this isn't the case. 

Exactly right.  I'm using 2.6.33.4 for consistency.

I did look at arch/x86/kernel/tsc.c and modified it to remove the lone goto, but the kernel panic upon boot persists.

This problem does seem to be specific to the combination of building on an x86_64 and using -Os.
Comment 15 H. Peter Anvin 2010-05-31 01:04:40 UTC
OK, thanks for confirming that it is not related to asm goto.
Comment 16 Andi Kleen 2010-06-18 21:11:48 UTC
This turned out to be a kernel bug: rdtsc_barrier() needed to be marked
__always_inline, otherwise gcc would not inline this function.

(although it's slightly fishy for gcc to not inline a function that
only has two inline assembler statements even with -Os)

http://lkml.org/lkml/2010/6/18/317
Comment 17 Bruce Dubbs 2010-06-19 00:05:00 UTC
I can confirm that changing inline to __always_inline in arch/x86/include/asm/system.h fixed the panic for me.

I'm not sure if this fix is the result of an error in the kernel or gcc.
Leaving the bug open for now, but it may be appropriate to close it. 
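The difference between the two annotations can be sketched as follows. This is not the kernel's code: `FORCE_INLINE`, `barrier_hint`, `barrier_forced` and the two loader functions are hypothetical names, and an empty asm with a "memory" clobber stands in for the real barrier. It assumes a compiler that accepts the GCC `always_inline` attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name for illustration; it uses the GCC
 * always_inline attribute. */
#define FORCE_INLINE inline __attribute__((always_inline))

/* Plain "inline" is only a hint. If the size estimate for the body is
 * large, a compiler optimizing for size may emit an out-of-line copy
 * and call it instead. */
static inline void barrier_hint(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* always_inline asks the compiler to inline the body at every call
 * site regardless of its size estimate. */
static FORCE_INLINE void barrier_forced(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Caller that may or may not end up containing a call instruction. */
static inline int load_after_hint(const volatile int *p)
{
    assert(p != NULL);
    barrier_hint();
    return *p;
}

/* Caller whose barrier is always expanded in place. */
static FORCE_INLINE int load_after_forced(const volatile int *p)
{
    assert(p != NULL);
    barrier_forced();
    return *p;
}
```

Both loaders return the same value; they differ only in whether the barrier is guaranteed to be expanded at the call site, which matters when the caller must not call into other code.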
Comment 18 Kevin Bowling 2010-06-19 00:07:32 UTC
Does it make sense for the default and distros to use an -Os kernel with modern systems (i.e. 2M-8M cache)?  If gcc 4.5 won't inline this at -Os, I wonder what other bad decisions are being made.
Comment 19 Andi Kleen 2010-06-19 07:39:34 UTC
Technically I would say it was a kernel bug.

gcc can't really know how many instructions are there inside inline asm
so it's probably very conservative in its estimation.

So I would recommend to close it. Possibly another bug could be opened
for better estimation of inline asm instructions, but that's really
a separate problem.
Comment 20 Richard Biener 2010-06-19 10:37:03 UTC
GCC improved its estimation of asm size, from the all-asms-are-size-1
assumption we used in 4.4 to the count-the-number-of-lines estimate that is
also used by the RTL optimizers.
Comment 21 Andi Kleen 2010-06-19 12:23:13 UTC
This is the inline (after preprocessor) 

I guess the many asm meta commands confuse the heuristic. Maybe it could
be fixed to ignore such commands.

static inline void rdtsc_barrier(void)
{
 asm volatile ("661:\n\t" ".byte 0x66,0x66,0x90\n" "\n662:\n" ".section .altinstructions,\"a\"\n" " " ".balign 8" " " "\n" " " ".quad" " " "661b\n" " " ".quad" " " "663f\n" "   .byte " "(3*32+17)" "\n" "      .byte 662b-661b\n" "    .byte 664f-663f\n" "    .byte 0xff + (664f-663f) - (662b-661b)\n" ".previous\n" ".section .altinstr_replacement, \"ax\"\n" "663:\n\t" "mfence" "\n664:\n" ".previous" : : : "memory");
 asm volatile ("661:\n\t" ".byte 0x66,0x66,0x90\n" "\n662:\n" ".section .altinstructions,\"a\"\n" " " ".balign 8" " " "\n" " " ".quad" " " "661b\n" " " ".quad" " " "663f\n" "   .byte " "(3*32+18)" "\n" "      .byte 662b-661b\n" "    .byte 664f-663f\n" "    .byte 0xff + (664f-663f) - (662b-661b)\n" ".previous\n" ".section .altinstr_replacement, \"ax\"\n" "663:\n\t" "lfence" "\n664:\n" ".previous" : : : "memory");
}
Comment 22 Richard Biener 2010-06-19 12:58:58 UTC
(In reply to comment #21)
> This is the inline (after preprocessor) 
> 
> I guess the many asm meta commands confuse the heuristic. Maybe it could
> be fixed to ignore such commands.
> 
> static inline void rdtsc_barrier(void)
> {
>  asm volatile ("661:\n\t"
 ".byte 0x66,0x66,0x90\n"
 "\n
662:\n"
 ".section .altinstructions,\"a\"\n"
 " " ".balign 8" " " "\n"
" " ".quad" " " "661b\n"
 " " ".quad" " " "663f\n"
 "   .byte " "(3*32+17)" "\n"
 "      .byte 662b-661b\n"
 "    .byte 664f-663f\n"
 "    .byte 0xff + (664f-663f) - (662b-661b)\n"
> ".previous\n"
 ".section .altinstr_replacement, \"ax\"\n"
 "663:\n
\t" "mfence" "\n664:\n"
 ".previous" : : : "memory");

that's 16 lines alone and the call stmt removal doesn't result in so
much code-saving that the code size would not increase (which is what
-Os is about - do _not_ increase code-size, not do increase it only
a little).  If I read the above asm correctly it will result in a
single instruction?  We could add support for annotating asm()s
with a size, though that's probably just another source of possible
errors.

The size estimation used is final.c:asm_str_count(), it is currently
not architecture specific.

It could be improved to disregard vertical space and label-only
lines (though even that is tricky to do with respect to the various
assemblers we try to support).
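A line-counting estimate of the kind described here can be sketched as below. This is a simplified illustration, not the code in final.c: `asm_line_estimate` is a hypothetical name, and treating ';' as a statement separator alongside '\n' is an assumption, since the real separator depends on the target assembler.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the size of an asm template by counting its lines.
 * An empty template counts as 0; otherwise the estimate is one more
 * than the number of separators ('\n' or ';') in the string.
 * Labels, directives and blank lines are counted like instructions,
 * which is the weakness discussed in this comment. */
int asm_line_estimate(const char *templ)
{
    assert(templ != NULL);
    if (*templ == '\0')
        return 0;

    int count = 1;
    for (; *templ != '\0'; templ++) {
        if (*templ == '\n' || *templ == ';')
            count++;
    }
    return count;
}
```

Under this estimate a bare "mfence" scores 1, while a template made mostly of labels and section directives, like the one quoted above, scores far higher than the instructions it actually emits.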
Comment 23 Andi Kleen 2010-06-19 13:13:13 UTC
It's two instructions with some metadata that controls patching these
instructions depending on the CPU capabilities.

Detecting that for gcc would be likely hard.

What would have also prevented this problem would have been a way
to say "any call to another section (as defined by attribute((section)))
is an error".

The -Os heuristics are a general problem for the kernel; it would really like to have -Os without the really bad bits (something more like a -Omostly-small).

That's one of the reasons I would like to have the better ways
to annotate for hot/cold that I suggested recently.