Summary: | [4.1 Regression] gcc-4.1 compiled ppc64 kernels do not boot | ||
---|---|---|---|
Product: | gcc | Reporter: | olh |
Component: | c | Assignee: | Andrew Pinski <pinskia> |
Status: | RESOLVED FIXED | ||
Severity: | critical | CC: | bergner, dje, gcc-bugs, hp, ian, pinskia, rguenth |
Priority: | P3 | Keywords: | alias, patch, wrong-code |
Version: | 4.1.0 | ||
Target Milestone: | 4.1.0 | ||
URL: | http://gcc.gnu.org/ml/gcc-patches/2005-11/msg00353.html | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | 2005-11-04 05:48:27 | |
Attachments: |
PR24644.tar.bz2
PR24644-2.tar.bz2 PR24644-3.tar.bz2 PR24644-4.tar.bz2 Patch to try |
Description
olh
2005-11-02 20:40:49 UTC
Created attachment 10115 [details] PR24644.tar.bz2 buildscripts, preprocessed files, kernel .config and debug patches, bootlog. building with -O1 does not fix it. Until we have a testcase that actually shows some evidence that the toolchain is at fault, there is nothing for us to investigate; it could equally well be that the kernel has a bug. copying the schedule.o from gcc41 tree to gcc40 tree doesnt help. What I have found so far with even more debugging: <4>Processor 1 found. <7>schedule(2883) swapper(0):c1,j4294892318 c0000000142ae100 <6>Brought up 2 CPUs <7>_spin_lock_irq(85) swapper(0):c1,j4294892318 l c0000000142ae100 c c0000000004db98c <7>schedule(2996) swapper(0):c1,j4294892318 p c00000000ffd7040 n c00000000ffd2040 r c0000000142ae100 <7>schedule(3002) swapper(0):c1,j4294892318 p c00000000ffd7040 n c00000000ffd2040 r c0000000142ae100 <7>schedule(3029) swapper(1):c0,j4294892318 p c00000000ffd7040 n c00000000ffd7810 r c0000000142a6100 t c0000000142a6100 <7>finish_task_switch(1539) swapper(1):c0,j4294892318 r c0000000142a6100 p c00000000ffd7040 <7>finish_lock_switch(297) swapper(1):c0,j4294892318 <7>_spin_unlock_irq(292) migration/1(5):c1,j4294892318 l c0000000142a6100 c c0000000004dbd24 <4>BUG: spinlock already unlocked on CPU#1, migration/1/5 schedule() line 2883 grabs the rq at c0000000142ae100 on cpu1, and calls spinlock. later in line 3029, it switches to cpu0 and rq becomes c0000000142a6100, this is after context_switch(), barrier(), finish_task_switch(), finish_lock_switch() For some reason, the _spin_unlock_irq() call from finish_lock_switch() happens on cpu1? But this should be no real problem, the real thing is, why changed rq from c0000000142ae100 to c0000000142a6100 between lines 3002 and 3029. Created attachment 10123 [details] PR24644-2.tar.bz2 rq does change between the context switch. <7>schedule(3025) swapper(0):c1,j4294892318 p c00000000ffd7040 n c00000000ffd2040 r c0000000142ae100 <7>schedule(3027) swapper(1):c0,j4294892318 p c00000000ffd7040 n c00000000ffd7810 r c0000000142a6100 c0000000142ae100 is for cpu1, while c0000000142a6100 is for cpu0. cpu1 was never active before, so all task switches were ok. Created attachment 10135 [details] PR24644-3.tar.bz2 revision 102096 works, revision 102098 doesnt. copying the sched.o from the 102098 tree into the 102096 tree doesnt work either. diff -purN gcc-mainline-music-r102096/ChangeLog gcc-mainline-music-r102098/ChangeLog --- gcc-mainline-music-r102096/ChangeLog 2005-07-16 02:30:53.000000000 +0000 +++ gcc-mainline-music-r102098/ChangeLog 2005-07-16 18:56:53.000000000 +0000 @@ -1,3 +1,53 @@ +2005-07-16 Danny Berlin <dberlin@dberlin.org> + Kenneth Zadeck <zadeck@naturalbridge.com> + + * Makefile.in: Added rules for ipa-pure-const.c, ipa-reference.c, + ipa-reference.h, ipa-utils.c, ipa-utils.h, ipa-type-escape.c, + ipa-type-escape.h, tree-promote-statics.c ... Adding myself to the CC list. Created attachment 10136 [details] PR24644-4.tar.bz2 object files and extracted asm functions for easier comparsion From IRC, the problem seems to be that gcc is copying the value out of a global register variable, which it was not doing before. This is not acceptable, as the global register may be changed on any context switch, and it is important to always use the current value each time it is dereferenced. Declaring the global register variable as volatile may fix the problem, although one then gets an annoying unconditional warning every time you run the compiler ("volatile register variables don't work as you might wish"). The volatile qualifier is recorded at the tree level, but it is not recorded at the RTL level. At the RTL level global registers are sometimes treated as volatile (e.g., hash_rtx in cse.c) and sometimes not (e.g., they are not noticed by volatile_refs_p, and thus are permitted in the input instructions in combine). So some fixes that come to mind are, in order of increasing difficulty: 1) Change the source code to declare the variable volatile, and provide an option to disable the warning. 2) Change global registers to be volatile by default in the C/C++ frontends, and move the warning to there. 3) Change the source code to declare the variable volatile, and fix the RTL level to handle volatile register variables correctly. I've determined why we're dying, but not sure who is at fault yet. While scanning through Olaf's assembly diff's, I noticed some code that normally wouldn't be a problem in user code, but is problamatic in PPC64 kernel code: New code: mr r24, r13 [snip] ld r9,360(r24) Old code: [snip] ld r9, 360(r13) The problem here is that r13 is a reserved register for the kernel (local_paca pointer) and can change values on a context switch if we migrate to another processor. so copying the value out of r13 is a no no. Given what Olaf was seeing, this seems to be the case here. The problamatic source line is from linux-2.6.X/include/asm-ppc64/paca.h: register struct paca_struct *local_paca asm ("r13"); Changing this to: register struct paca_struct * volatile local_paca asm ("r13"); gave this warning message: warning: volatile register variables don't work as you might wish It's only a warning and looking at the disassembly, the code looks good. Olaf, can you add the volatile above an try booting again? Talking with Paulus, he isn't too trilled about the warning message which will occur for every file that uses current (lots!) so we're still not sure if there is a different kernel fix/tweek for this to whether gcc needs to change or both. Yes, adding the volatile fixes it for me. Why does gcc make copies of r13 anyway, when it could just use it where needed? gcc is free to copy non-volatile variables. It does this in many places and for many reasons, particularly in the new SSA optimizers in 4.0 and later. I don't know the details of where it decided to copy the register in this test case, but presumably it seemed profitable to some heuristic. I expect that it could happen if the usage is in a loop, or used on both branches of an if conditional, etc. For completeness, here is a minimal test case that shows the problem we are having: bergner@vervain:~/olaf/PR24644-4> cat bar.c register struct paca_struct * local_paca asm("r13"); struct paca_struct { volatile long state; }; extern void foo(void); int migration_thread(void) { local_paca->state = 1; foo(); local_paca->state = 0; return 0; } bergner@vervain:~/olaf/PR24644-4> ~/gcc/dataflow-base/bin/gcc -m64 -O1 -S bar.c The code we get looks like: .migration_thread: mflr 0 std 29,-24(1) std 0,16(1) stdu 1,-144(1) mr 29,13 li 0,1 std 0,0(13) bl .foo nop li 0,0 std 0,0(29) li 3,0 addi 1,1,144 ld 0,16(1) mtlr 0 ld 29,-24(1) blr This looks like gcc thinks r13 will be trashed by the call to foo() so it copies it to a non volatile register for use after the call. Strange that this didn't happen before the patch found by Olaf. (In reply to comment #15) > This looks like gcc thinks r13 will be trashed by the call to foo() so it > copies it to a non volatile register for use after the call. Strange that this > didn't happen before the patch found by Olaf. Actually it is the opposite way. GCC thinks that r13 will not trash (clobber) it, so it copies it. I wonder why this happens as I think it should be marked as call clobber. Variable: local_paca, UID 1270, struct paca_struct *, is global, call clobbered, default def: local_paca_1 But we get: # TMT.6_8 = V_MAY_DEF <TMT.6_7>; # VUSE <local_paca_1>; foo (); I don't understand why because to me r13 should be able to change in foo so we should get V_MAY_DEF there. That testcase shows that we have a big problem, as local_paca (aka r13) should be marked as call clobbered. The problem is that ipa-static is looking at the variable because the front-end marked it as a static variable. Daniel Berlin has sugggested a fix, and I will test it with the reduced test case and then if you could test it with the full kernel that would be nice. Created attachment 10147 [details]
Patch to try
This patch should fix the issue in the reduced testcase but I don't know if it fixes it fully, Can one of you two test it with the kernel?
yes, this patch helps. tested with r106499. Hmm, my patch is not fully correct but the C++ front-end gets this correct. this patch works, tested with r106530 Fixed. Subject: Bug 24644 Author: pinskia Date: Wed Nov 9 12:33:59 2005 New Revision: 106693 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106693 Log: 2005-11-09 Andrew Pinski <pinskia@physics.uc.edu> PR c/24644 * dwarf2-out.c (add_name_and_src_coords_attributes): Don't add a linkage name for a variable if it a register variable. * c-decl.c (grokdeclarator): Global register variables should be set as PUBLIC. Modified: trunk/gcc/ChangeLog trunk/gcc/c-decl.c trunk/gcc/dwarf2out.c Subject: Bug 24644 Author: dje Date: Fri Nov 11 17:59:54 2005 New Revision: 106791 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106791 Log: PR 24644 * common.opt (Wvolatile-register-var): New. * varasm.c (make_decl_rtl): Only emit warning when option specified. Clarify warning message. * doc/invoke.texi (Wvolatile-register-var): Document new option. * doc/md.texi (copysign): Document standard named pattern. Modified: trunk/gcc/ChangeLog trunk/gcc/common.opt trunk/gcc/doc/invoke.texi trunk/gcc/doc/md.texi trunk/gcc/varasm.c |