Bug 48623

Summary: gcc 4.6.0 generates no code for a function with __attribute__((always_inline))
Product: gcc Reporter: Richard Weinberger <richard>
Component: c Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED INVALID    
Severity: major CC: matz
Priority: P3    
Version: 4.6.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2011-04-15 20:11:31
Attachments: Testcase for gcc 4.6.0
objdump of softirq.o
objdump of __local_bh_enable
preprocessed __local_bh_enable function
preprocessed softirq.c

Description Richard Weinberger 2011-04-15 14:15:38 UTC
gcc 4.6.0 builds a non-functional User Mode Linux kernel.
It seems to optimize away the function sub_preempt_count() used in kernel/softirq.c:__local_bh_enable().
Thus, the preempt counter gets out of balance and the kernel crashes.

A standalone test case is attached.
Just compile it with:
gcc testcase.c -Os -g -c -o testcase.o

Using objdump you can see that no code was generated for sub_preempt_count().
gcc 4.3, 4.4 and 4.5 generate code.
Without __attribute__((always_inline)) 4.6.0 produces code...

Thanks,
//richard
Comment 1 Richard Weinberger 2011-04-15 14:16:22 UTC
Created attachment 23995
Testcase for gcc 4.6.0
Comment 2 Richard Biener 2011-04-15 14:46:32 UTC
That is because current_thread_info() returns garbage (an address derived
from the address of a stack local).
Comment 3 Richard Biener 2011-04-15 14:52:04 UTC
Using this instead

static inline struct thread_info *current_thread_info(void)
{
  struct thread_info *ti;
  void *p;
  asm volatile ("" : "=r" (p) : "0" (&ti));
  ti = (struct thread_info *) (((unsigned long) p) & ~mask);
  return ti;
}

might confuse GCC enough and is still architecture independent.
Comment 4 Richard Weinberger 2011-04-15 17:34:33 UTC
Created attachment 24000
objdump of softirq.o
Comment 5 Richard Weinberger 2011-04-15 17:37:23 UTC
(In reply to comment #3)

It's not that easy.
Your trick solves the problem only for the test case.

Within the kernel, again no code is produced.
I have attached the objdump of the __local_bh_enable function.
See line 86.

Sorry for not providing a standalone test.

Here you can see the source code of __local_bh_enable, it's a pretty simple function.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=kernel/softirq.c;h=174f976c2874a19f1d06fee972468e2c730bc7f9;hb=HEAD#l134

Thanks,
//richard
Comment 6 Richard Weinberger 2011-04-15 17:38:55 UTC
Created attachment 24001
objdump of __local_bh_enable
Comment 7 Richard Biener 2011-04-15 20:11:31 UTC
Please provide preprocessed source for the translation unit that has the
broken function.
Comment 8 Richard Weinberger 2011-04-15 20:26:03 UTC
Created attachment 24006
preprocessed __local_bh_enable function
Comment 9 Richard Weinberger 2011-04-15 20:26:48 UTC
Created attachment 24007
preprocessed softirq.c
Comment 10 Michael Matz 2011-04-15 22:03:05 UTC
You didn't change current_thread_info carefully enough as per
comment #3.  It still reads:

static inline __attribute__((always_inline)) struct thread_info *current_thread_info(void)
{
 struct thread_info *ti;
 void *p;
 unsigned long mask = ((1 << 0) * ((1UL) << 12)) - 1;
 asm volatile ("" : "=r" (p) : "0" (&ti));
 ti = (struct thread_info *) (((unsigned long) &ti) & ~mask);
 return ti;
}

You have to make use of 'p' of course.  Your return value still is based
on &ti.
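
For reference, the function with the change from comment #3 applied in full
might look like the sketch below. It keeps the mask from the preprocessed
source quoted above and derives the return value from p rather than from &ti.
The body of struct thread_info and the helper check_thread_info_alignment()
are hypothetical stand-ins so that the snippet compiles on its own, and
__asm__ __volatile__ is only the GNU alternate spelling of asm volatile.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; the real struct thread_info lives in the kernel headers. */
struct thread_info {
    int preempt_count;
};

static inline __attribute__((always_inline)) struct thread_info *current_thread_info(void)
{
    struct thread_info *ti;
    void *p;
    unsigned long mask = ((1 << 0) * ((1UL) << 12)) - 1;   /* 4096 - 1 */

    /* Empty asm: the output p is tied to the input &ti, so p holds the
       stack address, but GCC can no longer see where the value came from. */
    __asm__ __volatile__ ("" : "=r" (p) : "0" (&ti));

    /* Use p here, not &ti. */
    ti = (struct thread_info *) (((unsigned long) p) & ~mask);
    return ti;
}

/* Hypothetical sanity check: the result should be aligned to the 4096-byte mask. */
static void check_thread_info_alignment(void)
{
    uintptr_t base = (uintptr_t) current_thread_info();
    assert((base & 4095u) == 0);
}
```

Whether this is enough for every GCC version is not something this report
establishes; comment #11 only confirms that it yields a working UML kernel
with 4.6.0.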
Comment 11 Richard Weinberger 2011-04-15 22:27:19 UTC
(In reply to comment #10)
> You have to make use of 'p' of course.  Your return value still is based
> on &ti.

Damn, you're right!

Now gcc produces a functional UML kernel. :-)

I fear current_thread_info() is not the only function in the kernel
which returns a local stack address.

Thanks,
//richard