Bug 41246

Summary: should "sorry" when regparm=3 and nested functions are encountered
Product: gcc Reporter: Robert Millan <rmh.gcc>
Component: targetAssignee: Richard Henderson <rth>
Status: RESOLVED FIXED    
Severity: normal CC: gcc-bugs, rth
Priority: P3 Keywords: wrong-code
Version: 4.5.0   
Target Milestone: ---   
Host: Target: i?86-*-*
Build: Known to work:
Known to fail: Last reconfirmed: 2009-09-14 20:40:23
Attachments: test case
gcc45-pr41246.patch
patch erroring out
trampoline push, version 1

Description Robert Millan 2009-09-03 15:32:57 UTC
On i386, combining -regparm=3 with nested functions results in %ecx (holding the third function parameter because of -regparm=3) being overwritten with a temporary pointer used by GCC to implement the nested function call.

GNU GRUB has had a workaround (disabling -regparm in a case-by-case basis) to this bug for a long while.  It's only been brought up to me recently that we had this problem (I'm one of the GRUB maintainers).

One of the GRUB developers (Bean) produced a test case.  I'm attaching it here.
Comment 1 Robert Millan 2009-09-03 15:34:31 UTC
Created attachment 18477 [details]
test case

$ gcc -m32 regparm_nested.c && ./a.out
10 31 31

$ gcc -m32 -mregparm=3 regparm_nested.c && ./a.out
10 31 -3957600
Comment 2 Jakub Jelinek 2009-09-03 16:29:17 UTC
Created attachment 18480 [details]
gcc45-pr41246.patch

I'm afraid there is nothing that can make that unmodified testcase work for -mregparm=3.
It is true that gcc is buggy and doesn't force regparm(2) for nested functions with -mregparm=3 (regparm(3) is impossible, due to the static chain in %ecx), we only error if explicit __attribute__((regparm (3))) is used on nested functions.
But if we silently change the call to regparm(2), there is no way for the compiler to know whether the hook is using regparm(3) or regparm(2) calling convention (if you pass it a non-nested function, it will be regparm(3), if you pass a nested function, it will be regparm(2).  So, perhaps we should just error out whenever we see in -m32 mode on i?86/x86_64 a nested function in -mregparm=3 mode.
Comment 3 Richard Biener 2009-09-03 16:40:12 UTC
Yes, I think we should sorry() here.
Comment 4 Robert Millan 2009-09-03 16:47:24 UTC
(In reply to comment #2)
> So, perhaps we should just
> error out whenever we see in -m32 mode on i?86/x86_64 a nested function in
> -mregparm=3 mode.

Error out is much nicer.  I can recall nasty memory corruption bugs that could've been solved in a blast if GCC rejected our code :-)
Comment 5 Paolo Bonzini 2009-09-03 17:11:30 UTC
Yes, if you use nested functions you can just use -mregparm=2 globally, that's a much better solution.
Comment 6 Paolo Bonzini 2009-09-03 17:15:26 UTC
Created attachment 18482 [details]
patch erroring out
Comment 7 Jakub Jelinek 2009-09-03 17:19:39 UTC
The comment needs adjusting...
Comment 8 Uroš Bizjak 2009-09-03 17:21:50 UTC
(In reply to comment #3)
> Yes, I think we should sorry() here.

Why sorry? It is an error.

Using sorry, the compiler will say: "sorry, unimplemented: <something>" which is just confusing.
Comment 9 Richard Biener 2009-09-03 17:28:29 UTC
We could implement it by not using %ecx for nested function support, so it's
not implemented vs. an error (IMHO).  AFAIK that we use %ecx is completely
internal as the nested function cannot be exported.
Comment 10 Richard Biener 2009-09-03 17:30:01 UTC
Or similar just using -mregparm=2 for that nested function.  I doubt we specify
the ABI for them somewhere.
Comment 11 graham.stott 2009-09-03 18:01:48 UTC
Subject: Re:  should "sorry" when regparm=3 and nested functions are encountered

All,

nested functions get passed a hidden argumment akin to static link/display
so that nested function can access the locals of its enclosing function.

Passing a nested function as parameter to another function isn't going 
to work correctly when the function is eventually called the
hidden argument isn't going to be setup correctly.

I think the code is thus invalid and GCC should reject it.

Regards
Graham

--- On Thu, 3/9/09, bonzini at gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote:

> From: bonzini at gnu dot org <gcc-bugzilla@gcc.gnu.org>
> Subject: [Bug target/41246] should "sorry" when regparm=3 and nested functions are encountered
> To: gcc-bugs@gcc.gnu.org
> Date: Thursday, 3 September, 2009, 6:11 PM
> 
> 
> ------- Comment #5 from bonzini at gnu dot org 
> 2009-09-03 17:11 -------
> Yes, if you use nested functions you can just use
> -mregparm=2 globally, that's
> a much better solution.
> 
> 
> -- 
> 
> bonzini at gnu dot org changed:
> 
>            What 
>   |Removed           
>          |Added
> ----------------------------------------------------------------------------
>          
>    Status|UNCONFIRMED     
>            |NEW
>      Ever Confirmed|0   
>                
>        |1
>    Last reconfirmed|0000-00-00
> 00:00:00         |2009-09-03
> 17:11:30
>            
>    date|         
>                
>   |
>             Summary|%ecx
> corruption when        |should "sorry"
> when
>                
>    |combining -regparm=3
> with   |regparm=3 and nested
>                
>    |nested functions     
>       |functions are encountered
> 
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41246
> 
>


Comment 12 Andrew Pinski 2009-09-03 18:05:23 UTC
(In reply to comment #11)
> Subject: Re:  should "sorry" when regparm=3 and nested functions are
> encountered
> 
> All,
> 
> nested functions get passed a hidden argumment akin to static link/display
> so that nested function can access the locals of its enclosing function.
> 
> Passing a nested function as parameter to another function isn't going 
> to work correctly when the function is eventually called the
> hidden argument isn't going to be setup correctly.
> 
> I think the code is thus invalid and GCC should reject it.

It is valid as GCC creates a trampoline for the function pointer ...
Comment 13 Vladimir 'phcoder' Serbinenko 2009-09-05 20:34:52 UTC
(In reply to comment #2)
> Created an attachment (id=18480) [edit]
> gcc45-pr41246.patch
> 
> I'm afraid there is nothing that can make that unmodified testcase work for
> -mregparm=3.
> It is true that gcc is buggy and doesn't force regparm(2) for nested functions
> with -mregparm=3 (regparm(3) is impossible, due to the static chain in %ecx),
Could this chain perhaps be moved to %edx?
> we only error if explicit __attribute__((regparm (3))) is used on nested
> functions.
> But if we silently change the call to regparm(2), there is no way for the
> compiler to know whether the hook is using regparm(3) or regparm(2) calling
> convention 
If you pass a pointer to nested function you actually pass a pointer to a trampoline. Could trampoline just push %ecx as its first operation? this way trampoline would "convert" regparm(2) to regparm(3) function
(if you pass it a non-nested function, it will be regparm(3), if you
> pass a nested function, it will be regparm(2).  So, perhaps we should just
> error out whenever we see in -m32 mode on i?86/x86_64 a nested function in
> -mregparm=3 mode.
> 

Comment 14 Richard Henderson 2009-09-09 21:23:35 UTC
Not working on it yet, but I think it should be possible to have the
static chain pushed to the stack by the trampoline.  Direct calls to
the nested function would have to pass the static chain in a call
saved register.  The nested function would have to be emitted like:

nested_func:
  push %esi
.Lnested_func_tramp_entry:
  push %ebp
  movl %esp, %ebp
  ...

trampoline:
  pushl $static_chain
  jmpl .Lnested_func_tramp_entry

Thankfully, "pushl $const" and "movl $const, reg" are the same size.

This involves adjusting a number of elimination offsets, rearranging the
middle-end such that it allows different static chain locations based on
the target function, rearranging the INITIALIZE_TRAMPOLINE macro so that
it passes in the target function.  Hopefully with all the macros turned
into proper target hooks at the same time.

The one potential problem with this approach is that -fno-omit-frame-pointer
backtraces break.  However, dwarf2 unwinding would still work.
Comment 15 Richard Henderson 2009-09-09 21:28:58 UTC
(In reply to comment #13)
> Could this chain perhaps be moved to %edx?

No, regparm(3) uses all of %eax, %edx, %ecx.  There are no other
available call-clobbered registers.

Incidentally, fastcall functions are currently broken as well,
though for them %eax is available as a scratch.

> Could trampoline just push %ecx as its first operation? this way
> trampoline would "convert" regparm(2) to regparm(3) function

No.  Any change that affects the stack cannot be done in the 
trampoline only, as the stack change needs to be undone in 
the nested function itself.
Comment 16 Richard Henderson 2009-09-11 23:15:31 UTC
Mine.
Comment 17 Richard Henderson 2009-09-12 00:00:07 UTC
Created attachment 18570 [details]
trampoline push, version 1

Here's a lightly tested patch that implements the idea in comment #14.
Will those that care about this topic please give this a workout over
the weekend, and I'll finish it up next week.
Comment 18 Richard Henderson 2009-09-16 16:27:11 UTC
Subject: Bug 41246

Author: rth
Date: Wed Sep 16 16:26:55 2009
New Revision: 151762

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151762
Log:
PR target/41246
        * tree-cfg.c (verify_gimple_call): Validate that
        * gimple_call_chain
        is set only if DECL_NO_STATIC_CHAIN is unset.
        * tree-nested.c (iter_nestinfo_start, iter_nestinfo_next): New.
        (FOR_EACH_NEST_INFO): New.
        (walk_all_functions): Use it.
        (finalize_nesting_tree): Likewise.
        (unnest_nesting_tree): Likewise.
        (free_nesting_tree): Use iter_nestinfo_start, iter_nestinfo_next.
        (get_chain_decl, get_chain_field): Reset DECL_NO_STATIC_CHAIN.
        (convert_gimple_call): Early out if gimple_call_chain already set.
        (convert_all_function_calls): Iterate until no new functions
        require a static chain.
        (finalize_nesting_tree_1): Assert DECL_NO_STATIC_CHAIN is unset
        when building a trampoline.  Use dump_function_to_file instead
        of dump_function.
        (lower_nested_functions): Open dump_file.  Validate that decls
        that have DECL_NO_STATIC_CHAIN from the front end don't have that
        bit reset by this pass.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-cfg.c
    trunk/gcc/tree-nested.c

Comment 19 Richard Henderson 2009-09-22 15:12:06 UTC
Subject: Bug 41246

Author: rth
Date: Tue Sep 22 15:11:37 2009
New Revision: 151983

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151983
Log:
	PR target/41246
	* target.h (struct gcc_target): Add asm_out.trampoline_template,
	calls.static_chain, calls.trampoline_init,
	calls.trampoline_adjust_address.
	* target-def.h (TARGET_ASM_TRAMPOLINE_TEMPLATE): New.
	(TARGET_STATIC_CHAIN, TARGET_TRAMPOLINE_INIT): New.
	(TARGET_TRAMPOLINE_ADJUST_ADDRESS): New.
	* builtins.c (expand_builtin_setjmp_receiver): Use
	targetm.calls.static_chain; only clobber registers.
	(expand_builtin_init_trampoline): Use targetm.calls.trampoline_init;
	set up memory attributes properly for the trampoline block.
	(expand_builtin_adjust_trampoline): Use
	targetm.calls.trampoline_adjust_address.
	* calls.c (prepare_call_address): Add fndecl argument.  Use
	targetm.calls.static_chain.
	* df-scan.c (df_need_static_chain_reg): Remove.
	(df_get_entry_block_def_set): Use targetm.calls.static_chain;
	consolodate static chain handling.
	* doc/tm.texi: Document new hooks.
	* emit-rtl.c (static_chain_rtx, static_chain_incoming_rtx): Remove.
	(init_emit_regs): Don't initialize them.
	* expr.h (prepare_call_address): Update decl.
	* final.c (profile_function): Use targetm.calls.static_chain.
	* function.c (expand_function_start): Likewise.
	* rtl.h (static_chain_rtx, static_chain_incoming_rtx): Remove.
	* stmt.c (expand_nl_goto_receiver): Use targetm.calls.static_chain;
	only clobber registers.
	* targhooks.c (default_static_chain): New.
	(default_asm_trampoline_template, default_trampoline_init): New.
	(default_trampoline_adjust_address): New.
	* targhooks.h: Declare them.
	* varasm.c (assemble_trampoline_template): Use
	targetm.asm_out.trampoline_template.  Make the memory block const
	and set its size.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtins.c
    trunk/gcc/calls.c
    trunk/gcc/config/darwin.h
    trunk/gcc/config/netbsd.h
    trunk/gcc/df-scan.c
    trunk/gcc/doc/tm.texi
    trunk/gcc/emit-rtl.c
    trunk/gcc/expr.h
    trunk/gcc/final.c
    trunk/gcc/function.c
    trunk/gcc/rtl.h
    trunk/gcc/stmt.c
    trunk/gcc/target-def.h
    trunk/gcc/target.h
    trunk/gcc/targhooks.c
    trunk/gcc/targhooks.h
    trunk/gcc/varasm.c

Comment 20 Richard Henderson 2009-09-22 15:17:02 UTC
Subject: Bug 41246

Author: rth
Date: Tue Sep 22 15:16:49 2009
New Revision: 152018

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152018
Log:
	PR target/41246
	* config/i386/i386.c (ix86_function_regparm): Do not issue an
	error for nested functions with regparm=3.
	(ix86_compute_frame_layout): Adjust frame pointer offset for
	ix86_static_chain_on_stack.
	(ix86_expand_prologue): Handle ix86_static_chain_on_stack.
	(ix86_emit_restore_reg_using_pop): Increment ix86_cfa_state->offset,
	don't reset to UNITS_PER_WORD.
	(ix86_emit_leave): Adjust ix86_cfa_state.
	(ix86_expand_epilogue): Handle ix86_static_chain_on_stack.
	(ix86_static_chain): New.
	(ix86_trampoline_init): Rename from x86_initialize_trampoline;
	make static; update for target hook parameters; use ix86_static_chain.
	(TARGET_STATIC_CHAIN, TARGET_TRAMPOLINE_INIT): New.
	* config/i386/i386.h (STATIC_CHAIN_REGNUM): Remove.
	(INITIALIZE_TRAMPOLINE): Remove.
	(TRAMPOLINE_SIZE): Use 24 for 64-bit.
	(struct machine_function): Use BOOL_BITFIELD; rearrange bitfields
	to the end.  Add static_chain_on_stack.
	(ix86_static_chain_on_stack): New.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h

Comment 21 Richard Henderson 2009-09-22 15:21:51 UTC
Fixed.