This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix AMD64 handling of functions with huge stack frames


> Hi!
> 
> The following testcases ICE on AMD64.
> I haven't handled indirect sibcall case with huge stack frames yet,
> since current GCC cannot generate such functions (in such functions %r11
> register is already used by the sibcall jump and thus cannot be used for the
> temporary in epilogue).
> The testcases are compile, not execute resp. gcc.dg, since in order to run
> them, one needs (on 64-bit Linux) at least
> echo 1 > /proc/sys/vm/overcommit_memory 
> and
> ulimit -s unlimited
> before executing the testcase.
> 
> Ok to commit?
> 
> 2003-10-23  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* config/i386/i386.c (ix86_expand_prologue): Handle stack
> 	adjustement bigger than 2GB.
> 	(epilogue_adjust_stack): New function.
> 	(ix86_expand_epilogue): Use it.

Actually hammer branch has similar patch.  I realize now that it appears
to be missing on mainline for some reason.  Sorry for that, it would
probably save some of your time.

>      ;
>    else if (! TARGET_STACK_PROBE || allocate < CHECK_STACK_LIMIT)
>      {
> -      insn = emit_insn (gen_pro_epilogue_adjust_stack
> -			(stack_pointer_rtx, stack_pointer_rtx,
> -			 GEN_INT (-allocate)));
> +      rtx disp = GEN_INT (-allocate);
> +      if (! TARGET_64BIT || x86_64_immediate_operand (disp, DImode))
> +	{
> +	  insn = emit_insn (gen_pro_epilogue_adjust_stack
> +			    (stack_pointer_rtx, stack_pointer_rtx, disp));
> +	}
> +      else
> +	{
> +	  /* Most of call used registers are used for parameter passing,
> +	     r10 is used for static chain but r11 shouldn't be live during
> +	     prologue.  */
> +	  rtx r11 = gen_rtx_REG (DImode, FIRST_REX_INT_REG + 3 /* R11 */);
> +	  insn = emit_insn (gen_rtx_SET (DImode, r11, disp));
> +	  RTX_FRAME_RELATED_P (insn) = 1;
> +	  insn = emit_insn (gen_pro_epilogue_adjust_stack_rex64
> +			    (stack_pointer_rtx, stack_pointer_rtx, r11));
> +	}
>        RTX_FRAME_RELATED_P (insn) = 1;
>      }
>    else
> @@ -5145,6 +5159,26 @@ ix86_emit_restore_regs_using_mov (rtx po
>        }
>  }
>  
> +static void
> +epilogue_adjust_stack (rtx dest, rtx src, rtx offset, int style)
> +{
> +  if (! TARGET_64BIT || x86_64_immediate_operand (offset, DImode))
> +    emit_insn (gen_pro_epilogue_adjust_stack (dest, src, offset));
> +  else
> +    {
> +      rtx r11;
> +      /* r11 is used by indirect sibcall return as well, set before the
> +	 epilogue and used after the epilogue.  ATM indirect sibcall
> +	 shouldn't be used together with huge frame sizes in one
> +	 function because of the frame_size check in sibcall.c.  */
> +      if (style == 0)
> +	abort ();
> +      r11 = gen_rtx_REG (DImode, FIRST_REX_INT_REG + 3 /* R11 */);
> +      emit_insn (gen_rtx_SET (DImode, r11, offset));
> +      emit_insn (gen_pro_epilogue_adjust_stack_rex64 (dest, src, r11));
> +    }
> +}

The hammer branch version embeddes the register load in
pro_epilogue_adjust_stack_expander, this seem to avoid new for
duplicating the code..

> --- gcc/config/i386/i386.md.jj	2003-10-23 09:55:07.000000000 +0200
> +++ gcc/config/i386/i386.md	2003-10-23 11:39:41.000000000 +0200
> @@ -17830,9 +17830,9 @@
>     (set_attr "mode" "SI")])
>  
>  (define_insn "pro_epilogue_adjust_stack_rex64"
> -  [(set (match_operand:DI 0 "register_operand" "=r,r")
> -	(plus:DI (match_operand:DI 1 "register_operand" "0,r")
> -		 (match_operand:DI 2 "x86_64_immediate_operand" "e,e")))
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r")
> +	(plus:DI (match_operand:DI 1 "register_operand" "0,r,r")
> +		 (match_operand:DI 2 "x86_64_general_operand" "e,e,r")))
>     (clobber (reg:CC 17))
>     (clobber (mem:BLK (scratch)))]
>    "TARGET_64BIT"
> @@ -17862,7 +17862,7 @@
>      }
>  }
>    [(set (attr "type")
> -	(cond [(eq_attr "alternative" "0")
> +	(cond [(eq_attr "alternative" "0,2")

I was trying to do the same on hammer branch and it did hit problems
with unwind code not being able to realize that (plus (rsp) (reg2))
is (plus (rsp) (big_constant)), so I had to create new pattern
pro_epilogue_adjust_stack_rex64_2.  DOes this problem still remain in
mainline or it has been fixed?

In the case unwind info is OK, your patch is fine either with the change
moving R11 set to the expander or without that change.

Thank you!
Honza


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]