This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 4/7]: Ping3: Merge from Stack Branch - i386 backend changes
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: "Ye, Joey" <joey dot ye at intel dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Jan Hubicka <hubicka at ucw dot cz>, Richard Guenther <richard dot guenther at gmail dot com>, "Lu, Hongjiu" <hongjiu dot lu at intel dot com>, "Guo, Xuepeng" <xuepeng dot guo at intel dot com>, jason at redhat dot com
- Date: Wed, 28 May 2008 16:48:11 +0200
- Subject: Re: [PATCH 4/7]: Ping3: Merge from Stack Branch - i386 backend changes
- References: <BB577BF501703042AD7E08EADD8E514FF5DD84@pdsmsx415.ccr.corp.intel.com>
> Jan,
>
> Here is most recent i386 backend patch after Richard's review. Can you
> review it when you have time?
#define CAN_ELIMINATE(FROM, TO) \
- ((TO) == STACK_POINTER_REGNUM ? !frame_pointer_needed : 1)
+ (stack_realign_fp \
+ ? ((FROM) == ARG_POINTER_REGNUM && (TO) == HARD_FRAME_POINTER_REGNUM) \
+ || ((FROM) == FRAME_POINTER_REGNUM && (TO) == STACK_POINTER_REGNUM) \
+ : ((TO) == STACK_POINTER_REGNUM ? !frame_pointer_needed : 1))
I would prefer to have this offline (i.e. as an out-of-line function), with
some comments on which cases it handles. The condition is quite hard to parse.
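One possible out-of-line shape for the condition above, as a sketch only. The function name can_eliminate_sketch is hypothetical, and the register numbers and the two flags are placeholder stand-ins so the fragment compiles on its own; they are not the real i386 backend definitions. The logic is meant to mirror the patched macro term by term.

```c
#include <stdbool.h>

/* Placeholder register numbers (assumed values, for illustration only). */
enum
{
  HARD_FRAME_POINTER_REGNUM = 6,
  STACK_POINTER_REGNUM = 7,
  ARG_POINTER_REGNUM = 16,
  FRAME_POINTER_REGNUM = 20
};

/* Stand-ins for the state the patched macro reads. */
static bool stack_realign_fp;
static bool frame_pointer_needed;

/* Return true if register FROM may be eliminated in favor of TO.  */
static bool
can_eliminate_sketch (int from, int to)
{
  if (stack_realign_fp)
    /* When the stack is realigned through the frame pointer, only two
       eliminations are allowed: the argument pointer to the hard frame
       pointer, and the soft frame pointer to the stack pointer.  */
    return ((from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
            || (from == FRAME_POINTER_REGNUM && to == STACK_POINTER_REGNUM));

  /* Otherwise keep the original rule: eliminating to the stack pointer
     is possible only when no frame pointer is needed; every other
     elimination is always allowed.  */
  return to == STACK_POINTER_REGNUM ? !frame_pointer_needed : true;
}
```

CAN_ELIMINATE (FROM, TO) could then simply expand to a call to such a function, which leaves room for one comment per case.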
+/* Find an available register to be used as dynamic realign argument
+ pointer regsiter. Such a register will be written in prologue and
+ used in begin of body, so it must not be
+ 1. parameter passing register.
+ 2. GOT pointer.
+ For i386, we use CX if it is not used to pass parameter. Otherwise
+ we just pick DI.
+ For x86_64, we just pick R13 directly.
+
+ Return: the regno of choosed register. */
+
+static unsigned int
+find_drap_reg (void)
+{
+ int param_reg_num;
+
+ if (TARGET_64BIT)
+ return R13_REG;
The same trick as for alt_pic_regnum would probably work here too, to save
some REX prefixes, but it is not a big deal and could be handled
incrementally if we care.
+
+ if (stack_realign_drap)
+ {
+ /* Assign DRAP to vDRAP and returns vDRAP */
+ unsigned int regno = find_drap_reg ();
+ rtx drap_vreg;
+ rtx arg_ptr;
+ rtx seq;
+
+ if (regno != CX_REG)
+ crtl->save_param_ptr_reg = true;
+
+ arg_ptr = gen_rtx_REG (Pmode, regno);
+ crtl->drap_reg = arg_ptr;
+
+ start_sequence ();
+ drap_vreg = copy_to_reg(arg_ptr);
+ seq = get_insns ();
+ end_sequence ();
+
+ emit_insn_before (seq, NEXT_INSN (entry_of_function ()));
+ return drap_vreg;
}
Can't the DRAP load code be emitted in a more reasonable place? It seems
unintuitive that a function that is supposed to just return the reg RTX
is also generating code.
The rest of the patch seems OK. Does the patch have any performance
implications on, let's say, SPECfp? (I.e. with the default setting, the codegen
should be mostly unchanged. Can we set it to align the stack at runtime,
drop preferred-stack-boundary to 8, and see how both strategies compare?)
Honza