[PATCH, i386, Pointer Bounds Checker 31/x] Pointer Bounds Checker builtins for i386 target
Uros Bizjak
ubizjak@gmail.com
Wed Sep 17 18:07:00 GMT 2014
On Wed, Sep 17, 2014 at 6:31 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>> >> I don't like the way arguments are prepared. For the case above,
>> >> bnd_ldx should have index_register_operand predicate in its pattern,
>> >> and this predicate (and its mode) should be checked in the expander
>> >> code. There are many examples of argument expansion in
>> >> ix86_expand_builtin function, including how Pmode is handled.
>> >>
>> >> Also, please see how target is handled there. Target can be null, so
>> >> REG_P predicate will crash.
>> >>
>> >> You should also select insn patterns depending on BNDmode, not TARGET_64BIT.
>> >>
>> >> Please use assign_386_stack_local so stack slots can be shared.
>> >> SLOT_TEMP is intended for short-lived temporaries, you can introduce
>> >> new slots if you need more live values at once.
>> >>
>> >> Uros.
>> >
>> > Thanks for comments! Here is a new version in which I addressed all your concerns.
>>
>> Unfortunately, it doesn't. The patch only fixed one instance w.r.t to
>> target handling, the one I referred as an example. You still have
>> unchecked target, at least in IX86_BUILTIN_BNDMK.
>>
>> However, you have a general problems in your builtin expansion code,
>> so please look at how other builtins are handled. E.g.:
>>
>> if (optimize || !target
>> || GET_MODE (target) != tmode
>> || !register_operand(target, tmode))
>> target = gen_reg_rtx (tmode);
>>
>> also, here is an example how input operands are prepared:
>>
>> op0 = expand_normal (arg0);
>> op1 = expand_normal (arg1);
>> op2 = expand_normal (arg2);
>> if (!register_operand (op0, Pmode))
>> op0 = ix86_zero_extend_to_Pmode (op0);
>> if (!register_operand (op1, SImode))
>> op1 = copy_to_mode_reg (SImode, op1);
>> if (!register_operand (op2, SImode))
>> op2 = copy_to_mode_reg (SImode, op2);
>>
>> So, Pmode is handled in a special way, even when x32 is not considered.
>>
>> BTW: I wonder if word_mode is needed here, Pmode can be SImode with
>> address prefix (x32).
>>
>> Inside the expanders, please use expand_simple_binop and expand_unop
>> on RTX, not tree expressions. Again, please see many examples.
>
> Thank you for additional explanations. Hope this time I answer your concerns correctly :)
Yes, this version is MUCH better. There are further comments down the code.
> 2014-09-17 Ilya Enkovich <ilya.enkovich@intel.com>
>
> * config/i386/i386-builtin-types.def (BND): New.
> (ULONG): New.
> (BND_FTYPE_PCVOID_ULONG): New.
> (VOID_FTYPE_BND_PCVOID): New.
> (VOID_FTYPE_PCVOID_PCVOID_BND): New.
> (BND_FTYPE_PCVOID_PCVOID): New.
> (BND_FTYPE_PCVOID): New.
> (BND_FTYPE_BND_BND): New.
> (PVOID_FTYPE_PVOID_PVOID_ULONG): New.
> (PVOID_FTYPE_PCVOID_BND_ULONG): New.
> (ULONG_FTYPE_VOID): New.
> (PVOID_FTYPE_BND): New.
> * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h.
> (ix86_builtins): Add
> IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
> IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
> IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
> IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
> IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
> IX86_BUILTIN_BNDUPPER.
> (builtin_isa): Add leaf_p and nothrow_p fields.
> (def_builtin): Initialize leaf_p and nothrow_p.
> (ix86_add_new_builtins): Handle leaf_p and nothrow_p
> flags.
> (bdesc_mpx): New.
> (bdesc_mpx_const): New.
> (ix86_init_mpx_builtins): New.
> (ix86_init_builtins): Call ix86_init_mpx_builtins.
> (ix86_emit_cmove): New.
> (ix86_emit_move_max): New.
> (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK,
> IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
> IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
> IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW,
> IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF,
> IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.
> * config/i386/i386.h (ix86_stack_slot): Added SLOT_BND_STORED.
..
> + /* We need to move bounds to memory before any computations. */
> + if (!MEM_P (op1))
> + {
> + m1 = assign_386_stack_local (BNDmode, SLOT_TEMP);
> + emit_move_insn (m1, op1);
> + }
> + else
> + m1 = op1;
No negative conditions, please. Just swap the arms of if sentence. It
is much more readable.
> +
> + /* Generate mem expression to be used for access to LB and UB. */
> + m1h1 = gen_rtx_MEM (Pmode, XEXP (m1, 0));
> + m1h2 = gen_rtx_MEM (Pmode, plus_constant (Pmode, XEXP (m1, 0),
> + GET_MODE_SIZE (Pmode)));
Please use adjust_address instead of manually producing MEMs.
> +
> + t1 = gen_reg_rtx (Pmode);
> +
> + /* Compute LB. */
> + emit_move_insn (t1, m1h1);
> + ix86_emit_move_max (t1, lb);
> + emit_move_insn (m1h1, t1);
> +
> + /* Compute UB. UB is stored in 1's complement form. Therefore
> + we also use max here. */
> + emit_move_insn (t1, m1h2);
> + ix86_emit_move_max (t1, ub);
> + emit_move_insn (m1h2, t1);
> +
> + op2 = gen_reg_rtx (BNDmode);
> + emit_move_insn (op2, m1);
> +
> + return chkp_join_splitted_slot (lb, op2);
> + }
> +
> + case IX86_BUILTIN_BNDINT:
The handling in this builtin looks strange.
I suggest to do it in a serial way, like this:
if (MEM_P (op0)
m0 = op0;
else
{
m0 = stack (SLOT_TEMP)
move (op0, m0)
}
move parts of m0 to temporaries.
if (MEM_P (op1)
m1 = op1;
else
{
m1 = stack (SLOT_TEMP)
move (op1, m1)
}
move parts of m1 to another temporaries.
Process temporaries.
res = stack (SLOT_BND_STORED)
move calculated stuff to res.
This will ensure that nobody will clobber your stack slot with the
result, assuming consumer will soon use it. SLOT_TEMP is a short-lived
slot, and can be reused.
> + {
> + unsigned bndsize = GET_MODE_SIZE (BNDmode);
> + unsigned psize = GET_MODE_SIZE (Pmode);
> + rtx res, m1, m2, m1h1, m1h2, m2h1, m2h2, t1, t2, rh1, rh2;
> +
> + arg0 = CALL_EXPR_ARG (exp, 0);
> + arg1 = CALL_EXPR_ARG (exp, 1);
> +
> + op0 = expand_normal (arg0);
> + op1 = expand_normal (arg1);
> +
> + /* We need to move bounds to memory before any computations. */
> + if (!MEM_P (op0))
> + {
> + m1 = assign_386_stack_local (BNDmode, SLOT_TEMP);
> + emit_move_insn (m1, op0);
> + }
> + else
> + m1 = op0;
> +
> + if (!MEM_P (op1))
> + {
> + m2 = assign_386_stack_local (BNDmode,
> + MEM_P (op0)
> + ? SLOT_TEMP
> + : SLOT_BND_STORED);
> + emit_move_insn (m2, op1);
> + }
> + else
> + m2 = op1;
> +
> + if (!MEM_P (op0))
> + res = m1;
> + else if (!MEM_P (op1))
> + res = m2;
> + else
> + res = assign_386_stack_local (BNDmode, SLOT_TEMP);
> + /* Generate mem expression to be used for access to LB and UB. */
> + m1h1 = gen_rtx_MEM (Pmode, XEXP (m1, 0));
> + m1h2 = gen_rtx_MEM (Pmode, plus_constant (Pmode, XEXP (m1, 0), psize));
> + m2h1 = gen_rtx_MEM (Pmode, XEXP (m2, 0));
> + m2h2 = gen_rtx_MEM (Pmode, plus_constant (Pmode, XEXP (m2, 0), psize));
> + rh1 = gen_rtx_MEM (Pmode, XEXP (res, 0));
> + rh2 = gen_rtx_MEM (Pmode, plus_constant (Pmode, XEXP (res, 0), psize));
Please use adjust_address here.
> +
> + /* Allocate temporaries. */
> + t1 = gen_reg_rtx (Pmode);
> + t2 = gen_reg_rtx (Pmode);
> +
> + /* Compute LB. */
> + emit_move_insn (t1, m1h1);
> + emit_move_insn (t2, m2h1);
> + ix86_emit_move_max (t1, t2);
> + emit_move_insn (rh1, t1);
> +
> + /* Compute UB. UB is stored in 1's complement form. Therefore
> + we also use max here. */
> + emit_move_insn (t1, m1h2);
> + emit_move_insn (t2, m2h2);
> + ix86_emit_move_max (t1, t2);
> + emit_move_insn (rh2, t1);
> +
> + return res;
> + }
> +
> + case IX86_BUILTIN_SIZEOF:
> + {
> + enum machine_mode mode = Pmode;
No need for the above temporary...
> + tree name;
> + rtx temp;
> +
> + if (!target
> + || GET_MODE (target) != Pmode
> + || !register_operand (target, Pmode))
> + target = gen_reg_rtx (Pmode);
> +
> + arg0 = CALL_EXPR_ARG (exp, 0);
> + gcc_assert (TREE_CODE (arg0) == VAR_DECL);
> +
> + name = DECL_ASSEMBLER_NAME (arg0);
> + temp = gen_rtx_SYMBOL_REF (Pmode, IDENTIFIER_POINTER (name));
> + temp = gen_rtx_UNSPEC (mode, gen_rtvec (1, temp), UNSPEC_SIZEOF);
Call expander directly, use target as an operand 0. You won't need move below.
> + emit_move_insn (target, temp);
> +
> + return target;
> + }
> +
> + case IX86_BUILTIN_BNDLOWER:
> + {
> + rtx mem, hmem;
> +
> + if (!target
> + || GET_MODE (target) != Pmode
> + || !register_operand (target, Pmode))
> + target = gen_reg_rtx (Pmode);
> +
> + arg0 = CALL_EXPR_ARG (exp, 0);
> + op0 = expand_normal (arg0);
> +
> + /* We need to move bounds to memory first. */
> + if (!MEM_P (op0))
> + {
> + mem = assign_386_stack_local (BNDmode, SLOT_TEMP);
> + emit_move_insn (mem, op0);
> + }
> + else
> + mem = op0;
No negative conditions.
> + /* Generate mem expression to access LB and load it. */
> + hmem = gen_rtx_MEM (Pmode, XEXP (mem, 0));
adjust_address again.
> + emit_move_insn (target, hmem);
> +
> + return target;
> + }
> +
> + case IX86_BUILTIN_BNDUPPER:
> + {
> + rtx mem, hmem;
> +
> + if (!target
> + || GET_MODE (target) != Pmode
> + || !register_operand (target, Pmode))
> + target = gen_reg_rtx (Pmode);
> +
> + arg0 = CALL_EXPR_ARG (exp, 0);
> + op0 = expand_normal (arg0);
> +
> + /* We need to move bounds to memory first. */
> + if (!MEM_P (op0))
> + {
> + mem = assign_386_stack_local (BNDmode, SLOT_TEMP);
> + emit_move_insn (mem, op0);
> + }
> + else
> + mem = op0;
No negative conditions.
> + /* Generate mem expression to access UB and load it. */
> + hmem = gen_rtx_MEM (Pmode,
> + gen_rtx_PLUS (Pmode, XEXP (mem, 0),
> + GEN_INT (GET_MODE_SIZE (Pmode))));
adjust_address again.
> + emit_move_insn (target, hmem);
> +
> + /* We need to inverse all bits of UB. */
> + emit_move_insn (target, gen_rtx_NOT (Pmode, target));
Use emit_simple_unop here.
> +
> + return target;
> + }
> +
> case IX86_BUILTIN_MASKMOVQ:
> case IX86_BUILTIN_MASKMOVDQU:
> icode = (fcode == IX86_BUILTIN_MASKMOVQ
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index a38c5d1..ededa67 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2317,6 +2317,7 @@ enum ix86_stack_slot
> SLOT_CW_FLOOR,
> SLOT_CW_CEIL,
> SLOT_CW_MASK_PM,
> + SLOT_BND_STORED,
> MAX_386_STACK_LOCALS
> };
>
Uros.
More information about the Gcc-patches
mailing list