This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/4] x86: Add -mindirect-branch=


> gcc/
> 
> 	* config/i386/i386-opts.h (indirect_branch): New.
> 	* config/i386/i386-protos.h (ix86_output_indirect_jmp): Likewise.
> 	* config/i386/i386.c (ix86_using_red_zone): Disallow red-zone
> 	with local indirect jump when converting indirect call and jump.
> 	(ix86_set_indirect_branch_type): New.
> 	(ix86_set_current_function): Call ix86_set_indirect_branch_type.
> 	(indirectlabelno): New.
> 	(indirect_thunk_needed): Likewise.
> 	(indirect_thunk_bnd_needed): Likewise.
> 	(indirect_thunks_used): Likewise.
> 	(indirect_thunks_bnd_used): Likewise.
> 	(INDIRECT_LABEL): Likewise.
> 	(indirect_thunk_name): Likewise.
> 	(output_indirect_thunk): Likewise.
> 	(output_indirect_thunk_function): Likewise.
> 	(ix86_output_indirect_branch): Likewise.
> 	(ix86_output_indirect_jmp): Likewise.
> 	(ix86_code_end): Call output_indirect_thunk_function if needed.
> 	(ix86_output_call_insn): Call ix86_output_indirect_branch if
> 	needed.
> 	(ix86_handle_fndecl_attribute): Handle indirect_branch.
> 	(ix86_attribute_table): Add indirect_branch.
> 	* config/i386/i386.h (machine_function): Add indirect_branch_type
> 	and has_local_indirect_jump.
> 	* config/i386/i386.md (indirect_jump): Set has_local_indirect_jump
> 	to true.
> 	(tablejump): Likewise.
> 	(*indirect_jump): Use ix86_output_indirect_jmp.
> 	(*tablejump_1): Likewise.
> 	(simple_return_indirect_internal): Likewise.
> 	* config/i386/i386.opt (mindirect-branch=): New option.
> 	(indirect_branch): New.
> 	(keep): Likewise.
> 	(thunk): Likewise.
> 	(thunk-inline): Likewise.
> 	(thunk-extern): Likewise.
> 	* doc/extend.texi: Document indirect_branch function attribute.
> 	* doc/invoke.texi: Document -mindirect-branch= option.
> 
> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
> index f245c1573cf..f14cbeee7a1 100644
> --- a/gcc/config/i386/i386-opts.h
> +++ b/gcc/config/i386/i386-opts.h
> @@ -106,4 +106,12 @@ enum prefer_vector_width {
>      PVW_AVX512
>  };
>  
> +enum indirect_branch {
> +  indirect_branch_unset = 0,
> +  indirect_branch_keep,
> +  indirect_branch_thunk,
> +  indirect_branch_thunk_inline,
> +  indirect_branch_thunk_extern
> +};
I think it would make sense to simply place the body of your introduction email
here as a comment explaining what enum indirect_branhc does.
>  
> -/* Return true if a red-zone is in use.  */
> +/* Return true if a red-zone is in use.  We can't use red-zone when
> +   there are local indirect jumps, like "indirect_jump" or "tablejump",
> +   which jumps to another place in the function, since "call" in the
> +   indirect thunk pushes the return address onto stack, destroying
> +   red-zone.  */

Technically we can use red-zone if we reserve space for the call address, right?
Perhaps mention it as TODO, that should even not be too hard to implement
but is probably not being that important code quality wise in practice.
>  
>  bool
>  ix86_using_red_zone (void)
>  {
> -  return TARGET_RED_ZONE && !TARGET_64BIT_MS_ABI;
> +  return (TARGET_RED_ZONE
> +	  && !TARGET_64BIT_MS_ABI
> +	  && (!cfun->machine->has_local_indirect_jump
> +	      || cfun->machine->indirect_branch_type == indirect_branch_keep));
>  }
>  
>  /* Return a string that documents the current -m options.  The caller is
> @@ -5797,6 +5804,37 @@ ix86_set_func_type (tree fndecl)
>      }
>  }
>  
> @@ -10639,6 +10681,191 @@ ix86_setup_frame_addresses (void)
>  # endif
>  #endif
>  
> +static int indirectlabelno;
> +static bool indirect_thunk_needed = false;
> +static bool indirect_thunk_bnd_needed = false;
> +
> +static int indirect_thunks_used;
> +static int indirect_thunks_bnd_used;

Add comments for variables.
> +
> +#ifndef INDIRECT_LABEL
> +# define INDIRECT_LABEL "LIND"
> +#endif
> +
> +/* Fills in the label name that should be used for the indirect thunk.  */
> +
> +static void
> +indirect_thunk_name (char name[32], int regno, bool need_bnd_p)
> +{
> +  if (USE_HIDDEN_LINKONCE)
> +    {
> +      const char *bnd = need_bnd_p ? "_bnd" : "";
> +      if (regno >= 0)
> +	{
> +	  const char *reg_prefix;
> +	  if (LEGACY_INT_REGNO_P (regno))
> +	    reg_prefix = TARGET_64BIT ? "r" : "e";
> +	  else
> +	    reg_prefix = "";
> +	  sprintf (name, "__x86_indirect_thunk%s_%s%s",
> +		   bnd, reg_prefix, reg_names[regno]);
> +	}
> +      else
> +	sprintf (name, "__x86_indirect_thunk%s", bnd);
> +    }
> +  else
> +    {
> +      if (regno >= 0)
> +	{
> +	  if (need_bnd_p)
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
> +	  else
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITR", regno);
> +	}
> +      else
> +	{
> +	  if (need_bnd_p)
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LITB", 0);
> +	  else
> +	    ASM_GENERATE_INTERNAL_LABEL (name, "LIT", 0);
> +	}
> +    }
> +}
> +
> +/* Output a call and return thunk for indirect branch.  If BND_P is
> +   true, the BND prefix is needed.   If REGNO != -1,  the function
> +   address is in REGNO.  Otherwise, the function address is on the
> +   top of stack.  */

Add here comment which gives example of full thunk body so it is easier
to see what is going on here.
> +
> +static void
> +output_indirect_thunk (bool need_bnd_p, int regno)
> +{
> +  char indirectlabel1[32];
> +  char indirectlabel2[32];
> +
> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel1, INDIRECT_LABEL,
> +			       indirectlabelno++);
> +  ASM_GENERATE_INTERNAL_LABEL (indirectlabel2, INDIRECT_LABEL,
> +			       indirectlabelno++);
> +
> +  /* Call */
> +  if (need_bnd_p)
> +    fputs ("\tbnd call\t", asm_out_file);
> +  else
> +    fputs ("\tcall\t", asm_out_file);
> +  assemble_name_raw (asm_out_file, indirectlabel2);
> +  fputc ('\n', asm_out_file);
> +
> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
> +
> +  /* Pause .  */
> +  fprintf (asm_out_file, "\tpause\n");
> +
> +  /* Jump.  */
> +  fputs ("\tjmp\t", asm_out_file);
> +  assemble_name_raw (asm_out_file, indirectlabel1);
> +  fputc ('\n', asm_out_file);
> +
> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
> +
> +  if (regno >= 0)
> +    {
> +      /* MOV.  */
> +      rtx xops[2];
> +      xops[0] = gen_rtx_MEM (word_mode, stack_pointer_rtx);
> +      xops[1] = gen_rtx_REG (word_mode, regno);
> +      output_asm_insn ("mov\t{%1, %0|%0, %1}", xops);
> +    }
> +  else
> +    {
> +      /* LEA.  */
> +      rtx xops[2];
> +      xops[0] = stack_pointer_rtx;
> +      xops[1] = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
> +      output_asm_insn ("lea\t{%E1, %0|%0, %E1}", xops);
> +    }
> +
> +  if (need_bnd_p)
> +    fputs ("\tbnd ret\n", asm_out_file);
> +  else
> +    fputs ("\tret\n", asm_out_file);
> +}
> +
> +/* Output a funtion with a call and return thunk for indirect branch.
> +   If BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
> +   function address is in REGNO.  Otherwise, the function address is
> +   on the top of stack.  */
> +
> +static void
> +output_indirect_thunk_function (bool need_bnd_p, int regno)
> +{
> +  char name[32];
> +  tree decl;
> +
> +  /* Create __x86_indirect_thunk/__x86_indirect_thunk_bnd.  */
> +  indirect_thunk_name (name, regno, need_bnd_p);
> +  decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
> +		     get_identifier (name),
> +		     build_function_type_list (void_type_node, NULL_TREE));
> +  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
> +				   NULL_TREE, void_type_node);
> +  TREE_PUBLIC (decl) = 1;
> +  TREE_STATIC (decl) = 1;
> +  DECL_IGNORED_P (decl) = 1;

DECL_ARTIFICIAL as well?

> +
> +#if TARGET_MACHO
> +  if (TARGET_MACHO)
> +    {
> +      switch_to_section (darwin_sections[picbase_thunk_section]);
> +      fputs ("\t.weak_definition\t", asm_out_file);
> +      assemble_name (asm_out_file, name);
> +      fputs ("\n\t.private_extern\t", asm_out_file);
> +      assemble_name (asm_out_file, name);
> +      putc ('\n', asm_out_file);
> +      ASM_OUTPUT_LABEL (asm_out_file, name);
> +      DECL_WEAK (decl) = 1;
> +    }
> +  else
> +#endif
> +    if (USE_HIDDEN_LINKONCE)
> +      {
> +	cgraph_node::create (decl)->set_comdat_group (DECL_ASSEMBLER_NAME (decl));
> +
> +	targetm.asm_out.unique_section (decl, 0);
> +	switch_to_section (get_named_section (decl, NULL, 0));
> +
> +	targetm.asm_out.globalize_label (asm_out_file, name);
> +	fputs ("\t.hidden\t", asm_out_file);
> +	assemble_name (asm_out_file, name);
> +	putc ('\n', asm_out_file);
> +	ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
> +      }
> +    else
> +      {
> +	switch_to_section (text_section);
> +	ASM_OUTPUT_LABEL (asm_out_file, name);
> +      }
> +

Why do you need to output asm visibility directives by hand when you create
symbol for the function anyway?
I would expect that you can just use similar code as cgraph_node::expand_thunk
when calls output_mi_thunk and get this done in a way that is independent of
target assembler.
>  
> +/* Output indirect branch via a call and return thunk.  CALL_OP is
> +   the branch target.  XASM is the assembly template for CALL_OP.
> +   Branch is a tail call if SIBCALL_P is true.  */

Again please add example of code sequences output here to make code easier to follow.
> +
> +static void
> +ix86_output_indirect_branch (rtx call_op, const char *xasm,
> +			     bool sibcall_p)
> +{
> +  char thunk_name_buf[32];
> +  char *thunk_name;
> +  char push_buf[64];
> +  bool need_bnd_p = ix86_bnd_prefixed_insn_p (current_output_insn);
> +  int regno;
> +
> +  if (REG_P (call_op))
> +    regno = REGNO (call_op);
> +  else
> +    regno = -1;
> +
> +  if (cfun->machine->indirect_branch_type
> +      != indirect_branch_thunk_inline)
> +    {
> +      if (cfun->machine->indirect_branch_type == indirect_branch_thunk)
> +	{
> +	  if (regno >= 0)
> +	    {
> +	      int i = regno;
> +	      if (i >= FIRST_REX_INT_REG)
> +		i -= (FIRST_REX_INT_REG - LAST_INT_REG - 1);
> +	      if (need_bnd_p)
> +		indirect_thunks_bnd_used |= 1 << i;
> +	      else
> +		indirect_thunks_used |= 1 << i;
> +	    }
> +	  else
> +	    {
> +	      if (need_bnd_p)
> +		indirect_thunk_bnd_needed = true;
> +	      else
> +		indirect_thunk_needed = true;
> +	    }
> +	}
> +      indirect_thunk_name (thunk_name_buf, regno, need_bnd_p);
> +      thunk_name = thunk_name_buf;
> +    }
> +  else
> +    thunk_name = NULL;
> +
> +  snprintf (push_buf, sizeof (push_buf), "push{%c}\t%s",
> +	    TARGET_64BIT ? 'q' : 'l', xasm);
> +
> +  if (sibcall_p)
> +    {
> +      if (regno < 0)
> +	output_asm_insn (push_buf, &call_op);
> +      if (thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> +	}
> +      else
> +	output_indirect_thunk (need_bnd_p, regno);
> +    }
> +  else
> +    {
> +      if (regno >= 0 && thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd call\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
> +	  return;
> +	}
> +
> +      char indirectlabel1[32];
> +      char indirectlabel2[32];
> +
> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel1,
> +				   INDIRECT_LABEL,
> +				   indirectlabelno++);
> +      ASM_GENERATE_INTERNAL_LABEL (indirectlabel2,
> +				   INDIRECT_LABEL,
> +				   indirectlabelno++);
> +
> +      /* Jump.  */
> +      if (need_bnd_p)
> +	fputs ("\tbnd jmp\t", asm_out_file);
> +      else
> +	fputs ("\tjmp\t", asm_out_file);
> +      assemble_name_raw (asm_out_file, indirectlabel2);
> +      fputc ('\n', asm_out_file);
> +
> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
> +
> +      if (MEM_P (call_op))
> +	{
> +	  struct ix86_address parts;
> +	  rtx addr = XEXP (call_op, 0);
> +	  if (ix86_decompose_address (addr, &parts)
> +	      && parts.base == stack_pointer_rtx)
> +	    {
> +	      /* Since call will adjust stack by -UNITS_PER_WORD,
> +		 we must convert "disp(stack, index, scale)" to
> +		 "disp+UNITS_PER_WORD(stack, index, scale)".  */
> +	      if (parts.index)
> +		{
> +		  addr = gen_rtx_MULT (Pmode, parts.index,
> +				       GEN_INT (parts.scale));
> +		  addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
> +				       addr);
> +		}
> +	      else
> +		addr = stack_pointer_rtx;
> +
> +	      rtx disp;
> +	      if (parts.disp != NULL_RTX)
> +		disp = plus_constant (Pmode, parts.disp,
> +				      UNITS_PER_WORD);
> +	      else
> +		disp = GEN_INT (UNITS_PER_WORD);
> +
> +	      addr = gen_rtx_PLUS (Pmode, addr, disp);
> +	      call_op = gen_rtx_MEM (GET_MODE (call_op), addr);
> +	    }
> +	}
> +
> +      if (regno < 0)
> +	output_asm_insn (push_buf, &call_op);
> +
> +      if (thunk_name != NULL)
> +	{
> +	  if (need_bnd_p)
> +	    fprintf (asm_out_file, "\tbnd jmp\t%s\n", thunk_name);
> +	  else
> +	    fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> +	}
> +      else
> +	output_indirect_thunk (need_bnd_p, regno);
> +
> +      ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
> +
> +      /* Call.  */
> +      if (need_bnd_p)
> +	fputs ("\tbnd call\t", asm_out_file);
> +      else
> +	fputs ("\tcall\t", asm_out_file);
> +      assemble_name_raw (asm_out_file, indirectlabel1);
> +      fputc ('\n', asm_out_file);
> +    }
> +}

It is really not very pretty that the whole sequence is injected into insn stream
as a single blob.  How opaque it is? Does it need to be patched at dynamic link time?

I suppose it may make sense to split the insn and at least explicitly represent
the fact that we (sometimes) push the target to stack.  
Why the memory variant exists at first place? 
>  
>  (define_insn "*indirect_jump"
>    [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rBw"))]
>    ""
> -  "%!jmp\t%A0"
> +  "* return ix86_output_indirect_jmp (operands[0], false);"
>    [(set_attr "type" "ibr")
>     (set_attr "length_immediate" "0")
>     (set_attr "maybe_prefix_bnd" "1")])

I think you also want to update type to "many" when we do more than just indirect branch.
We do not care much about this, but it feels wrong to have attributes off reality.
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index f3e4a63ab46..ddb6035be96 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -5754,6 +5754,16 @@ Specify which floating-point unit to use.  You must specify the
>  @code{target("fpmath=sse+387")} because the comma would separate
>  different options.
>  
> +@item indirect_branch("@var{choice}")
> +@cindex @code{indirect_branch} function attribute, x86
> +On x86 targets, the @code{indirect_branch} attribute causes the compiler
> +to convert indirect call and jump with @var{choice}.  @samp{keep}
> +keeps indirect call and jump unmodified.  @samp{thunk} converts indirect
> +call and jump to call and return thunk.  @samp{thunk-inline} converts
> +indirect call and jump to inlined call and return thunk.
> +@samp{thunk-extern} converts indirect call and jump to external call
> +and return thunk provided in a separate object file.

Please expand the documentation in a way that random user who is not aware of the
issue will understand that those are security features that come at a cost.

> +@opindex -mindirect-branch
> +Convert indirect call and jump with @var{choice}.  The default is
> +@samp{keep}, which keeps indirect call and jump unmodified.
> +@samp{thunk} converts indirect call and jump to call and return thunk.
> +@samp{thunk-inline} converts indirect call and jump to inlined call
> +and return thunk.  @samp{thunk-extern} converts indirect call and jump
> +to external call and return thunk provided in a separate object file.
> +You can control this behavior for a specific function by using the
> +function attribute @code{indirect_branch}.  @xref{Function Attributes}.

Similarly here.

Rest of the patch seems OK. We may want incrementally represent more of the
indirect jump/call seqeuence in RTL, but at this point probably keeping things
simple and localized is not a bad idea. This can be done incrementally.

Please make updated patch and I would like to give others chance to comment today.

Honza


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]