This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH V2 2/8] bpf: new GCC port

From: Jeff Law <law at redhat dot com>
To: "Jose E. Marchesi" <jemarch at gnu dot org>, "Jose E. Marchesi" <jose dot marchesi at oracle dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Tue, 20 Aug 2019 12:49:14 -0600
Subject: Re: [PATCH V2 2/8] bpf: new GCC port
References: <20190817005056.15749-1-jose.marchesi@oracle.com> <87d0h4y5c1.fsf@gnu.org>
On 8/16/19 6:59 PM, Jose E. Marchesi wrote:
> This patch adds a port for the Linux kernel eBPF architecture to GCC.
> 
> ChangeLog:
> 
>   * configure.ac: Support for bpf-*-* targets.
>   * configure: Regenerate.
> 
> contrib/ChangeLog:
> 
>   * config-list.mk (LIST): Disable go in bpf-*-* targets.
> 
> gcc/ChangeLog:
> 
>   * config.gcc: Support for bpf-*-* targets.
>   * common/config/bpf/bpf-common.c: New file.
>   * config/bpf/t-bpf: Likewise.
>   * config/bpf/predicates.md: Likewise.
>   * config/bpf/constraints.md: Likewise.
>   * config/bpf/bpf.opt: Likewise.
>   * config/bpf/bpf.md: Likewise.
>   * config/bpf/bpf.h: Likewise.
>   * config/bpf/bpf.c: Likewise.
>   * config/bpf/bpf-protos.h: Likewise.
>   * config/bpf/bpf-opts.h: Likewise.
>   * config/bpf/bpf-helpers.h: Likewise.
>   * config/bpf/bpf-helpers.def: Likewise.
So I think various folks have already mentioned the configure rebuild
issues, formatting and other stuff.  I'm going to try to keep them all
in mind so that I don't duplicate anything.  If I do duplicate someone's
comment, apologies in advance.

At a high level I realize there's lots of things not supported due to
the restricted environment it'll ultimately be used in.  However, you
might want to consider extensions that would allow larger portions of
the gcc testsuite to run and some kind of user mode simulator so that
you can reasonably test the target.  Not a requirement, but could be
useful (from experience :-)


> ---

> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
> index 69c826e649a..aa9fdb64eaf 100644
> --- a/contrib/config-list.mk
> +++ b/contrib/config-list.mk
> @@ -123,7 +123,7 @@ $(LIST): make-log-dir
>  		TGT=`echo $@ | awk 'BEGIN { FS = "OPT" }; { print $$1 }'` &&			\
>  		TGT=`$(GCC_SRC_DIR)/config.sub $$TGT` &&					\
>  		case $$TGT in									\
> -			*-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix*)			\
> +			*-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix* | bpf-*-* )			\
>  				ADDITIONAL_LANGUAGES="";					\
>  				;;								\
>  			*)									\
So I've got no problem disabling Go for BFD, but I don't see bpf added
to LIST, which it should be.


> diff --git a/gcc/common/config/bpf/bpf-common.c b/gcc/common/config/bpf/bpf-common.c
> new file mode 100644
> index 00000000000..a68feb62897
> --- /dev/null
> +++ b/gcc/common/config/bpf/bpf-common.c
[ snip ]
> +/* Implement TARGET_OPTION_OPTIMIZATION_TABLE.  */
> +static const struct default_options bpf_option_optimization_table[] =
> +  {
> +    /* Enable -funroll-all-loops by default.  */
> +    { OPT_LEVELS_ALL, OPT_funroll_all_loops, NULL, 1 },
> +    /* Disable -fomit-frame-pointer by default.  */
> +    { OPT_LEVELS_ALL, OPT_fomit_frame_pointer, NULL, 0 },
> +    { OPT_LEVELS_NONE, 0, NULL, 0 }
> +  };
Curious about the motivation on the loop unrolling stuff.  In general we
discourage targets from mucking around with the default
flags/optimizations, but it is sometimes the right thing to do.

Rather than -fomit-frame-pointer, I think you can use the
FRAME_POINTER_REQUIRED hook if you always want a frame pointer.


> +
> +#undef TARGET_OPTION_OPTIMIZATION_TABLE
> +#define TARGET_OPTION_OPTIMIZATION_TABLE bpf_option_optimization_table
> +
> +/* Implement TARGET_OPTION_DEFAULT_PARAMS.  */
> +
> +static void
> +bpf_option_default_params (void)
> +{
> +  /* XXX large-stack-frame = 512 bytes */
> +  /* XXX max-unrolled-insns */
> +  /* XXX max-unroll-times */
> +}
> +
> +#undef TARGET_OPTION_DEFAULT_PARAMS
> +#define TARGET_OPTION_DEFAULT_PARAMS bpf_option_default_params
I'd generally discourage twiddling the params like this, at least the
ones for the unroller.



> diff --git a/gcc/config/bpf/bpf-helpers.h b/gcc/config/bpf/bpf-helpers.h
> new file mode 100644
> index 00000000000..2fe96be7637
> --- /dev/null
> +++ b/gcc/config/bpf/bpf-helpers.h
I can't remember, is this an installed header that consumers are
expected to use?  If so you might want to be careful with polluting user
code with BPF #defines such as BPF_ANY, BPF_NOEXIST, BPF_EXIST, etc.
The #defines for mapping to the builtins are probably OK though.




> diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
> new file mode 100644
> index 00000000000..4a42259a9c3
> --- /dev/null
> +++ b/gcc/config/bpf/bpf.c
> @@ -0,0 +1,1136 @@
[ ... ]
> +
> +/* Return the builtin code corresponding to the kernel helper builtin
> +   __builtin_NAME, or 0 if the name doesn't correspond to a kernel
> +   helper builtin.  */
> +
> +static inline int
> +bpf_helper_code (const char *name)
> +{
> +  int i;
> +
> +  for (i = 1; i < BPF_BUILTIN_HELPER_MAX; ++i)
> +    {
> +      if (strcmp (name, bpf_helper_names[i]) == 0)
> +	return i;
> +    }
> +
> +  return 0;
> +}
Does this get called often?  If so the linear search could end up being
expensive from a compile-time standpoint.



> +#define KERNEL_VERSION_CODE "__BPF_KERNEL_VERSION_CODE__="    
> +    kernel_version_code
> +      = (char *) alloca (strlen (KERNEL_VERSION_CODE) + 7 + 1);
> +    strcpy (kernel_version_code, KERNEL_VERSION_CODE);
> +#undef KERNEL_VERSION_CODE
> +    strcat (kernel_version_code, version_code);
> +    builtin_define (kernel_version_code);
> +  }
> +}
Does builtin_define copy its argument?  If not, then I'd expect this to
be problematical as the alloca'd space will be reclaimed.


> +static rtx
> +bpf_function_value (const_tree ret_type,
> +		    const_tree fntype_or_decl ATTRIBUTE_UNUSED,
> +		    bool outgoing ATTRIBUTE_UNUSED)
> +{
> +  enum machine_mode mode;
> +  int unsignedp;
> +
> +  mode = TYPE_MODE (ret_type);
> +  if (INTEGRAL_TYPE_P (ret_type))
> +    mode = promote_function_mode (ret_type, mode, &unsignedp, fntype_or_decl, 1);
> +
> +  return gen_rtx_REG (mode, 0);
Rather than using "0" for the register number, consider using its name
from bpf.h.

> +}
> +
> +#undef TARGET_FUNCTION_VALUE
> +#define TARGET_FUNCTION_VALUE bpf_function_value
> +
> +/* Return true if REGNO is th enumber of a hard register in which the
> +   values of called function may come back.  */
> +
> +static bool
> +bpf_function_value_regno_p (const unsigned int regno)
> +{
> +  return (regno == 0);
> +}
Similarly.

> +
> +static void
> +bpf_compute_frame (void)
> +{
> +  int stack_alignment = STACK_BOUNDARY / BITS_PER_UNIT;
> +  int padding_locals, regno;
> +
> +  /* Set the space used in the stack by local variables.  This is
> +     rounded up to respect the minimum stack alignment.  */
> +  cfun->machine->local_vars_size = get_frame_size ();
> +
> +  padding_locals = cfun->machine->local_vars_size % stack_alignment;
> +  if (padding_locals)
> +    padding_locals = stack_alignment - padding_locals;
> +
> +  cfun->machine->local_vars_size += padding_locals;
> +
> +  /* Set the space used in the stack by callee-saved used registers in
> +     the current function.  There is no need to round up, since the
> +     registers are all 8 bytes wide.  */
> +  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    if ((!fixed_regs[regno]
> +	 && df_regs_ever_live_p (regno)
> +	 && !call_used_regs[regno])
> +	|| (cfun->calls_alloca
> +	    && regno == STACK_POINTER_REGNUM))
> +      cfun->machine->callee_saved_reg_size += 8;
> +
> +  /* Check that the total size of the frame doesn't exceed the limit
> +     imposed by eBPF: currently 512 bytes.  */
> +  if ((cfun->machine->local_vars_size
> +       + cfun->machine->callee_saved_reg_size) > 512)
> +    {
> +      static int stack_limit_exceeded = 0;
> +
> +      if (!stack_limit_exceeded)
> +	error ("eBPF stack limit of 512 bytes exceeded");
> +      stack_limit_exceeded = 1;
> +    }
Is the stack limit likely to change?  Would a param work better here
which would allow us to accommodate such a change without having to
re-release GCC?



> +
> +/* Expand to the instructions in a function epilogue.  This function
> +   is called when expanding the 'prologue' pattern in bpf.md.  */
> +
> +void
> +bpf_expand_epilogue (void)
> +{
> +  int regno, fp_offset;
> +  rtx insn;
> +
> +  bpf_compute_frame ();
> +  fp_offset = -cfun->machine->local_vars_size;
> +
> +  /* Restore callee-saved hard registes from the stack.  */
> +  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    {
> +      if ((!fixed_regs[regno]
> +	   && df_regs_ever_live_p (regno)
> +	   && !call_used_regs[regno])
> +	  || (cfun->calls_alloca
> +	      && regno == STACK_POINTER_REGNUM))
> +	{
> +	  rtx mem;
> +
> +	  if (!IN_RANGE (fp_offset, -1 - 0x7fff, 0x7fff))
> +	    /* This has been already reported as an error in
> +	       bpf_compute_frame. */
> +	    break;
> +	  else
> +	    {
> +	      mem = gen_frame_mem (DImode,
> +				   plus_constant (DImode,
> +						  gen_rtx_REG (DImode, FRAME_POINTER_REGNUM),
> +						  fp_offset - 8));
> +	      insn = emit_move_insn (gen_rtx_REG (DImode, regno), mem);
> +	      RTX_FRAME_RELATED_P (insn) = 1;
> +	      fp_offset -= 8;
> +	    }
> +	}
> +    }
> +
> +  emit_jump_insn (gen_exit ());
So ebpf doesn't need to do instruction scheduling, but even so it's
probably safest to emit a scheduling barrier before cutting back the stack.



> +HOST_WIDE_INT
> +bpf_initial_elimination_offset (int from,
> +				int to)
> +{
> +  HOST_WIDE_INT ret;
> +
> +  if (from == ARG_POINTER_REGNUM && to == STACK_POINTER_REGNUM)
> +    {
> +      bpf_compute_frame ();
> +      ret = (cfun->machine->local_vars_size
> +	     + cfun->machine->callee_saved_reg_size);
> +    }
> +  else if (from == ARG_POINTER_REGNUM && to == FRAME_POINTER_REGNUM)
> +    ret = 0;
> +  else
> +    abort ();
Rather than abort() these days gcc_unreachable () is preferred.

I'm going to skip over all the function argument passing stuff as that
all needs updating after Richard S's changes.

> diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
> +
> +/**** Debugging Info ****/
> +
> +/* We cannot support DWARF2 because of the limitations of eBPF.  */
> +#define DBX_DEBUGGING_INFO
Umm, we're trying to get rid of DBX_DEBUGGING_INFO.  I'd rather not add
another user at this point.  How tough would it be to support dwarf?

> +
> +/* Define how to find the value returned by a library function
> +   assuming the value has mode MODE.  This is always %r0 for eBPF.  */
> +#define LIBCALL_VALUE(MODE)  \
> +  gen_rtx_REG ((MODE), 0)
Consider using BPF_R0

> +
> +/* The maximum number of bytes that a signle instruction can move
s/signle/single/



> +
> +(define_insn "*mulsi3_extended"
> +  [(set (match_operand:DI	   0 "register_operand" "=r,r")
> +        (sign_extend:DI
> +         (mult:SI (match_operand:SI 1 "register_operand" "0,0")
> +                  (match_operand:SI 2 "reg_or_imm_operand" "r,I"))))]
> +  ""
> +  "mul32\t%0,%2"
> +  [(set_attr "type" "alu32")])
mulsidi3?  or umulsidi3 for the name?

I believe you already commented on the need to address the div vs udiv
problem.  Similarly for mod vs umod.

> +;;;; Data movement
> +
> +(define_mode_iterator AMM [QI HI SI DI SF DF])
> +
> +(define_expand "mov<AMM:mode>"
> +  [(set (match_operand:AMM 0 "general_operand" "")
> +        (match_operand:AMM 1 "general_operand" ""))]
> +        ""
> +        "
> +{
> +    if (!register_operand(operands[0], <AMM:MODE>mode)
> +        && !register_operand(operands[1], <AMM:MODE>mode))
> +         operands[1] = force_reg (<AMM:MODE>mode, operands[1]); 
> +
> +    /* In cases where the moved entity is a constant address, we
> +       need to emit an extra mov and modify the second operand to
> +       obtain something like:
> +
> +         lddw %T, %1
> +         ldxw %0, [%T+0]
> +
> +       Ditto for stores.  */
> +
> +    if (MEM_P (operands[1])
> +        && CONSTANT_ADDRESS_P (XEXP (operands[1], 0)))
> +      {
> +         rtx tmp = gen_reg_rtx (DImode);
> +
> +         emit_move_insn (tmp, XEXP (operands[1], 0));
> +         operands[1] = gen_rtx_MEM (<AMM:MODE>mode, tmp);
> +      }
> +
> +    if (MEM_P (operands[0])
> +        && CONSTANT_ADDRESS_P (XEXP (operands[0], 0)))
> +      {
> +         rtx tmp = gen_reg_rtx (DImode);
> +  
> +         emit_move_insn (tmp, XEXP (operands[0], 0));
> +         operands[0] = gen_rtx_MEM (<AMM:MODE>mode, tmp);
> +      }
> +
> +}")
Hmm, what happens if you need to reload something from a constant
address?  You can't call gen_reg_rtx once register allocation has
started.  THe case where you need a scratch register really feels like
you need to be defining secondary reloads.


Generally it looks pretty good.  I'd like to take more more looksie at
patch #2 of the series after you've addressed the comments you've
received so far.

jeff
Follow-Ups:
- Re: [PATCH V2 2/8] bpf: new GCC port
  - From: Jose E. Marchesi
References:
- [PATCH V2 0/8] eBPF support for GCC
  - From: Jose E. Marchesi
- [PATCH V2 2/8] bpf: new GCC port
  - From: Jose E. Marchesi
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]