This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/6] Andes nds32: machine description of nds32 porting (1).


Chung-Ju Wu <jasonwucj@gmail.com> writes:
> It has been a while since last v2 patch.
> I create a new v3 patch to fix some typo and indentation.

I had a read through out of curiosity, and FWIW, it looks very clean and
well-commented to me.  See below for a few questions and comments.
This isn't an official review though.

> +  /* Calculate the bytes of saving callee-saved registers on stack.  */
> +  cfun->machine->callee_saved_regs_size = 0;
> +  cfun->machine->callee_saved_regs_first_regno = SP_REGNUM;
> +  cfun->machine->callee_saved_regs_last_regno  = SP_REGNUM;
> +  /* Currently, there is no need to check $r28~$r31
> +     because we will save them in another way.  */
> +  for (r = 0; r < 28; r++)
> +    {
> +      if (NDS32_REQUIRED_CALLEE_SAVED_P (r))
> +	{
> +	  /* Each register is 4 bytes.  */
> +	  cfun->machine->callee_saved_regs_size += 4;
> +
> +	  /* Mark the first required callee-saved register
> +	     (only need to set it once).
> +	     If first regno == SP_REGNUM, we can tell that
> +	     it is the first time to be here.  */
> +	  if (cfun->machine->callee_saved_regs_first_regno == SP_REGNUM)
> +	    cfun->machine->callee_saved_regs_first_regno = r;
> +	  /* Mark the last required callee-saved register.  */
> +	  cfun->machine->callee_saved_regs_last_regno = r;
> +	}
> +    }
> +
> +  /* Note:
> +     Since our smw/lmw instructions use Rb and Re
> +     to store/load registers consecutively,
> +     we need to check again to see if there is any register
> +     which is not live in the function but in the range
> +     between 'callee_saved_regs_first_regno' and 'callee_save_regs_last_regno'.
> +     If we find that registers, save their register size
> +     in the 'callee_saved_regs_size' for prologue and epilogue.
> +
> +     For example:
> +     Assume that the registers $r6, $r7, $r8, $r10, and $r11 are live
> +     in the function, so initially callee_saved_regs_size = 4 * 5 = 20.
> +     However, although $r9 is not live in the function,
> +     it is in the range between $r6 and $r11.
> +     We have to increase callee_saved_regs_size so that
> +     prologue and epilogue can use it to issue
> +     smw/lmw instructions and adjust the offset correctly.  */
> +  for (r =  cfun->machine->callee_saved_regs_first_regno;
> +       r <= cfun->machine->callee_saved_regs_last_regno;
> +       r++)
> +    {
> +      if (!df_regs_ever_live_p (r)
> +	  && r > cfun->machine->callee_saved_regs_first_regno
> +	  && r < cfun->machine->callee_saved_regs_last_regno)
> +	{
> +	  /* Found one register which is not live in the function
> +	     but in the range between first_regno and last_regno.  */
> +	  cfun->machine->callee_saved_regs_size += 4;
> +	}
> +    }

I wasn't sure here why you were setting callee_saved_regs_size in the
first loop and then had the second loop to handle the others.  Couldn't
this just be done by moving:

> +  /* Adjustment for v3push instructions:
> +     If we are using v3push (push25/pop25) instructions,
> +     we need to make sure Rb is $r6 and Re is
> +     located on $r6, $r8, $r10, or $r14.
> +     Some results above will be discarded and recomputed.
> +     Note that it is only available under V3/V3M ISA.  */
> +  if (TARGET_V3PUSH)
> +    {
> +      /* Recompute:
> +           cfun->machine->fp_size
> +           cfun->machine->gp_size
> +           cfun->machine->lp_size
> +           cfun->machine->callee_saved_regs_first_regno
> +           cfun->machine->callee_saved_regs_last_regno
> +           cfun->machine->callee_saved_regs_size */
> +
> +      /* For v3push instructions, $fp, $gp, and $lp are always saved.  */
> +      cfun->machine->fp_size = 4;
> +      cfun->machine->gp_size = 4;
> +      cfun->machine->lp_size = 4;
> +
> +      /* Remember to set Rb = $r6.  */
> +      cfun->machine->callee_saved_regs_first_regno = 6;
> +
> +      if (cfun->machine->callee_saved_regs_last_regno <= 6)
> +	{
> +	  /* Re = $r6 */
> +	  cfun->machine->callee_saved_regs_last_regno = 6;
> +	}
> +      else if (cfun->machine->callee_saved_regs_last_regno <= 8)
> +	{
> +	  /* Re = $r8 */
> +	  cfun->machine->callee_saved_regs_last_regno = 8;
> +	}
> +      else if (cfun->machine->callee_saved_regs_last_regno <= 10)
> +	{
> +	  /* Re = $r10 */
> +	  cfun->machine->callee_saved_regs_last_regno = 10;
> +	}
> +      else if (cfun->machine->callee_saved_regs_last_regno <= 14)
> +	{
> +	  /* Re = $r14 */
> +	  cfun->machine->callee_saved_regs_last_regno = 14;
> +	}
> +      else if (cfun->machine->callee_saved_regs_last_regno == SP_REGNUM)
> +	{
> +	  /* If last_regno is SP_REGNUM, which means
> +	     it is never changed, so set it to Re = $r6.  */
> +	  cfun->machine->callee_saved_regs_last_regno = 6;
> +	}
> +      else
> +	{
> +	  /* The program flow should not go here.  */
> +	  gcc_unreachable ();
> +	}
> +
> +      /* Compute pushed size of callee-saved registers.  */
> +      cfun->machine->callee_saved_regs_size
> +	= 4 * (cfun->machine->callee_saved_regs_last_regno
> +	       - cfun->machine->callee_saved_regs_first_regno
> +	       + 1);

...this last statement out of the "if" block and making it
conditional on "cfun->machine->callee_saved_regs_last_regno != SP_REGNUM"?

> +/* Function to create a parallel rtx pattern
> +   which presents stack push multiple behavior.
> +   The overall concept are:
> +     "unspec_stack_push_multiple",
> +     "use/clobber stack pointer",
> +     "use Rb to Re", and "use $fp, $gp, $lp".  */
> +static rtx
> +nds32_gen_stack_push_multiple (rtx Rb, rtx Re, rtx En4)
> +{
> +  int regno;
> +  int extra_count;
> +  int num_use_regs;
> +  int par_index;
> +
> +  rtx reg;
> +  rtx parallel_insn;
> +
> +  /* We need to provide a customized rtx which contains
> +     necessary information for data analysis,
> +     so we create a parallel rtx like this:
> +     (parallel [(unspec [(reg: Rb)
> +                         (reg: Re)
> +                         (const_int En4)] UNSPEC_STACK_PUSH_MULTIPLE)
> +                (use (reg:SI SP_REGNUM))
> +                (clobber (reg:SI SP_REGNUM))
> +                (use (reg:SI Rb))
> +                (use (reg:SI Rb+1))
> +                ...
> +                (use (reg:SI Re))
> +                (use (reg:SI FP_REGNUM))
> +                (use (reg:SI GP_REGNUM))
> +                (use (reg:SI LP_REGNUM))]) */

I know it's a pain to implement, but the pattern should really model the
memory too, like you do for the call-frame information.  I.e.:

  (parallel [(set (mem:SI ...)
                  (reg:SI Rb))
             (set (mem:SI ...)
                  (reg:SI Rb+1))
             ...])

Same idea for pop.  (It might be that you don't need separate call-frame
info after that change.)

> +/* Function to construct isr vectors information array.
> +   We need to check:
> +     1. Traverse interrupt/exception/reset for settig vector id.
> +     2. Only 'nested', 'not_nested', or 'nested_ready' in the attributes.
> +     3. Only 'save_all' or 'partial_save' in the attributes.  */
> +static void
> +nds32_construct_isr_vectors_information (tree func_attrs,
> +					 const char *func_name)
> +{
> +  int nested_p, not_nested_p, nested_ready_p;
> +  int save_all_p, partial_save_p;
> +  tree intr, excp, reset;
> +  int temp_count;
> +
> +  nested_p = not_nested_p = nested_ready_p = 0;
> +  save_all_p = partial_save_p = 0;
> +  temp_count = 0;
> +
> +  /* We must check at MOST one attribute to set save-reg.  */
> +  if (lookup_attribute ("save_all", func_attrs))
> +    save_all_p = 1;
> +  if (lookup_attribute ("partial_save", func_attrs))
> +    partial_save_p = 1;
> +
> +  if ((save_all_p + partial_save_p) > 1)
> +    error ("multiple save reg attributes to function %qs", func_name);

I think this kind of error is really supposed to be reported in
TARGET_MERGE_DECL_ATTRIBUTES, where it can include full location
information.  Same for the other checks.

> +	  else
> +	    {
> +	      /* Issue error if it is not a valid integer value.  */
> +	      error ("invalid id value for interrupt/exception attribute");
> +	    }

...and I think this kind of argument checking is supposed to go in
TARGET_INSERT_ATTRIBUTES.

> +      /* Create one more instruction to move value
> +         into the temporary register.  */
> +      value_move_insn = gen_movsi (tmp_reg, GEN_INT (full_value));
> +      /* Emit rtx into insn list and receive its transformed insn rtx.  */
> +      value_move_insn = emit_insn (value_move_insn);

This can be simplified to:

  value_move_insn = emit_move_insn (tmp_reg, GEN_INT (full_value));

which is more usual than calling gen_mov* directly.

> +/* Return true if MODE/TYPE need double word alignment.  */
> +static bool
> +nds32_needs_double_word_align (enum machine_mode mode, const_tree type)
> +{
> +  return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
> +	  || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
> +}

For something ABI-related like this, it's usually better not to look at
"mode" when "type" is nonnull, since the type directly reflects the
source language whereas "mode" is more of an internal detail.  E.g..
something like:

  (type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode)) > PARM_BOUNDARY

> +/* Function that check if 'INDEX' is valid to be a index rtx for address.
> +
> +   OUTER_MODE : Machine mode of outer address rtx.
> +   OUTER_CODE : rtx code of outer address rtx.
> +        INDEX : Check if this rtx is valid to be a index for address.
> +         BASE : The rtx of base register.
> +       STRICT : If it is true, we are in reload pass or after reload pass.  */
> +static bool
> +nds32_legitimate_index_p (enum machine_mode outer_mode,
> +			  enum rtx_code outer_code ATTRIBUTE_UNUSED,
> +			  rtx index,
> +			  rtx base ATTRIBUTE_UNUSED,
> +			  bool strict)

Minor, but since this is an internal function, it'd probably be better
to remove the unused parameters.  Also.

> +	  /* Further check if the value is legal for the 'outer_mode'.  */
> +	  if (!satisfies_constraint_Is15 (index))
> +	    goto non_legitimate_index;
...
> +non_legitimate_index:
> +  return false;

...better to just use "return false" here.  (I'm guessing this code
dates back to the old GO_IF_LEGITIMATE_ADDRESS days.)

> +/* Function to expand builtin function for
> +   '[(unspec [(reg)])]'.  */

Plain unspecs are pretty dangerous, since there's nothing really
to anchor them to a particular position.  I've not yet looked through
the .md file to see what the patterns actually look like though.

> +  /* Determine the modes of the instruction operands.
> +     Note that we don't have left-hand-side result,
> +     so operands[0] IS FOR arg0.  */
> +  mode0 = insn_data[icode].operand[0].mode;
> +
> +  /* Refer to nds32.intrinsic.md,
> +     we want to ensure operands[0] is register_operand.  */
> +  if (!((*insn_data[icode].operand[0].predicate) (op0, mode0)))
> +    op0 = copy_to_mode_reg (mode0, op0);
> +
> +  /* Emit new instruction and return original target.  */
> +  pat = GEN_FCN (icode) (op0);
> +
> +  if (!pat)
> +    return 0;
> +
> +  emit_insn (pat);
> +
> +  return target;

Please use the create_*_operand/maybe_expand_insn routines from optabs.h
for this kind of thing.

> +/* Permitting tail calls.  */
> +
> +static bool
> +nds32_warn_func_return (tree decl)
> +{
> +  /* Naked functions are implemented entirely in assembly, including the
> +     return sequence, so suppress warnings about this.  */
> +  return !nds32_naked_function_p (decl);
> +}

The comment above the function doesn't seem to match the function.

> +  /* Instruction-cache sync instruction,
> +     requesting an arugment as staring address.  */
> +  rtx isync_insn;

Typo: "argument as starting address".

> +      /* Now check [reg], [symbol_ref], and [const].  */
> +      if (GET_CODE (x) != REG
> +	  && GET_CODE (x) != SYMBOL_REF
> +	  && GET_CODE (x) != CONST)
> +	goto non_legitimate_address;
> [...]
> +non_legitimate_address:
> +  return false;

As above, no need for the gotos.  Just return false directly.

> +    case PLUS:
> +      /* virtual_stack_vars_rtx will eventually transfer to SP or FP
> +         force [virtual_stack_vars + reg or const] to register first,
> +         make the offset_address combination to
> +         other addressing mode possible.  */
> +      if (mode == BLKmode
> +	  && REG_P (XEXP (x, 0))
> +	  && (REGNO (XEXP (x, 0)) == VIRTUAL_STACK_VARS_REGNUM))
> +	goto non_legitimate_address;

I don't understand this, sorry.  Is it an optimisation, or needed for
correctness?

> +static rtx
> +nds32_legitimize_address (rtx x,
> +			  rtx oldx ATTRIBUTE_UNUSED,
> +			  enum machine_mode mode ATTRIBUTE_UNUSED)
> +{
> +  /* 'mode' is the machine mode of memory operand
> +     that uses 'x' as address.  */
> +
> +  return x;
> +}

There's no need to define a dummy function here.  Just leave the default
TARGET_LEGITIMIZE_ADDRESS in place.

> +  /* If fp$, $gp, $lp, and all callee-save registers are NOT required

Typo: $fp.  Another instance later.

> +  /* If the function is 'naked', we do not have to generate
> +     epilogue code fragment BUT 'ret' instruction.  */
> +  if (cfun->machine->naked_p)
> +    {
> +      /* Generate return instruction by using unspec_func_return pattern.
> +         Make sure this instruction is after gen_blockage().
> +         NOTE that $lp will become 'live'
> +         after this instruction has been emitted.  */
> +      emit_insn (gen_unspec_func_return ());
> +      return;
> +    }

FWIW, this is different from AVR (for one), which explicitly doesn't emit a
return instruction.  One of the uses for AVR is to chain code together
into an .init-like section.

> +  if (frame_pointer_needed)
> +    {
> +      /* adjust $sp = $fp - ($fp size) - ($gp size) - ($lp size)
> +       *                  - (4 * callee-saved-registers)
> +       *
> +       * Note: No need to adjust
> +       *       cfun->machine->callee_saved_area_padding_bytes,
> +       *       because we want to adjust stack pointer
> +       *       to the position for pop instruction.
> +       */

There are a few comments that have this style, but GNU layout is the:

      /* ~~~~~~~~~~~~~~~
         ~~~~~~~~~~~~~~~.  */

style you use in most places.

> +  /* Caculate the number of 1-bit of (~ival), if there is only one 1-bit,

Typo: Calculate

> +     it means the original ival has only one 0-bit,
> +     So it is ok to perform 'bclr' operation.  */
> +
> +  one_bit_count = __builtin_popcount ((unsigned int) (~ival));
> +
> +  /* 'bclr' is a performance extension instruction.  */
> +  return (TARGET_PERF_EXT && (one_bit_count == 1));

We shouldn't really call __builtin_popcount directly.  You could just
use popcount_hwi, or the (x & -x) == x thing.

Redundant brackets around the ==.  There are a few other cases in the
patch too, although most of it seems OK.

> +      switch (mode_test)
> +	{
> +	case QImode:
> +	  /* 333 format.  */
> +	  if (val >=0 && val < 8 && regno < 8)
> +	    return ADDRESS_LO_REG_IMM3U;
> +	  break;
> +
> +	case HImode:
> +	  /* 333 format.  */
> +	  if (val >=0 && val < 16 && (val % 2 == 0) && regno < 8)
> +	    return ADDRESS_LO_REG_IMM3U;

Missing space before "0".  A few other cases in the same function.

> +   The trampoline code for nds32 target must contains folloing parts:

Typo: following.

Thanks,
Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]