[PATCH 1/3] Power10: Add PCREL_OPT load support

Segher Boessenkool segher@kernel.crashing.org
Fri Aug 21 02:09:40 GMT 2020


Hi!

On Tue, Aug 18, 2020 at 02:34:01AM -0400, Michael Meissner wrote:
> +// Maximum number of insns to scan between the load address and the load that

Please don't mix /* and // comments.  Just stick to /* comments, like
all the rest of our backend?

> +const int MAX_PCREL_OPT_INSNS	= 10;

"static const", and not tab please (just a space).

> +  // LWA is a DS format instruction, but LWZ is a D format instruction.  We use
> +  // DImode for the mode to force checking whether the bottom 2 bits are 0.

How so?  Can't you just check if the resulting insn is accepted? That
would be so much more robust (and can have a correct description more
easily, too!)

> +  // However FPR and vector registers uses the LFIWAX instruction which is
> +  // indexed only.

(vectors use lxsiwax)

> +  if (GET_CODE (mem) == SIGN_EXTEND && GET_MODE (XEXP (mem, 0)) == SImode)

You're checking here whether the address of the MEM is SImode.

> +    {
> +      if (!INT_REGNO_P (reg_regno))

That should use base_reg_operand instead?

> +  // The optimization will only work on non-prefixed offsettable loads.
> +  rtx addr = XEXP (mem_inner, 0);
> +  enum insn_form iform = address_to_insn_form (addr, mem_mode, non_prefixed);
> +  if (iform != INSN_FORM_BASE_REG
> +      && iform != INSN_FORM_D
> +      && iform != INSN_FORM_DS
> +      && iform != INSN_FORM_DQ)
> +    return false;

Sounds like something for a utility function?  Do we use this elsewhere?

> +  ++pcrel_opt_next_num;

Normal style is postfix increment.  Empty lines around this are nice, it
indicates this is the point where we decided to do the thing.  Or add a
comment even, maybe?

> +  unsigned int addr_regno = reg_or_subregno (addr_reg);
> +  rtx label_num = GEN_INT (pcrel_opt_next_num);
> +  rtx reg_di = gen_rtx_REG (DImode, reg_regno);
> +
> +  PATTERN (addr_insn)
> +    = ((addr_regno != reg_regno)
> +       ? gen_pcrel_opt_ld_addr (addr_reg, addr_symbol, label_num, reg_di)
> +       : gen_pcrel_opt_ld_addr_same_reg (addr_reg, addr_symbol, label_num));

Please use if() instead of multi-line expressions.  The outer parens are
unnecessary, too.

> +  // Revalidate the insn, backing out of the optimization if the insn is not
> +  // supported.

So the count will be incorrect then?

> +  INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0);
> +  if (INSN_CODE (addr_insn) < 0)
> +    {
> +      PATTERN (addr_insn) = addr_set;
> +      INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0);
> +      return false;
> +    }

Can you use the normal undo mechanisms, instead?  apply_change_group
etc.

> +static rtx_insn *
> +next_active_insn_in_basic_block (rtx_insn *insn)
> +{
> +  insn = NEXT_INSN (insn);
> +
> +  while (insn != NULL_RTX)
> +    {
> +      /* If the basic block ends or there is a jump of some kind, exit the
> +	 loop.  */
> +      if (CALL_P (insn)
> +	  || JUMP_P (insn)
> +	  || JUMP_TABLE_DATA_P (insn)
> +	  || LABEL_P (insn)
> +	  || BARRIER_P (insn))
> +	return NULL;
> +
> +      /* If this is a real insn, return it.  */
> +      if (!insn->deleted ()
> +	  && NONJUMP_INSN_P (insn)
> +	  && GET_CODE (PATTERN (insn)) != USE
> +	  && GET_CODE (PATTERN (insn)) != CLOBBER)
> +	return insn;
> +
> +      /* Loop for USE, CLOBBER, DEBUG_INSN, NOTEs.  */
> +      insn = NEXT_INSN (insn);
> +    }
> +
> +  return NULL;
> +}

There are things to do this.  Please don't write code manually parsing
RTL unless you have to.

> +// Validate that a load is actually a single instruction that can be optimized
> +// with the PCREL_OPT optimization.
> +
> +static bool
> +is_single_instruction (rtx_insn *insn, rtx reg)

Of course it is a single instruction!  A single RTL instruction...
Clarify as "single machine instruction"?

> +{
> +  if (!REG_P (reg) && !SUBREG_P (reg))
> +    return false;

You need to check if is a subreg of reg, then.  There are subregs of
other things.  Not that you should see those here, but still.

> +  // _Decimal128 and IBM extended double are always multiple instructions.
> +  machine_mode mode = GET_MODE (reg);
> +  if (mode == TFmode && !TARGET_IEEEQUAD)
> +    return false;
> +
> +  if (mode == TDmode || mode == IFmode)
> +    return false;

Don't we have a helper for this?  The ibm128 part at least.

Maybe we should just have an attribute on the insns that *can* get this
optimisation?

> +  return (base_reg_operand (XEXP (addr, 0), Pmode)
> +	  && satisfies_constraint_I (XEXP (addr, 1)));

short_cint_operand.  But maybe just rs6000_legitimate_offset_address_p?

>  /* Flags that need to be turned off if -mno-power10.  */
>  #define OTHER_POWER10_MASKS	(OPTION_MASK_MMA			\
>  				 | OPTION_MASK_PCREL			\
> +				 | OPTION_MASK_PCREL_OPT		\
>  				 | OPTION_MASK_PREFIXED)

This isn't a CPU flag.  Instead, it is just an optimisation option.

> @@ -8515,7 +8522,10 @@ rs6000_delegitimize_address (rtx orig_x)
>  {
>    rtx x, y, offset;
>  
> -  if (GET_CODE (orig_x) == UNSPEC && XINT (orig_x, 1) == UNSPEC_FUSION_GPR)
> +  if (GET_CODE (orig_x) == UNSPEC
> +      && (XINT (orig_x, 1) == UNSPEC_FUSION_GPR
> +	  || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR
> +	  || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG))
>      orig_x = XVECEXP (orig_x, 0, 0);
>  
>    orig_x = delegitimize_mem_from_attrs (orig_x);

Why this?  It needs an explanation (in the code).

> @@ -13197,6 +13207,19 @@ print_operand (FILE *file, rtx x, int code)
>  	fprintf (file, "%d", 128 >> (REGNO (x) - CR0_REGNO));
>        return;
>  
> +    case 'r':
> +      /* X is a label number for the PCREL_OPT optimization.  Emit the .reloc
> +	 to enable this optimization, unless the value is 0.  */
> +      gcc_assert (CONST_INT_P (x));
> +      if (UINTVAL (x) != 0)
> +	{
> +	  unsigned int label_num = UINTVAL (x);
> +	  fprintf (file,
> +		   ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t",
> +		   label_num, label_num);
> +	}
> +      return;

Don't eat output modifiers please.  We have only so few left, and we
cannot recycle any more without pain.

I don't see why we cannot just do this in the normal output (C) code of
the one or few insns that want this?

>  ;; The ISA we implement.
> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,pcrel_opt"
>    (const_string "any"))

No.  Please read the heading.


Segher


More information about the Gcc-patches mailing list