[PATCH,rs6000] [v2] Optimize pcrel access of globals

Segher Boessenkool segher@kernel.crashing.org
Wed Feb 24 22:49:29 GMT 2021


On Mon, Feb 22, 2021 at 09:24:04PM -0600, acsawdey@linux.ibm.com wrote:
> This patch implements a RTL pass that looks for pc-relative loads of the
> address of an external variable using the PCREL_GOT relocation and a
> single load or store that uses that external address.
> 
> Produced by a cast of thousands:
>  * Michael Meissner
>  * Peter Bergner
>  * Bill Schmidt
>  * Alan Modra
>  * Segher Boessenkool
>  * Aaron Sawdey
> 
> This incorporates the changes requested in Segher's review. A few things I
> did not change were the insn-at-a-time scan that could be done with DF,

Yes, that would be a bigger (and possibly destabilising) change.

> and
> I did not change to using statistics.[ch] for the counters struct.

Okay.

> I did try
> to improve the naming, and rewrote a number of comments to make them consistent
> with the code, and generally tried to make things more readable.

Great, thank you!

> +(define_insn "*pcrel_opt_ld<mode>_gpr"
> +  [(parallel [(set (match_operand:PCRELOPT_GPR 0 "int_reg_operand" "+r")
> +		   (unspec:PCRELOPT_GPR [(match_operand:PCRELOPT_GPR 1 "d_form_memory" "m")

That last line is a bit long.  One thing you can do is break after the [
character (and then indent two chars).  It's a bit ugly, but if you see
no better alternative :-)

> +  return which_alternative ? "<PCRELOPT_VMX_LD> %0,%1" : "<PCRELOPT_FPR_LD> %0,%1";

Too long as well (but this is easy to break up, and that might be a good
idea even if it wasn't a too long line).

> +      switch (GET_CODE (mem))
> +	{
> +	case BSWAP:            /* LFIWAX/LFIWZX/STFIWX.  */
> +	case UNSPEC:
> +	case UNSPEC_VOLATILE:   /* Leave this alone for obvious reasons.  */
> +	case ROTATE:           /* lxvd2x.  */
> +	case VEC_SELECT:
> +	  return NULL_RTX;
> +	default: ;
> +	}
> +      if (GET_RTX_CLASS (GET_CODE (mem)) != RTX_UNARY)
> +	return NULL_RTX;

All of the above except BSWAP are matched by this class test.  So I
would suggest writing it as

      if (GET_RTX_CLASS (GET_CODE (mem)) != RTX_UNARY
	  || GET_CODE (mem) == BSWAP)
	return 0;

> +      /* Rule out LFIWAX/LFIWZX/STFIWX.  */
> +      if (GET_CODE (mem) == BSWAP)
> +	return NULL_RTX;

This is already handled.

> +static bool
> +insn_references_regno_p (rtx_insn *insn, unsigned int regno,
> +		       enum attr_type type)
> +{
> +  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> +  /* If we don't have the insn_info for some reason, do not attempt to optimize
> +     this reference.  */
> +  if (!insn_info)
> +    return true;

It is better to make this an assert, and explicitly handle the cases
that legitimately have no insn_info (if any).

> @@ -26213,11 +26291,34 @@ void
>  rs6000_asm_output_opcode (FILE *stream)
>  {
>    if (next_insn_prefixed_p)
> -    fprintf (stream, "p");
> +    {
> +      fprintf (stream, "p");
> +
> +      /* Reset the flag in the case where there are separate insn lines in the
> +	 sequence, so the 'p' is only emitted for the first line.  This shows up
> +	 when we are doing the PCREL_OPT optimization, in that the label created
> +	 with %r<n> would have a leading 'p' printed.  */
> +      next_insn_prefixed_p = false;
> +    }

So this part needs the symbol name changed, to say what this variable
*is*.  The other patch did that, the change will be trivial.

Other than that, this is all completely trivial.  It is in good enough
shape to merge now.  Thank you!  Let's see if Richi allows it into
mainline.


Segher


More information about the Gcc-patches mailing list