This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH v2, rs6000] Replace X-form addressing with D-form addressing in new pass for Power 9

From: Segher Boessenkool <segher at kernel dot crashing dot org>
To: Kelvin Nilsen <kdnilsen at linux dot ibm dot com>
Cc: gcc-patches List <gcc-patches at gcc dot gnu dot org>
Date: Thu, 26 Sep 2019 19:27:12 -0500
Subject: Re: [PATCH v2, rs6000] Replace X-form addressing with D-form addressing in new pass for Power 9
References: <37f07932-ed2f-e440-28ee-8f6094f589dc@linux.ibm.com>

Hi Kelvin,

Sorry for the slow review.

On Tue, Sep 03, 2019 at 03:10:09PM -0500, Kelvin Nilsen wrote:
> This new pass scans existing rtl expressions and replaces them with rtl expressions that favor selection of the D-form instructions in contexts for which the D-form instructions are preferred.  The new pass runs after the RTL loop optimizations since loop unrolling often introduces opportunities for beneficial replacements of X-form addressing instructions.

And that is before the "big" RTL passes (cprop, CSE, combine, and related).
Okay.

> --- gcc/config/rs6000/rs6000-p9dform.c	(nonexistent)
> +++ gcc/config/rs6000/rs6000-p9dform.c	(working copy)
> @@ -0,0 +1,1487 @@
> +/* Subroutines used to transform array subscripting expressions into
> +   forms that are more amenable to d-form instruction selection for p9
> +   little-endian VSX code.
> +   Copyright (C) 1991-2018 Free Software Foundation, Inc.

(Year needs updating here).

> +   This pass runs immediately after pass_loops2.  Loops have already

(typo, pass_loop2)

> +/* This is based on the union-find logic in web.c.  web_entry_base is
> +   defined in df.h.  */
> +class indexing_web_entry : public web_entry_base
> +{
> + public:

Most places have no extra leading space here.  Not sure what the coding
standards say?

> +static int count_links (struct df_link *def_link)
> +{
> +  int result;
> +  for (result = 0; def_link != NULL; result++)
> +    def_link = def_link->next;
> +  return result;
> +}

Maybe this belongs in more generic code?  Or as inline in df.h even?

> +
> +  int result = 0;
> +  for (i = 0; i < count; i++)
> +    {
> +      result = (result << 6) | ((result & 0xf000000) >> 28);

Do you want an extra zero in the constant there?  The expression  is 0
as written.

Shifting a signed (possibly negative!) number to the right isn't a great
idea in any case.

Maybe you want to use some standard hash, instead?

> +	  /* if there is no def, or the def is artificial,
> +	     or there are multiple defs, this is an originating
> +	     use.  */

(Sentence should start with a capital).

> +	      unsigned int uid2 =
> +		insn_entry[uid].original_base_use;

The = goes on the second line, instead.

> +  rtx mem = NULL;
> +  if (GET_CODE (body) == SET)
> +    {
> +      if (MEM_P (SET_SRC (body)))
> +	mem = XEXP (SET_SRC (body), 0);
> +      else if (MEM_P (SET_DEST (body)))
> +	mem = XEXP (SET_DEST (body), 0);
> +    }
> +  /* Otherwise, this is neither load or store, so leave mem as NULL.  */
> +
> +  if (mem != NULL)
> +    {

You could instead say

  if (!mem)
    return;

and then you don't need the comment or the "if" after it.  Or even:

  if (GET_CODE (body) != SET)
    return;

  rtx mem;
  if (MEM_P (SET_SRC (body)))
    mem = XEXP (SET_SRC (body), 0);
  else if (MEM_P (SET_DEST (body)))
    mem = XEXP (SET_DEST (body), 0);
  else
    return;

> +    {
> +      rtx def_insn = DF_REF_INSN (def_link->ref);
> +      /* unsigned uid = INSN_UID (def_insn); not needed? */

Delete this line?  Or is it useful still :-)

> +      if (GET_CODE (body) == CONST_INT)
> +	return true;

if (CONST_INT_P (body)) ...

> +static bool
> +in_use_list (struct df_link *list, struct df_link *element)
> +{
> +  while (list != NULL)
> +    {
> +      if (element->ref == list->ref)
> +	return true;
> +      list = list->next;
> +    }
> +  /* Got to end of list without finding element.  */
> +  return false;
> +}

This kind of function makes me scared of quadratic complexity somewhere.
I didn't check, but it seems likely :-/

> +  if ((insn_entry[uid_1].base_equivalence_hash !=
> +       insn_entry[uid_2].base_equivalence_hash) ||
> +      (insn_entry[uid_1].index_equivalence_hash !=
> +       insn_entry[uid_2].index_equivalence_hash))
> +    return false;
> +  else				/* Hash codes match.  Check details.  */

After a return you don't need an else.  Reduce indentation, instead :-)

> +/* Return true iff instruction I2 dominates instruction I1.  */
> +static bool
> +insn_dominated_by_p (rtx_insn *i1, rtx_insn *i2)
> +{
> +  basic_block bb1 = BLOCK_FOR_INSN (i1);
> +  basic_block bb2 = BLOCK_FOR_INSN (i2);
> +
> +  if (bb1 == bb2)
> +    {
> +      for (rtx_insn *i = i2;
> +	   (i != NULL) && (BLOCK_FOR_INSN (i) == bb1); i = NEXT_INSN (i))

How can it ever be NULL?

> +	if (i == i1)
> +	  return true;
> +      return false;
> +    }
> +  else
> +    return dominated_by_p (CDI_DOMINATORS, bb1, bb2);
> +}

IRA already has an insn_dominated_by_p function, which is constant time
(it does require some extra init for uid_luid[]).

> +	  int hash = ((insn_entry [uid].base_equivalence_hash +

No + at the end of line, either.  The check_GNU_style script should
check for this (not sure if it actually does, YMMV).

> +  /* Since this pass creates new instructions, get_max_uid () may
> +     return different values at different times during this pass.  The
> +     insn_entry array does not represent any of the instructions newly
> +     inserted during this pass.  */
> +  max_uid_at_start = get_max_uid ();
> +  insn_entry = XCNEWVEC (indexing_web_entry, max_uid_at_start);

Maybe you should have some assert checking you do not try to deref that
array out of bounds somewhere?

> +	  /*
> +	   * First, look for all memory[base + index] expressions.
> +	   * Then group these by base.

That's not the comment style GCC uses (no leading stars or other frills).

> --- gcc/config/rs6000/rs6000-passes.def	(revision 275051)
> +++ gcc/config/rs6000/rs6000-passes.def	(working copy)
> @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
>     INSERT_PASS_AFTER (PASS, INSTANCE, TGT_PASS)
>     INSERT_PASS_BEFORE (PASS, INSTANCE, TGT_PASS)
>     REPLACE_PASS (PASS, INSTANCE, TGT_PASS)
> +   Be advised: gawk program does not parse C comments if inserted below.
>   */

Heh, you were hit by that as well...  Annoying / confusing, right.  Maybe
the script can be easily improved?

> +/* This function provides an approximation of which d-form addressing
> +   expressions are valid on any given target configuration.  This
> +   approximation guides optimization choices.  Secondary validation
> +   of the addressing mode is performed before code generation.
> +
> +   Return true iff target has instructions to perform a memory
> +   operation at the specified BYTE_OFFSET from an address held
> +   in a general purpose register.  */

Mike's patch series will conflict with this.

> +bool
> +rs6000_target_supports_dform_offset_p (machine_mode mode,
> +				       HOST_WIDE_INT byte_offset)
> +{
> +  const HOST_WIDE_INT max_16bit_signed = (0x7fffffff);

Really?  Not 0x7fff?

> +  /* Available d-form instructionw with v2.03:

(typo)

> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" "-mdejagnu-cpu=*" } { "-mdejagnu-cpu=power9" } } */

-mdejagnu-cpu= makes that whole "skip" business unnecessary.

> +/* { dg-options "-O3 -mdejagnu-cpu=power9 -mtune=power9 -funroll-loops" } */

No -mtune= is needed.

> +/* { dg-do compile { target { powerpc*-*-* } } } */

Everything in gcc.target/powerpc/ is target powerpc*-*-*; you don't have
to say that in testcases.

I'm skipping the rest of the testcases for now.


So, okay, it looks like it works :-)  And I have confidence the testcases
will exercise most of the code.  But I have two main concerns.

First, some of the code is quadratic.  This is esp. bad for things that
iterate over insns in a basic block, since you easily can have 10k or
more insns in a bb.  But it isn't good for iterating over bbs either, you
can have many of those as well.  And df_link's, etc.

Second, the main core of this is a few huge routines, with many nested
loops and conditionals.  This is hard to follow.


Can you fix those things?  (And the comments above, most of which are nits,
but most of those apply many times).


Segher

References:
- [PATCH v2, rs6000] Replace X-form addressing with D-form addressing in new pass for Power 9
  - From: Kelvin Nilsen

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]