[PATCH,rs6000] Combine patterns for p10 load-cmpi fusion

Segher Boessenkool segher@kernel.crashing.org
Tue Jan 26 00:51:21 GMT 2021


Hi!

On Fri, Dec 04, 2020 at 01:19:11PM -0600, acsawdey@linux.ibm.com wrote:
> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that that power10 can fuse and execute. These particular ones have the
> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.

> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.

> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have make
> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.

> 	* config/rs6000/genfusion.pl: New file, script to generate
> 	define_insn_and_split patterns so combine can arrange fused
> 	instructions next to each other.
> 	* config/rs6000/fusion.md: New file, generated fused instruction
> 	patterns for combine.

So this script is never run by any target, you have to do it all
manually?  Okay.

> 	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
> 	(non_update_memory_operand): New predicate.
> 	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> 	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
> 	POWERPC_MASKS.
> 	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
> 	prototype.
> 	* config/rs6000/rs6000.c (rs6000_option_override_internal):
> 	automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>  	if target is power10.

Capital A.  And, you do not set any "-m", you set OPTION_MASK_P10_FUSION,
instead.

Every new entry starts on a new line:

	(rs600_opt_masks): Allow -mpower10-fusion in function attributes.
	(address_is_non_pfx_d_or_x): New function.

> 	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
> 	* config/rs6000/rs6000.md: Include fusion.md.
> 	* config/rs6000/rs6000.opt: Add -mpower10-fusion
> 	and -mpower10-fusion-ld-cmpi.
> 	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.

> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-

Don't do these things please.  You can make the file r/o if you really
want to, that works for all (sane) editors (and works fine in Git).

> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))

The indent here is wrong?  Just use all spaces, that is fine, and easier
than getting tabs right in a generated file.

> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"

I would make TARGET_P10_FUSION_LD_CMPI imply TARGET_P10_FUSION, that
makes all further code simpler, no?

> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"

Space after comma.  Line too long (I think you can easily break it in the
source?  If not, no man overboard :-) )

> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]

(Here, the line before where you use a tab should have one, already.)

If you write the first arm on one line, than why not this one (or at
least start the rhs on the same line as the lhs, etc.)

> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"

This should use
  cmpldi %2,%0,%3
(not all assemblers allow bare "0", some require "cr0").

> --- /dev/null
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -0,0 +1,144 @@
> +#!/usr/bin/perl -w

Since Perl 5.6, you can use
  use warnings;
instead of -w.

Also:
  use strict;
(which will catch a few errors you made; nothing super serious, but
still).

> +# Generate fusion.md 

Trailing space.  Sentences end with a full stop; if you do not like that
here, maybe just add an empty line after it so it is clear it is a
heading?  Not a terrible idea in the first place :-)

> +# Copyright (C) 2020 Free Software Foundation, Inc.

Don't forget to update the year :-)

> +my $copyright =  <<'EOF';
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +EOF
> +
> +print $copyright;

You can just

print <<'EOF';
bla
bla
bla
EOF

> +sub mode_to_ldst_char
> +{
> +    my ($mode) = @_;
> +    if ($mode eq 'DI') { return 'd'; }
> +    if ($mode eq 'SI') { return 'w'; }
> +    if ($mode eq 'HI') { return 'h'; }
> +    if ($mode eq 'QI') { return 'b'; }
> +    return '?';
> +}

That invites using a hash:
  my %x = (DI => 'd', SI => 'w', HI => 'h', QI => 'b');
(or even  %x = qw(DI d  SI w  HI h  QI b);  )
  return $x{$mode} if exists $x{$mode};
  return '?';

> +sub gen_ld_cmpi_p10
> +{
> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {

Here, you haven't declared $lmode; strict will error about that (use
"my $lmode").

> +      $ldst = mode_to_ldst_char($lmode);
> +      $clobbermode = $lmode;
> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.

(line too long)

> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
> +	# EXTDI does not exist, and we cannot directly produce HI/QI results.
> +	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
> +	# Don't allow EXTQI because that would allow HI result which we can't do.
> +	if ( $result eq "EXTQI" ) { $result = "GPR"; }

No spaces inside parens please.

Other ways to write this are
  $result eq "EXTQI" and $result = "GPR";
and
  $result = "GPR" if $result eq "EXTQI";
both of which are a bit easier to read.

> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
> +	  $np = "NON_PREFIXED_D";
> +	  if ( $ccmode eq 'CC' ) {
> +	      next CCMODE if $lmode eq 'QI';
> +	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +		  # ld and lwa are both DS-FORM.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "";
> +	      $echr = "a";
> +	      $constpred = "const_m1_to_1_operand";
> +	  } else {
> +	      if ( $lmode eq 'DI' ) {
> +		  # ld is DS-form, but lwz is not.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "l";
> +	      $echr = "z";
> +	      $constpred = "const_0_to_1_operand";
> +	  }
> +	  if ($lmode eq 'DI') { $echr = ""; }
> +	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {

Please use a bit stricter regexp (maybe /^EXT/), too lax regexps will
always surprise you what all they can match, in the end :-)

> +	      # We always need extension if result > lmode.
> +	      if ( $ccmode eq 'CC' ) {
> +		  $extend = "sign";
> +	      } else {
> +		  $extend = "zero";
> +	      }
> +	  } else {
> +	      # Result of SI/DI does not need sign extension.
> +	      $extend = "none";
> +	  }
> +	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
> +	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
> +
> +	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";

You can use qq// if you need to include double quotes in a double-quoted
string, like
  print qq(Oh look, "some quotes", isn't it great!\n);
or
  print qq/Oh look, "some quotes", isn't it great!\n/;
or pretty much whatever other character you want as separator :-)

> +	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
> +	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
> +	  if ($result eq 'clobber') {
> +	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
> +	  } elsif ($result eq $lmode) {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
> +	  } else {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
> +	  }
> +	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
> +	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";

s/ 0/ %2/

> +	  print "  \"&& reload_completed\n";
> +	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
> +	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";

	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),\n";
	  print "                                      ${lmode}mode, ${np}))\"\n";

or something like that.

> +	  if ($extend eq "none") {
> +	      print "  [(set (match_dup 0) (match_dup 1))\n";
> +	  } else {
> +	      $resultmode = $result;
> +	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
> +	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
> +	  }
> +	  print "   (set (match_dup 2)\n";
> +	  print "        (compare:${ccmode} (match_dup 0)\n";
> +	  print "		    (match_dup 3)))]\n";
> +	  print "  \"\"\n";
> +	  print "  [(set_attr \"type\" \"load\")\n";
> +	  print "   (set_attr \"cost\" \"8\")\n";
> +	  print "   (set_attr \"length\" \"8\")])\n";
> +	  print "\n";

You can also use a here-document (<<) for long prints (you can
interpolate variables in that just fine if you use <<"HERE", i.e.
double-quote the terminator string).

Anyway, the only thing you really need to improve in the Perl code now
is "use strict;".  The rest you can do later :-)

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>    if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>      rs6000_isa_flags |= OPTION_MASK_MMA;
>  
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)

  if (TARGET_POWER10
      && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)

> +bool
> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +			   enum non_prefixed_form non_prefixed_format)
> +{
> +  enum insn_form result_form;
> +
> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
> +
> +  switch (non_prefixed_format)
> +    {
> +    case NON_PREFIXED_D:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_D:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}

"default: break;" always is superfluous.  Also, please "return false;"
everywhere you do "break" to just get there.

> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
>  	$(COMPILE) $<
>  	$(POSTCOMPILE)
>  
> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md

Ah, so you *do* generate it always.  Hrm, I'm not sure I like that,
certainly not now.  Comment out this line, and maybe enable it again in
stage 1?

Okay for trunk with those things taken into account.  Thank you!


Segher


More information about the Gcc-patches mailing list