This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH, RFC] Expose target addressing modes before expand


Problem report 46556 (http://gcc.gnu.org/PR46556) identifies a code-size
regression for powerpc targets.  The root of the problem is changes in
address canonicalization a couple of years back.  Discussion in the PR
suggests some form of exposure of target addressing modes prior to the
expand phase.  Steven Bosscher (CCed) suggested this be done in a late
GIMPLE pass, using logic similar to what IVopts does for loops.

Following is an early prototype of such a pass.  I've verified that it
restores the regressed code to its earlier, more compact form, and most
of the addressing sequences on powerpc appear to be better or the same
as before.  There are at least some complex expressions where early
exposure produces worse code in the form of extra multiplies; I have
some ideas for how to tune that down so the shift-add machinery for
multiplies isn't derailed.  There are still 16 test cases that regress
on rev 169834 for powerpc, and I haven't examined other targets yet.
Clearly this is not yet ready, but before investing time into further
refinement, I'd like to get commentary on the general approach, as well
as any specific concerns.  Please bear in mind that I am new to GCC, so
I'm still familiarizing myself with the infrastructure and am likely to
miss some things that are obvious to the initiated...

The general approach is to rely on existing machinery used by
tree-ssa-loop-ivopts.c, simplifying out the parts that have to do with
induction variable optimizations.  The new logic is pretty simple, but
the interactions with later phases may not be.  For now, I've placed the
new pass in passes.c as the last -O1 tree-ssa pass.

In addition to general comments, I'd be interested in answers to some
specific questions:

1) Are there concerns with early exposure of target addressing nodes on
other targets?  Should the pass be gated to only operate on a subset of
targets?
2) Is the pass placement appropriate?
3) Have I missed any necessary fix-ups after modifying the GIMPLE code?

Thanks very much for your insights!

Bill


[gcc]
2011-02-04  William Schmidt  <wschmidt@linux.vnet.ibm.com>

        * tree.h (copy_ref_info): New prototype of existing function.
        * tree-pass.h (pass_tree_addr_mode): New declaration.
        * tree-ssa-loop-ivopts.c (may_be_unaligned_p, copy_ref_info):
        Remove static keyword.
        * timevar.def (TV_TREE_ADDR_MODE): New timevar definition.
        * tree-ssa-address.c (addr_to_parts): Change to handle null
        iv_cand parameter.
        * tree-ssa-addr-mode.c:  New.
        * tree-flow.h (may_be_unaligned_p): New prototype of existing
        function.
        * Makefile.in: Add build step for tree-ssa-addr-mode.o.
        * passes.c: Add pass for pass_tree_addr_mode.

Index: gcc/tree.h
===================================================================
--- gcc/tree.h  (revision 169834)
+++ gcc/tree.h  (working copy)
@@ -5580,6 +5580,9 @@
 extern tree tree_mem_ref_addr (tree, tree);
 extern void copy_mem_ref_info (tree, tree);
 
+/* In tree-ssa-loop-ivopts.c */
+extern void copy_ref_info (tree, tree);
+
 /* In tree-vrp.c */
 extern bool ssa_name_nonnegative_p (const_tree);
 
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h     (revision 169834)
+++ gcc/tree-pass.h     (working copy)
@@ -385,6 +385,7 @@
 extern struct gimple_opt_pass pass_parallelize_loops;
 extern struct gimple_opt_pass pass_loop_prefetch;
 extern struct gimple_opt_pass pass_iv_optimize;
+extern struct gimple_opt_pass pass_tree_addr_mode;
 extern struct gimple_opt_pass pass_tree_loop_done;
 extern struct gimple_opt_pass pass_ch;
 extern struct gimple_opt_pass pass_ccp;
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c  (revision 169834)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -1604,7 +1604,7 @@
 
 /* Returns true if memory reference REF with step STEP may be unaligned.  */
 
-static bool
+bool
 may_be_unaligned_p (tree ref, tree step)
 {
   tree base;
@@ -5916,7 +5916,7 @@
 
 /* Copies the reference information from OLD_REF to NEW_REF.  */
 
-static void
+void
 copy_ref_info (tree new_ref, tree old_ref)
 {
   tree new_ptr_base = NULL_TREE;
Index: gcc/timevar.def
===================================================================
--- gcc/timevar.def     (revision 169834)
+++ gcc/timevar.def     (working copy)
@@ -236,6 +236,7 @@
 DEFTIMEVAR (TV_TREE_UNINIT           , "uninit var analysis")
 DEFTIMEVAR (TV_PLUGIN_INIT           , "plugin initialization")
 DEFTIMEVAR (TV_PLUGIN_RUN            , "plugin execution")
+DEFTIMEVAR (TV_TREE_ADDR_MODE        , "expose addressing modes")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL          , "early local passes")
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c      (revision 169834)
+++ gcc/tree-ssa-address.c      (working copy)
@@ -631,7 +631,7 @@
      is <= 2 -- in that case, no loop invariant code motion can be
      exposed.  */
 
-  if (!base_hint && (addr->n > 2))
+  if (!base_hint && (addr->n > 2) && iv_cand)
     move_variant_to_index (parts, addr, iv_cand);
 
   /* First move the most expensive feasible multiplication
Index: gcc/tree-ssa-addr-mode.c
===================================================================
--- gcc/tree-ssa-addr-mode.c    (revision 0)
+++ gcc/tree-ssa-addr-mode.c    (revision 0)
@@ -0,0 +1,251 @@
+/* Expose target addressing modes in gimple representation.
+   Copyright (C) 2011
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* This pass performs a scan over the gimple representation to expose
+   target machine addressing in reference-class expressions.  This is
+   necessary for some targets for which the current address
+   canonicalization is suboptimal.  Induction variable optimizations
+   already expose target addressing for many (but not all) expressions
+   in loops, so this has most benefit in non-loop code.  Algorithms
+   used here are simplifications of those used in the induction
+   variable optimizations. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "basic-block.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "timevar.h"
+#include "cfgloop.h"
+#include "gimple.h"
+#include "tree-flow.h"
+#include "tree-affine.h"
+
+extern void copy_ref_info (tree, tree);
+
+static bool
+tree_contains_mem_ref_p (tree base)
+{
+  unsigned int i;
+  enum tree_code code;
+
+  if (!base)
+    return false;
+
+  code = TREE_CODE (base);
+  if (code == TARGET_MEM_REF || code == MEM_REF)
+    return true;
+
+  for (i = 0; i < TREE_CODE_LENGTH (code); i++)
+    if (tree_contains_mem_ref_p (TREE_OPERAND (base, i)))
+      return true;
+
+  return false;
+}
+
+/* Expose target addressing modes in EXPR.  Logic extracted and
+   simplified from tree-ssa-loop-ivopts.c to handle non-ivar
+   opportunities. */
+
+static bool
+tree_ssa_addr_mode_tree (tree *expr, gimple_stmt_iterator *gsi, bool speed)
+{
+  tree base = *expr, mem_ref;
+  struct pointer_map_t *cache;
+  aff_tree aff;
+
+  if (REFERENCE_CLASS_P (*expr))
+    {
+      /* If the expression is a virtual method table reference, don't attempt
+        this.  Incorrect code generation occurs; an example test case is
+        testsuite/g++.dg/opt/pr47366.C. */
+      if (TREE_CODE (base) == COMPONENT_REF
+         && DECL_VIRTUAL_P (TREE_OPERAND (base, 1)))
+       return false;
+
+      base = unshare_expr (base);
+
+      /*
+      if (TREE_CODE (base) == TARGET_MEM_REF)
+       {
+         tree type = build_pointer_type (TREE_TYPE (base));
+         base = tree_mem_ref_addr (type, base);
+       }
+      */
+      /* If the addressing mode is already exposed, let it be.
+         #### I am not certain this doesn't miss out on some 
+         opportunities, so just in case I'm leaving the above
+         code commented out in place for now.  But I think all this
+         does is make things more explicit that CSE can catch
+         anyhow, and in some cases it produces expressions that
+         the may-aliasing infrastructure can't handle. */
+      if (tree_contains_mem_ref_p (base))
+       return false;
+      else 
+       {
+         /* Check that the base expression is addressable. */
+         if (may_be_nonaddressable_p (base))
+           return false;
+
+         /* On strict alignment platforms, check that the base expression
+            is sufficiently aligned.  There is no additional "step"
+             information required, so specify single-byte alignment
+             for that parameter.  */
+         if (STRICT_ALIGNMENT && may_be_unaligned_p (base, size_one_node))
+           return false;
+
+         base = build_fold_addr_expr (base);
+       }
+
+      /* Express the address of the original expression as an affine
+         combination of trees, looking through SSA defs. */
+      cache = pointer_map_create ();
+      tree_to_aff_combination_expand (base, TREE_TYPE (base), &aff, &cache);
+      pointer_map_destroy (cache);
+
+      /* It is possible for aff to have no elements, in which case do
+        nothing.  Example: lhs = MEM[(struct foo *) 0B].bar does not
+         produce an affine combination. */
+      if (!aff.n)
+       return false;
+
+      /* Replace the input expression with an equivalent memory reference
+        legal on the target machine. */
+      mem_ref = create_mem_ref (gsi, TREE_TYPE (*expr), &aff, 
+                                reference_alias_ptr_type (*expr),
+                               NULL_TREE, NULL_TREE, speed);
+
+      /* Ensure the memory reference was not built with a base expression
+        of zero, which confuses the may-aliasing in subsequent dead
+        store elimination.  Assume this isn't a useful replacement when
+        this occurs. */
+      if (integer_zerop (TREE_OPERAND (mem_ref, 0)))
+       return false;
+
+      /* Finish the replacement. */
+      copy_ref_info (mem_ref, *expr);
+      *expr = mem_ref;
+
+      return true;
+    }
+  return false;
+}
+
+/* Expose target addressing modes in STMT, which must be an assignment. */
+
+static void
+tree_ssa_addr_mode_stmt (gimple stmt, gimple_stmt_iterator *gsi, bool speed)
+{
+  tree *lhs, *rhs1, *rhs2, *rhs3;
+  enum tree_code subcode = gimple_assign_rhs_code (stmt);
+  enum gimple_rhs_class rhs_class = get_gimple_rhs_class (subcode);
+  bool changed;
+
+  /* Bitfields are being ignored for now, as they are in induction
+     variable optimization. */
+  if (subcode == BIT_FIELD_REF)
+    return;
+
+  lhs = gimple_assign_lhs_ptr (stmt);
+  changed = tree_ssa_addr_mode_tree (lhs, gsi, speed);
+
+  rhs1 = gimple_assign_rhs1_ptr (stmt);
+  changed = tree_ssa_addr_mode_tree (rhs1, gsi, speed) || changed;
+
+  if (rhs_class == GIMPLE_BINARY_RHS || rhs_class == GIMPLE_TERNARY_RHS)
+    {
+      rhs2 = gimple_assign_rhs2_ptr (stmt);
+      changed = tree_ssa_addr_mode_tree (rhs2, gsi, speed) || changed;
+    }
+
+  if (rhs_class == GIMPLE_TERNARY_RHS)
+    {
+      rhs3 = gimple_assign_rhs3_ptr (stmt);
+      changed = tree_ssa_addr_mode_tree (rhs3, gsi, speed) || changed;
+    }
+
+  if (changed)
+    update_stmt (stmt);
+}
+
+/* Process the statements in block BB. */
+
+static void
+tree_ssa_addr_mode_block (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+
+  bool speed = optimize_bb_for_speed_p (bb);
+
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      /* #### Avoiding volatile memory refs like tree-ssa-loop-ivopts.c
+         does, but need to revisit whether that's necessary. */
+
+      if (is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
+       {
+         tree_ssa_addr_mode_stmt (stmt, &gsi, speed);
+       }
+    }
+}
+
+/* Main entry point for exposing target addressing modes not caught
+   by tree-ssa-loop-ivopts.c. */
+
+static unsigned int
+tree_ssa_addr_mode (void)
+{
+  basic_block bb;
+  bool chicken_switch = false;
+
+  /* #### Temporary debug aid; set to true to disable this phase in debugger. */
+  if (chicken_switch)
+    return 0;
+
+  FOR_EACH_BB (bb)
+    tree_ssa_addr_mode_block (bb);
+
+  return 0;
+}
+
+struct gimple_opt_pass pass_tree_addr_mode =
+{
+ {
+  GIMPLE_PASS,
+  "addrmode",                          /* name */
+  NULL,                                        /* gate */
+  tree_ssa_addr_mode,                  /* execute */
+  NULL,                                        /* sub */
+  NULL,                                        /* next */
+  0,                                   /* static_pass_number */
+  TV_TREE_ADDR_MODE,                   /* tv_id */
+  PROP_cfg,                            /* properties_required */
+  0,                                   /* properties_provided */
+  0,                                   /* properties_destroyed */
+  0,                                   /* todo_flags_start */
+  /* #### Probably can remove TODO_rebuild_alias. */
+  TODO_dump_func | TODO_update_ssa |
+    TODO_rebuild_alias                         /* todo_flags_finish */
+ }
+};
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h     (revision 169834)
+++ gcc/tree-flow.h     (working copy)
@@ -808,6 +808,7 @@
                                      addr_space_t);
 unsigned multiply_by_cost (HOST_WIDE_INT, enum machine_mode, bool);
 bool may_be_nonaddressable_p (tree expr);
+bool may_be_unaligned_p (tree ref, tree step);
 
 /* In tree-ssa-threadupdate.c.  */
 extern bool thread_through_all_blocks (bool);
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in     (revision 169834)
+++ gcc/Makefile.in     (working copy)
@@ -1389,6 +1389,7 @@
        tree-sra.o \
        tree-switch-conversion.o \
        tree-ssa-address.o \
+       tree-ssa-addr-mode.o \
        tree-ssa-alias.o \
        tree-ssa-ccp.o \
        tree-ssa-coalesce.o \
@@ -2556,6 +2557,9 @@
    $(TREE_PASS_H) $(FLAGS_H) $(TREE_INLINE_H) $(RECOG_H) insn-config.h \
    $(EXPR_H) gt-tree-ssa-address.h $(GGC_H) tree-affine.h $(TARGET_H) \
    tree-pretty-print.h
+tree-ssa-addr-mode.o : tree-ssa-addr-mode.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TREE_H) $(BASIC_BLOCK_H) $(TREE_DUMP_H) $(TREE_PASS_H) \
+   $(TIMEVAR_H) $(CFGLOOP_H) $(GIMPLE_H) $(TREE_FLOW_H) $(TREE_AFFINE_H)
 tree-ssa-loop-niter.o : tree-ssa-loop-niter.c $(TREE_FLOW_H) $(CONFIG_H) \
    $(SYSTEM_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) $(PARAMS_H) \
    $(TREE_INLINE_H) output.h $(DIAGNOSTIC_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
Index: gcc/passes.c
===================================================================
--- gcc/passes.c        (revision 169834)
+++ gcc/passes.c        (working copy)
@@ -944,6 +944,10 @@
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);
       NEXT_PASS (pass_local_pure_const);
+      /* Note: pass_tree_addr_mode should be as close to pass_expand as
+        possible.  Particularly must be after DCE since DCE has
+         trouble seeing through some of the generated mem_refs. */
+      NEXT_PASS (pass_tree_addr_mode);
     }
   NEXT_PASS (pass_lower_complex_O0);
   NEXT_PASS (pass_cleanup_eh);



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]