This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optional alternative base_expr in finding basis for CAND_REFs


Hi Bill,

Many thanks for the review.

I find your suggestion on using the next_interp field quite enlightening. I prepared a patch which adds changes without modifying the framework. With the patch, the slsr pass now tries to create a second candidate for each memory accessing gimple statement, and chain it to the first one via the next_interp field.

There are two implications in this approach though:

1) For each memory accessing gimple statement, there can be two candidates, and these two candidates can be part of different dependency graphs respectively (based on different base expr). Only one of the dependency graph should be traversed to do replace_refs. Most of the changes in the patch is to handle this implication.

I am aware that you suggest to follow the next-interp chain only when the searching fails for the first interpretation. However, that doesn't work very well, as it can result in worse code-gen. Taking a varied form of the added test slsr-41.c for example:

i1:  a2 [i] [j] = 1;
i2:  a2 [i] [j+1] = 2;
i3:  a2 [i+20] [j] = i;

With the 2nd interpretation created conditionally, the following two dependency chains will be established:

  i1 --> i2  (base expr is an SSA_NAME defined as (a2 + i * 200))
  i1 --> i3  (base expr is a tree expression of (a2 + i * 200))

the result is that three gimples will be lowered to MEM_REFs differently (as the candidates have different base_exprs); the later passes can get confused, generating worse code.

What this patch does is to create two interpretations where possible (if different base exprs exist); the following dependency chains will be produced:

  i1 --> i2  (base expr is an SSA_NAME defined as (a2 + i * 200))
  i1 --> i2 --> i3  (base expr is a tree expression of (a2 + i * 200))

In analyze_candidates_and_replace, a new function preferred_ref_cand is called to analyze a root CAND_REF and replace_refs is only called if this root CAND_REF is found to be part of a larger dependency graph (or longer dependency chain in simple cases). In the example above, the 2nd dependency chain will be picked up to do replace_refs.

2) The 2nd implication is that the alternative candidate may expose the underlying tree expression of a base expr, which can cause more aggressive extraction and folding of immediate offsets. Taking the new test slsr-41 for example, the code-gen difference on x86_64 with the original patch and this patch is (-O2):

-       leal    5(%rsi), %edx
+       leal    5(%rsi), %eax
        movslq  %esi, %rsi
-       salq    $2, %rsi
-       movslq  %edx, %rax
-       leaq    (%rax,%rax,4), %rax
-       leaq    (%rax,%rax,4), %rcx
-       salq    $3, %rcx
-       leaq    (%rdi,%rcx), %rax
-       addq    %rsi, %rax
-       movl    $2, -1980(%rax)
-       movl    %edx, 20(%rax)
-       movl    %edx, 4024(%rax)
-       leaq    -600(%rdi,%rcx), %rax
-       addl    $1, 16(%rsi,%rax)
+       imulq   $204, %rsi, %rsi
+       addq    %rsi, %rdi
+       movl    $2, -980(%rdi)
+       movl    %eax, 1020(%rdi)
+       movl    %eax, 5024(%rdi)
+       addl    $1, 416(%rdi)
        ret

As you can see, the larger offsets are produced as the affine expander is able to look deep into the tree expression. This raises concern that larger immediates can cause worse code-gen when the immediates are out of the supported range on a target. On x86_64 it is not obvious (as it allows larger ranges), on arm cortex-a15 the load with the immediate 5024 will be done by

        movw    r2, #5024
        str     r3, [r0, r2]

which is not optimal. Things can get worse when there are multiple loads/stores with large immediates as each one may require an extra mov immediate instruction. One thing can potentially be done is to reduce the strength of multiple large immediates later on in some RTL pass by doing an initial offsetting first? What do you think? Are you particularly concerned about this issue?

The patch passes the bootstrapping on arm and x86_64; the regtest is still running.

Here is the changelog:

gcc/

        * gimple-ssa-strength-reduction.c: Include tree-affine.h.
        (name_expansions): New static variable.
        (get_alternative_base): New function.
        (restructure_reference): Add new local variables 'alt_base' and
        'delta'; call get_alternative_base and alloc_cand_and_find_basis
        to create an alternative interpretation.
        (num_of_dependents): New function.
        (preferred_ref_cand): Ditto.
        (analyze_candidates_and_replace): Call preferred_ref_cand for
CAND_REF and skip replace_refs if the returned value is differerent.
        (execute_strength_reduction): call free_affine_expand_cache with
        &name_expansions.

gcc/testsuite/

        * gcc.dg/tree-ssa/slsr-41.c: New test.


For your consideration, I've also attached another patch which is an improvement to the original patch. This patch improves the original one by reducing the number of changes to the existing framework, e.g. leaving find_basis_for_base_expr unchanged. While it still slightly modifies the interfaces (find_basis_for_candidate and record_potential_basis), it has advantage over the 1st patch attached here: its impact on the code-gen is much smaller, as it enables more ARRAY_REFs to be lowered without handing over the underlying tree expression to replace_ref. It creates the following dependency chains for the aforementioned example:

  i1 --> i2  (base expr is an SSA_NAME defined as (a2 + i * 200))
  i1 --> i2 --> i3  (base expr is a tree expression of (a2 + i * 200))

While they look the same as what the 1st patch does, only one candidate is generated for each memory accessing gimple statement; some candiates are chained twice, once to a cand_chain with a base_expr of an SSA_NAME and the other to a cand_chain with the underlying tree expression as its base_expr. In other words, it produces two different dependency graphs without creating different interpretations, by utilizing the existing framework of cand_chain and find_basis_for_base_expr.

The patch passes the bootstrapping on arm and x86_64, as well as regtest on x86_64. The following is the changelog entry:

gcc/

        * gimple-ssa-strength-reduction.c: Include tree-affine.h.
        (name_expansions): New static variable.
        (alt_base_map): Ditto.
        (get_alternative_base): New function.
        (find_basis_for_candidate): For CAND_REF, optionally call
        find_basis_for_base_expr with the returned value from
        get_alternative_base.
        (record_potential_basis): Add new parameter 'base' of type 'tree';
        return if base == NULL; use base to set node->base_expr.
(alloc_cand_and_find_basis): Update; call record_potential_basis for
        CAND_REF with the returned value from get_alternative_base.
(execute_strength_reduction): Call pointer_map_create for alt_base_map;
        call free_affine_expand_cache with &name_expansions.

gcc/testsuite/

        * gcc.dg/tree-ssa/slsr-41.c: New test.


Which patch do you like more?

If you have any question on either of the patch, please let me know.

Regards,
Yufeng


On 11/11/13 17:09, Bill Schmidt wrote:
Hi Yufeng,

The idea is a good one but I don't like your implementation of adding an
extra expression parameter to look at on the find_basis_for_candidate
lookup.  This goes against the design of the pass and may not be
sufficiently general (might there be situations where a third possible
basis could exist?).

The overall design is set up to have alternate interpretations of
candidates in the candidate table to handle this sort of ambiguity.  The
goal for your example is create a second candidate (chained to the first
one by way of the next_interp field) so that the candidate table looks
like this:

    8  [2] *_10[j_7(D)] = 2;
       REF  : _10 + ((sizetype) j_7(D) * 4) + 0 : int[20] *
       basis: 0  dependent: 0  sibling: 0
       next-interp: 9  dead-savings: 0

    9  [2] *_10[j_7(D)] = 2;
       REF  : _5 + ((sizetype) j_7(D) * 4) + 800 : int[20] *
       basis: 5  dependent: 0  sibling: 0
       next-interp: 0  dead-savings: 0

This will in turn allow subsequent candidates to be seen in terms of
either _5 or _10, which may be necessary to avoid missed opportunities.
There may be a subsequent REF _15 +... that can be an affine expression
of either of these, for example.

If you fail to find a basis for a candidate with its first
interpretation, you can then follow the next-interp chain to look for a
basis for the next one, without the messy passing of extra possibilities
to the find-basis routine.

I haven't read the patch in detail, but I think this should give you
enough to work with to re-design the idea to fit better with the
existing framework.  Please let me know if you need more information, or
if you feel I've misunderstood something.

Thanks,
Bill

On Mon, 2013-11-04 at 18:41 +0000, Yufeng Zhang wrote:
Hi,

This patch extends the slsr pass to optionally use an alternative base
expression in finding basis for CAND_REFs.  Currently the pass uses
hash-based algorithm to match the base_expr in a candidate.  Given a
test case like the following, slsr will not be able to recognize the two
CAND_REFs have the same basis, as their base_expr are of different
SSA_NAMEs:

typedef int arr_2[20][20];

void foo (arr_2 a2, int i, int j)
{
    a2[i][j] = 1;
    a2[i + 10][j] = 2;
}

The gimple dump before slsr is like the following (using an
arm-none-eabi gcc):

    i.0_2 = (unsigned int) i_1(D);
    _3 = i.0_2 * 80;
    _5 = a2_4(D) + _3;
    *_5[j_7(D)] = 1;<----
    _9 = _3 + 800;
    _10 = a2_4(D) + _9;
    *_10[j_7(D)] = 2;<----

Here are the dumps for the two CAND_REFs generated for the two
statements pointed by the arrows:


    4  [2] _5 = a2_4(D) + _3;
       ADD  : a2_4(D) + (80 * i_1(D)) : int[20] *
       basis: 0  dependent: 0  sibling: 0
       next-interp: 0  dead-savings: 0

    8  [2] *_10[j_7(D)] = 2;
       REF  : _10 + ((sizetype) j_7(D) * 4) + 0 : int[20] *
       basis: 5  dependent: 0  sibling: 0
       next-interp: 0  dead-savings: 0

As mentioned previously, slsr cannot establish that candidate 4 is the
basis for the candidate 8, as they have different base_exprs: a2_4(D)
and _10, respectively.  However, the two references actually only differ
by an immediate offset (800).

This patch uses the tree affine combination facilities to create an
optional alternative base expression to be used in finding (as well as
recording) the basis.  It calls tree_to_aff_combination_expand on
base_expr, reset the offset field of the generated aff_tree to 0 and
generate a tree from it by calling aff_combination_to_tree.

The new tree is recorded as a potential basis, and when
find_basis_for_candidate fails to find a basis for a CAND_REF in its
normal approach, it searches again using a tree expanded in such way.
Such an expanded tree usually discloses the expression behind an
SSA_NAME.  In the example above, instead of seeing the strength
reduction candidate chains like this:

    _5 ->  5
    _10 ->  8

we are now having:

    _5 ->  5
    _10 ->  8
    a2_4(D) + (sizetype) i_1(D) * 80 ->  5 ->  8

With the candidates 5 and 8 linked to the same tree expression (a2_4(D)
+ (sizetype) i_1(D) * 80), slsr is now able to establish that 5 is the
basis of 8.

The patch doesn't attempt to change the content of any CAND_REF though.
   It only enables CAND_REFs which (1) have the same stride and (2) have
the underlying expressions of their base_expr only differ in immediate
offsets,  to be recognized to have the same basis.  The statements with
such CAND_REFs will be lowered to MEM_REFs, and later on the RTL
expander shall be able to fold and re-associate the immediate offsets to
the rightmost side of the addressing expression, and therefore exposes
the common sub-expression successfully.

The code-gen difference of the example code on arm with -O2
-mcpu=cortex-15 is:

          mov     r3, r1, asl #6
-       add     ip, r0, r2, asl #2
          str     lr, [sp, #-4]!
+       mov     ip, #1
+       mov     lr, #2
          add     r1, r3, r1, asl #4
-       mov     lr, #1
-       mov     r3, #2
          add     r0, r0, r1
-       add     r0, r0, #800
-       str     lr, [ip, r1]
-       str     r3, [r0, r2, asl #2]
+       add     r3, r0, r2, asl #2
+       str     ip, [r0, r2, asl #2]
+       str     lr, [r3, #800]
          ldr     pc, [sp], #4

One fewer instruction in this simple case.

The example used in illustration is too simple to show code-gen
difference on x86_64, but the included test case will show the benefit
of the patch quite obviously.

The patch has passed

* bootstrapping on arm and x86_64
* regtest on arm-none-eabi,  aarch64-none-elf and x86_64

There is no regression in SPEC2K on arm or x86_64.

OK to commit to the trunk?

Any comment is welcomed!

Thanks,
Yufeng


gcc/

          * gimple-ssa-strength-reduction.c: Include tree-affine.h.
          (find_basis_for_base_expr): Update comment.
          (find_basis_for_candidate): Add new parameter 'alt_base_expr' of
          type 'tree'.  Optionally call find_basis_for_base_expr with
          'alt_base_expr'.
          (record_potential_basis): Add new parameter 'alt_base_expr' of
          type 'tree'; set node->base_expr with 'alt_base_expr' if it is
          not NULL.
          (name_expansions): New static variable.
          (get_alternative_base): New function.
          (alloc_cand_and_find_basis): Call get_alternative_base for
CAND_REF.
          Update calls to find_basis_for_candidate and
record_potential_basis.
          (execute_strength_reduction): Call free_affine_expand_cache with
          &name_expansions.

gcc/testsuite/

          * gcc.dg/tree-ssa/slsr-41.c: New test.


diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c
index 88afc91..30e3763 100644
--- a/gcc/gimple-ssa-strength-reduction.c
+++ b/gcc/gimple-ssa-strength-reduction.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "hash-table.h"
 #include "tree-ssa-address.h"
+#include "tree-affine.h"
 
 /* Information about a strength reduction candidate.  Each statement
    in the candidate table represents an expression of one of the
@@ -420,6 +421,31 @@ cand_chain_hasher::equal (const value_type *chain1, const compare_type *chain2)
 /* Hash table embodying a mapping from base exprs to chains of candidates.  */
 static hash_table <cand_chain_hasher> base_cand_map;
 
+/* Pointer map used by tree_to_aff_combination_expand.  */
+static struct pointer_map_t *name_expansions;
+
+/* Given BASE, use the tree affine combiniation facilities to
+   find the underlying tree expression for BASE, with any
+   immediate offset excluded.  */
+
+static tree
+get_alternative_base (tree base, double_int *offset)
+{
+  tree expr;
+  aff_tree aff;
+
+  tree_to_aff_combination_expand (base, TREE_TYPE (base),
+				  &aff, &name_expansions);
+  *offset = aff.offset;
+  aff.offset = tree_to_double_int (integer_zero_node);
+  expr = aff_combination_to_tree (&aff);
+
+  if (expr == base)
+    return NULL;
+  else
+    return expr;
+}
+
 /* Look in the candidate table for a CAND_PHI that defines BASE and
    return it if found; otherwise return NULL.  */
 
@@ -912,11 +938,11 @@ restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 static void
 slsr_process_ref (gimple gs)
 {
-  tree ref_expr, base, offset, type;
+  tree ref_expr, base, offset, type, alt_base;
   HOST_WIDE_INT bitsize, bitpos;
   enum machine_mode mode;
   int unsignedp, volatilep;
-  double_int index;
+  double_int index, delta;
   slsr_cand_t c;
 
   if (gimple_vdef (gs))
@@ -942,6 +968,16 @@ slsr_process_ref (gimple gs)
 
   /* Add the candidate to the statement-candidate mapping.  */
   add_cand_for_stmt (gs, c);
+
+  /* Add alternate interpretation.  */
+  if ((alt_base = get_alternative_base (base, &delta)))
+    {
+      slsr_cand_t c2 =
+	alloc_cand_and_find_basis (CAND_REF, gs, alt_base, index + delta,
+				   offset, type, 0);
+
+      c->next_interp = c2->cand_num;
+    }
 }
 
 /* Create a candidate entry for a statement GS, where GS multiplies
@@ -1802,6 +1838,80 @@ dump_incr_vec (void)
     }
 }
 
+/* Helper routine for preferred_ref_cand.  Given C which is a CAND_REF,
+   recursively count and return the number of dependents, including
+   itself.  */
+
+static int
+num_of_dependents (slsr_cand_t c)
+{
+  int n = 1;
+
+  if (c->sibling)
+    n += num_of_dependents (lookup_cand (c->sibling));
+
+  if (c->dependent)
+    n += num_of_dependents (lookup_cand (c->dependent));
+
+  return n;
+}
+
+/* Some of the memory accessing gimple statements have two CAND_REF
+   candidates as a result of an optional backtracing into the base
+   expr.  The routine checks and compares the two candidates, if both
+   exist; the candidate with a more dominating basis or the one
+   whose dependency graph has more nodes is returned.  In the case of
+   a draw, the candidate with the original base expr (primary) is
+   preferred to the backtraced one (secondary).  C is the CAND_REF
+   to be checked.
+
+   The whole idea is to avoid these gimple statements to be
+   replace_ref 'ed twice, and in a random order.  */
+
+static slsr_cand_t
+preferred_ref_cand (slsr_cand_t c)
+{
+  slsr_cand_t primary, secondary, theother;
+  slsr_cand_t *result
+    = (slsr_cand_t *) pointer_map_contains (stmt_cand_map,
+					    c->cand_stmt);
+  gcc_assert (result);
+
+  primary = *result;
+  if (primary->next_interp)
+    secondary = lookup_cand (primary->next_interp);
+  else
+    secondary = NULL;
+
+  gcc_assert (c == primary || c == secondary);
+  theother = c == primary ? secondary : primary;
+
+  if (theother)
+    {
+      /* An earlier basis exists!  The replacement may have
+	 already happened.  */
+      if (theother->basis != 0)
+	return theother;
+
+      if (theother->dependent != 0)
+	{
+	  int num_c = num_of_dependents (c);
+	  int num_t = num_of_dependents (theother);
+
+	  /* Fewer dependents, lower priority.  */
+	  if (num_c < num_t)
+	    return theother;
+
+	  /* When the numbers are the same, the primary candiate
+	     is preferred.  */
+	  if (num_c == num_t && theother == primary)
+	    return theother;
+	}
+    }
+
+  return c;
+}
+
 /* Replace *EXPR in candidate C with an equivalent strength-reduced
    data reference.  */
 
@@ -3453,7 +3563,20 @@ analyze_candidates_and_replace (void)
       /* If this is a chain of CAND_REFs, unconditionally replace
 	 each of them with a strength-reduced data reference.  */
       if (c->kind == CAND_REF)
-	replace_refs (c);
+	{
+	  slsr_cand_t t = preferred_ref_cand (c);
+
+	  if (t != c)
+	    {
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file, "\tProcessing skipped: "
+			 "higher-priority dependency tree is detected, "
+			 "where %d is chained.\n", t->cand_num);
+	      continue;
+	    }
+
+	  replace_refs (c);
+	}
 
       /* If the common stride of all related candidates is a known
 	 constant, each candidate without a phi-dependence can be
@@ -3539,6 +3662,8 @@ execute_strength_reduction (void)
       dump_cand_chains ();
     }
 
+  free_affine_expand_cache (&name_expansions);
+
   /* Analyze costs and make appropriate replacements.  */
   analyze_candidates_and_replace ();
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
new file mode 100644
index 0000000..870d714
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
@@ -0,0 +1,24 @@
+/* Verify straight-line strength reduction in using
+   alternative base expr to record and look for the
+   potential candidate.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i-10] [j] = 2;
+  a2 [i] [j++] = i;
+  a2 [i+20] [j++] = i;
+  a2 [i-3] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c
index 88afc91..d069246 100644
--- a/gcc/gimple-ssa-strength-reduction.c
+++ b/gcc/gimple-ssa-strength-reduction.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "hash-table.h"
 #include "tree-ssa-address.h"
+#include "tree-affine.h"
 
 /* Information about a strength reduction candidate.  Each statement
    in the candidate table represents an expression of one of the
@@ -420,6 +421,42 @@ cand_chain_hasher::equal (const value_type *chain1, const compare_type *chain2)
 /* Hash table embodying a mapping from base exprs to chains of candidates.  */
 static hash_table <cand_chain_hasher> base_cand_map;
 
+/* Pointer map used by tree_to_aff_combination_expand.  */
+static struct pointer_map_t *name_expansions;
+/* Pointer map embodying a mapping from bases to alternative bases.  */
+static struct pointer_map_t *alt_base_map;
+
+/* Given BASE, use the tree affine combiniation facilities to
+   find the underlying tree expression for BASE, with any
+   immediate offset excluded.  */
+
+static tree
+get_alternative_base (tree base)
+{
+  tree *result = (tree *) pointer_map_contains (alt_base_map, base);
+
+  if (result == NULL)
+    {
+      tree expr;
+      aff_tree aff;
+
+      tree_to_aff_combination_expand (base, TREE_TYPE (base),
+				      &aff, &name_expansions);
+      aff.offset = tree_to_double_int (integer_zero_node);
+      expr = aff_combination_to_tree (&aff);
+
+      result = (tree *) pointer_map_insert (alt_base_map, base);
+      gcc_assert (!*result);
+
+      if (expr == base)
+	*result = NULL;
+      else
+	*result = expr;
+    }
+
+  return *result;
+}
+
 /* Look in the candidate table for a CAND_PHI that defines BASE and
    return it if found; otherwise return NULL.  */
 
@@ -439,9 +476,10 @@ find_phi_def (tree base)
   return c->cand_num;
 }
 
-/* Helper routine for find_basis_for_candidate.  May be called twice:
+/* Helper routine for find_basis_for_candidate.  May be called three times:
    once for the candidate's base expr, and optionally again for the
-   candidate's phi definition.  */
+   candidate's phi definition, as well as for an alternative base expr
+   in the case of CAND_REF.  */
 
 static slsr_cand_t
 find_basis_for_base_expr (slsr_cand_t c, tree base_expr)
@@ -518,6 +556,13 @@ find_basis_for_candidate (slsr_cand_t c)
 	}
     }
 
+  if (!basis && c->kind == CAND_REF)
+    {
+      tree alt_base_expr = get_alternative_base (c->base_expr);
+      if (alt_base_expr)
+	basis = find_basis_for_base_expr (c, alt_base_expr);
+    }
+
   if (basis)
     {
       c->sibling = basis->dependent;
@@ -528,17 +573,22 @@ find_basis_for_candidate (slsr_cand_t c)
   return 0;
 }
 
-/* Record a mapping from the base expression of C to C itself, indicating that
-   C may potentially serve as a basis using that base expression.  */
+/* Record a mapping from BASE to C, indicating that C may potentially serve
+   as a basis using that base expression.  BASE may be the same as
+   C->BASE_EXPR; alternatively BASE can be a different tree that share the
+   underlining expression of C->BASE_EXPR.  */
 
 static void
-record_potential_basis (slsr_cand_t c)
+record_potential_basis (slsr_cand_t c, tree base)
 {
   cand_chain_t node;
   cand_chain **slot;
 
+  if (base == NULL)
+    return;
+
   node = (cand_chain_t) obstack_alloc (&chain_obstack, sizeof (cand_chain));
-  node->base_expr = c->base_expr;
+  node->base_expr = base;
   node->cand = c;
   node->next = NULL;
   slot = base_cand_map.find_slot (node, INSERT);
@@ -554,10 +604,18 @@ record_potential_basis (slsr_cand_t c)
 }
 
 /* Allocate storage for a new candidate and initialize its fields.
-   Attempt to find a basis for the candidate.  */
+   Attempt to find a basis for the candidate.
+
+   For CAND_REF, an alternative base may also be recorded and used
+   to find a basis.  This helps cases where the expression hidden
+   behind BASE (which is usually an SSA_NAME) has immediate offset,
+   e.g.
+
+     a2[i][j] = 1;
+     a2[i + 20][j] = 2;  */
 
 static slsr_cand_t
-alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base, 
+alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base,
 			   double_int index, tree stride, tree ctype,
 			   unsigned savings)
 {
@@ -583,7 +641,9 @@ alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base,
   else
     c->basis = find_basis_for_candidate (c);
 
-  record_potential_basis (c);
+  record_potential_basis (c, base);
+  if (kind == CAND_REF)
+    record_potential_basis (c, get_alternative_base (base));
 
   return c;
 }
@@ -3524,6 +3584,9 @@ execute_strength_reduction (void)
   /* Allocate the mapping from base expressions to candidate chains.  */
   base_cand_map.create (500);
 
+  /* Allocate the mapping from bases to alternative bases.  */
+  alt_base_map = pointer_map_create ();
+
   /* Initialize the loop optimizer.  We need to detect flow across
      back edges, and this gives us dominator information as well.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
@@ -3539,6 +3602,9 @@ execute_strength_reduction (void)
       dump_cand_chains ();
     }
 
+  pointer_map_destroy (alt_base_map);
+  free_affine_expand_cache (&name_expansions);
+
   /* Analyze costs and make appropriate replacements.  */
   analyze_candidates_and_replace ();
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
new file mode 100644
index 0000000..870d714
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
@@ -0,0 +1,24 @@
+/* Verify straight-line strength reduction in using
+   alternative base expr to record and look for the
+   potential candidate.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i-10] [j] = 2;
+  a2 [i] [j++] = i;
+  a2 [i+20] [j++] = i;
+  a2 [i-3] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]