This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[lno] Update the scalar evolutions algorithm


Hi,

This patch updates the scalar evolution algorithm.  The analysis is
about 3000 lines down from about 4500.  

I've also included the pass of checks elimination on which I'm
working, even if the compiler does not bootstraps with it enabled,
because I'm using it for testing the behaviour of the analyzer: see
testsuite/.../ssa-chrec-04.c and others testcases.  In order to
correct this optimization pass I have to modify the way the analyzer
deals with the types of the scalars it analyzes.

I'm attaching the documentation for both these passes.  A higher level
presentation of the extraction algorithm is available as a set of
slides:
  http://cri.ensmp.fr/~pop/gcc/feb04/slides.pdf

/* 
   Description: 
   
     This pass analyzes the evolution of scalar variables in loop
     structures.  The algorithm is based on the SSA representation,
     and on the loop hierarchy tree.  This algorithm is not based on
     the notion of versions of a variable, as it was the case for the
     previous implementations of the scalar evolution algorithm, but
     it assumes that each defined name is unique.
     
     A short sketch of the algorithm is:
     
     Given a scalar variable to be analyzed, follow the SSA edge to
     its definition:
     
     - When the definition is a MODIFY_EXPR: if the right hand side
     (RHS) of the definition cannot be statically analyzed, the answer
     of the analyzer is: "don't know", that corresponds to the
     conservative [-oo, +oo] element of the lattice of intervals.
     Otherwise, for all the variables that are not yet analyzed in the
     RHS, try to determine their evolution, and finally try to
     evaluate the operation of the RHS that gives the evolution
     function of the analyzed variable.

     - When the definition is a condition-phi-node: determine the
     evolution function for all the branches of the phi node, and
     finally merge these evolutions (see chrec_merge).

     - When the definition is a loop-phi-node: determine its initial
     condition, that is the SSA edge defined in an outer loop, and
     keep it symbolic.  Then determine the SSA edges that are defined
     in the body of the loop.  Follow the inner edges until ending on
     another loop-phi-node of the same analyzed loop.  If the reached
     loop-phi-node is not the starting loop-phi-node, then we keep
     this definition under a symbolic form.  If the reached
     loop-phi-node is the same as the starting one, then we compute a
     symbolic stride on the return path.  The result is then the
     symbolic chrec {initial_condition, +, symbolic_stride}_loop.

   Examples:
   
     Example 1: Illustration of the basic algorithm.
   
     | a = 3
     | loop_1
     |   b = phi (a, c)
     |   c = b + 1
     |   if (c > 10) exit_loop
     | endloop
   
     Suppose that we want to know the number of iterations of the
     loop_1.  The exit_loop is controlled by a COND_EXPR (c > 10).  We
     ask the scalar evolution analyzer two questions: what's the
     scalar evolution (scev) of "c", and what's the scev of "10".  For
     "10" the answer is "10" since it is a scalar constant.  For the
     scalar variable "c", it follows the SSA edge to its definition,
     "c = b + 1", and then asks again what's the scev of "b".
     Following the SSA edge, we end on a loop-phi-node "b = phi (a,
     c)", where the initial condition is "a", and the inner loop edge
     is "c".  The initial condition is kept under a symbolic form (it
     may be the case that the copy constant propagation has done its
     work and we end with the constant "3" as one of the edges of the
     loop-phi-node).  The update edge is followed to the end of the
     loop, and until reaching again the starting loop-phi-node: b -> c
     -> b.  At this point we have drawn a path from "b" to "b" from
     which we compute the stride in the loop: in this example it is
     "+1".  The resulting scev for "b" is "b -> {a, +, 1}_1".  Now
     that the scev for "b" is known, it is possible to compute the
     scev for "c", that is "c -> {a + 1, +, 1}_1".  In order to
     determine the number of iterations in the loop_1, we have to
     instantiate_parameters ({a + 1, +, 1}_1), that gives after some
     more analysis the scev {4, +, 1}_1, or in other words, this is
     the function "f (x) = x + 4", where x is the iteration count of
     the loop_1.  Now we have to solve the inequality "x + 4 > 10",
     and take the smallest iteration number for which the loop is
     exited: x = 7.  This loop runs from x = 0 to x = 7, and in total
     there are 8 iterations.  In terms of loop normalization, we have
     created a variable that is implicitly defined, "x" or just "_1",
     and all the other analyzed scalars of the loop are defined in
     function of this variable:
   
     a -> 3
     b -> {3, +, 1}_1
     c -> {4, +, 1}_1
     
     or in terms of a C program: 
     
     | a = 3
     | for (x = 0; x <= 7; x++)
     |   {
     |     b = x + 3
     |     c = x + 4
     |   }
     
     Example 2: Illustration of the algorithm on nested loops.
     
     | loop_1
     |   a = phi (1, b)
     |   c = a + 2
     |   loop_2  10 times
     |     b = phi (c, d)
     |     d = b + 3
     |   endloop
     | endloop
     
     For analyzing the scalar evolution of "a", the algorithm follows
     the SSA edge into the loop's body: "a -> b".  "b" is an inner
     loop-phi-node, and its analysis as in Example 1, gives: 
     
     b -> {c, +, 3}_2
     d -> {c + 3, +, 3}_2
     
     Following the SSA edge for the initial condition, we end on "c = a
     + 2", and then on the starting loop-phi-node "a".  From this point,
     the loop stride is computed: back on "c = a + 2" we get a "+2" in
     the loop_1, then on the loop-phi-node "b" we compute the overall
     effect of the inner loop that is "b = c + 30", and we get a "+30"
     in the loop_1.  That means that the overall stride in loop_1 is
     equal to "+32", and the result is: 
     
     a -> {1, +, 32}_1
     c -> {3, +, 32}_1
     
     Example 3: Higher degree polynomials.
     
     | loop_1
     |   a = phi (2, b)
     |   c = phi (5, d)
     |   b = a + 1
     |   d = c + a
     | endloop
     
     a -> {2, +, 1}_1
     b -> {3, +, 1}_1
     c -> {5, +, a}_1
     d -> {5 + a, +, a}_1
     
     instantiate_parameters ({5, +, a}_1) -> {5, +, 2, +, 1}_1
     instantiate_parameters ({5 + a, +, a}_1) -> {7, +, 3, +, 1}_1
     
     Example 4: Lucas, Fibonacci, or mixers in general.
     
     | loop_1
     |   a = phi (1, b)
     |   c = phi (3, d)
     |   b = c
     |   d = c + a
     | endloop
     
     a -> (1, c)_1
     c -> {3, +, a}_1
     
     The syntax "(1, c)_1" stands for a PEELED_CHREC that has the
     following semantics: during the first iteration of the loop_1, the
     variable contains the value 1, and then it contains the value "c".
     Note that this syntax is close to the syntax of the loop-phi-node:
     "a -> (1, c)_1" vs. "a = phi (1, c)".
     
     The symbolic chrec representation contains all the semantics of the
     original code.  What is more difficult is to use this information.
     
     Example 5: Flip-flops, or exchangers.
     
     | loop_1
     |   a = phi (1, b)
     |   c = phi (3, d)
     |   b = c
     |   d = a
     | endloop
     
     a -> (1, c)_1
     c -> (3, a)_1
     
     Based on these symbolic chrecs, it is possible to refine this
     information into the more precise PERIODIC_CHRECs: 
     
     a -> |1, 3|_1
     c -> |3, 1|_1
     
     This transformation is not yet implemented.
     
   Further readings:
   
     You can find a more detailed description of the algorithm in:
     http://icps.u-strasbg.fr/~pop/DEA_03_Pop.pdf
     http://icps.u-strasbg.fr/~pop/DEA_03_Pop.ps.gz.  But note that
     this is a preliminary report and some of the details of the
     algorithm have changed.  I'm working on a research report that
     updates the description of the algorithms to reflect the design
     choices used in this implementation.
     
   Fixmes:
   
     FIXME taylor: This FIXME concerns all the cases where we have to
     deal with additions of exponential functions: "exp + exp" or
     "poly + exp" or "cst + exp".  This could be handled by a Taylor
     decomposition of the exponential function, but this is still
     under construction (not implemented yet, or chrec_top).
     
     The idea is to represent the exponential evolution functions
     using infinite degree polynomials:
     
     | a -> {1, *, 2}_1 = {1, +, 1, +, 1, +, ...}_1 = {1, +, a}_1
     
     Proof:
     \begin{eqnarray*}
     \{1, *, t+1\} (x) &=& exp \left(log (1) + log (t+1) \binom{x}{1} \right) \\
                       &=& (t+1)^x \\
                       &=& \binom{x}{0} + \binom{x}{1}t + \binom{x}{2}t^2 + 
                           \ldots + \binom{x}{x}t^x \\
                       &=& \{1, +, t, +, t^2, +, \ldots, +, t^x\} \\
     \end{eqnarray*}
     
     While this equality is simple to prove for exponentials of degree
     1, it is still work in progress for higher degree exponentials.
*/

/* 
   Description:
   
     Compute the scalar evolutions for all the scalar variables of a
     condition expression, and based on this information performs a
     proof.  The condition is rewritten based on the result of this
     proof.

   Examples:
   
     Example 1: A simple illustration of the algorithm.
     
     Given the COND_EXPR "if (a < b)" with "a -> {2, +, 1}_1" and "b
     -> {3, +, 1}_1", the proof consists in comparing these evolution
     functions: is it always true for a given iteration x that "{2, +,
     1}_1 (x) < {3, +, 1}_1 (x)"?  The answer is yes, and the test of
     the condition is consequently replaced by "1".  

   Further readings:
   
     There is no further readings for the moment.  

     Based on the fact that this algorithm is similar to the Value
     Range Propagation you can have a look at the corresponding
     papers.  
*/


Changelog:

        * Makefile.in (OBJS-common): Add tree-elim-check.o.
        (tree-chrec.o): Add dependence on tree-pass.h.
        (tree-elim-check.o): New rule.
	* tree-elim-check.c: New file.
        * basic-block.h (edge_source, edge_destination): New inlined
        functions.
        * cfgloop.h (loop_nb_iterations): Added a comment on the use
        of this accessor.
        * common.opt (ftree-elim-checks): New flag.
        * flags.h (flag_tree_elim_checks): Declared here.
        * opts.c (decode_options): Set flag_tree_elim_checks to zero.
        (common_handle_option): Add case OPT_ftree_elim_checks.
        * timevar.def (TV_TREE_ELIM_CHECKS): Defined.
        * toplev.c (flag_tree_elim_checks): Defined.
        * tree-cfg.c (print_pred_bbs, print_succ_bbs, print_loop):
        Modify the dumping style.  Print nb_iterations.
        * tree-chrec.c, tree-chrec.h, tree-scalar-evolution.c,
        tree-scalar-evolution.h, tree-data-ref.c: New version.
        * tree-fold-const.c (tree_fold_bezout): Define.
        * tree-fold-const.h (tree_fold_int_round_div,
        tree_fold_int_trunc_mod, tree_fold_int_ceil_mod,
        tree_fold_int_floor_mod, tree_fold_int_round_mod): Removed, because
        not used for the moment.
        (chrec_merge_types): New function.
        * tree-optimize.c (pass_scev_elim_checks): Register the pass.
        * tree-pass.h (pass_scev_elim_checks): Declare the pass.
        * tree-pretty-print.c (dump_generic_node): Print
        PEELED_CHREC.  Remove PERIODIC_CHREC.
        * tree-vectorizer.c (): Modify the use of
        analyze_scalar_evolution.
        * tree.def (POLYNOMIAL_CHREC, EXPONENTIAL_CHREC): Store the
        evolution loop in a third leaf instead of in TREE_TYPE.
        TREE_TYPE is then used in storing the type of the chrec.
        (PERIODIC_CHREC): Removed since it is not used for the moment.
        (PEELED_CHREC): New node.
        * doc/invoke.texi (fdump-tree-scev, fdump-tree-ddall): Correct
        the name of these flags.
        (ftree-elim-checks, fdump-tree-elck): Document.
/* Elimination of redundant checks.
   Copyright (C) 2004 Free Software Foundation, Inc.
   Contributed by Sebastian Pop <sebastian.pop@cri.ensmp.fr>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING.  If not, write to the Free
Software Foundation, 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.  */

/* 
   Description:
   
     Compute the scalar evolutions for all the scalar variables of a
     condition expression, and based on this information performs a
     proof.  The condition is rewritten based on the result of this
     proof.

   Examples:
   
     Example 1: A simple illustration of the algorithm.
     
     Given the COND_EXPR "if (a < b)" with "a -> {2, +, 1}_1" and "b
     -> {3, +, 1}_1", the proof consists in comparing these evolution
     functions: is it always true for a given iteration x that "{2, +,
     1}_1 (x) < {3, +, 1}_1 (x)"?  The answer is yes, and the test of
     the condition is consequently replaced by "1".  

   Further readings:
   
     There is no further readings for the moment.  

     Based on the fact that this algorithm is similar to the Value
     Range Propagation you can have a look at the corresponding
     papers.  
*/

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "errors.h"
#include "ggc.h"
#include "tree.h"

/* These RTL headers are needed for basic-block.h.  */
#include "rtl.h"
#include "basic-block.h"
#include "diagnostic.h"
#include "tree-flow.h"
#include "tree-dump.h"
#include "timevar.h"
#include "cfgloop.h"
#include "tree-fold-const.h"
#include "tree-chrec.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "tree-pass.h"
#include "flags.h"

static void remove_redundant_check (tree, bool);
static void try_eliminate_check (tree);
static void scan_all_loops_r (struct loop *);

/* Remove the check by setting the condition COND to VALUE.  */

static void 
remove_redundant_check (tree cond, bool value)
{
  /* A dead COND_EXPR means the condition is dead. We don't change any
     flow, just replace the expression with a constant.  */
  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    fprintf (tree_dump_file, "Replacing one of the conditions.\n");

  if (value == true)
    COND_EXPR_COND (cond) = integer_one_node;
  
  else
    COND_EXPR_COND (cond) = integer_zero_node;
  
  modify_stmt (cond);
}

/* If the condition TEST is decidable at compile time, then eliminate
   the check.  */

static void
try_eliminate_check (tree cond)
{
  bool value;
  tree test, opnd0, opnd1;
  tree chrec0, chrec1;
  unsigned loop_nb = loop_num (loop_of_stmt (cond));

  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    {
      fprintf (tree_dump_file, "(try_eliminate_check \n");
      fprintf (tree_dump_file, "  (cond = ");
      print_generic_expr (tree_dump_file, cond, 0);
      fprintf (tree_dump_file, ")\n");
    }
  
  test = COND_EXPR_COND (cond);
  switch (TREE_CODE (test))
    {
    case SSA_NAME:
      /* Matched "if (opnd0)" ie, "if (opnd0 != 0)".  */
      opnd0 = test;
      chrec0 = analyze_scalar_evolution (loop_nb, opnd0);
      if (chrec_contains_undetermined (chrec0))
	break;
      chrec0 = instantiate_parameters (loop_nb, chrec0);
      
      if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
	{
	  fprintf (tree_dump_file, "  (test = ");
	  print_generic_expr (tree_dump_file, test, 0);
	  fprintf (tree_dump_file, ")\n  (loop_nb = %d)\n  (chrec0 = ", loop_nb);
	  print_generic_expr (tree_dump_file, chrec0, 0);
	  fprintf (tree_dump_file, ")\n");
	}
      
      if (prove_truth_value_ne (chrec0, integer_zero_node, &value))
	remove_redundant_check (cond, value);
      break;

    case LT_EXPR:
    case LE_EXPR:
    case GT_EXPR:
    case GE_EXPR:
    case EQ_EXPR:
    case NE_EXPR:
      opnd0 = TREE_OPERAND (test, 0);
      opnd1 = TREE_OPERAND (test, 1);
      chrec0 = analyze_scalar_evolution (loop_nb, opnd0);
      if (chrec_contains_undetermined (chrec0))
	break;
      
      chrec1 = analyze_scalar_evolution (loop_nb, opnd1);
      if (chrec_contains_undetermined (chrec1))
	break;
      
      chrec0 = instantiate_parameters (loop_nb, chrec0);
      chrec1 = instantiate_parameters (loop_nb, chrec1);
      
      if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
	{
	  fprintf (tree_dump_file, "  (test = ");
	  print_generic_expr (tree_dump_file, test, 0);
	  fprintf (tree_dump_file, ")\n  (loop_nb = %d)\n  (chrec0 = ", loop_nb);
	  print_generic_expr (tree_dump_file, chrec0, 0);
	  fprintf (tree_dump_file, ")\n  (chrec1 = ");
	  print_generic_expr (tree_dump_file, chrec1, 0);
	  fprintf (tree_dump_file, ")\n");
	}
      
      switch (TREE_CODE (test))
	{
	case LT_EXPR:
	  if (prove_truth_value_lt (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	  
	case LE_EXPR:
	  if (prove_truth_value_le (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	  
	case GT_EXPR:
	  if (prove_truth_value_gt (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	  
	case GE_EXPR:
	  if (prove_truth_value_ge (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	    
	case EQ_EXPR:
	  if (prove_truth_value_eq (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	  
	case NE_EXPR:
	  if (prove_truth_value_ne (chrec0, chrec1, &value))
	    remove_redundant_check (cond, value);
	  break;
	  
	default:
	  break;
	}
      break;
      
    default:
      break;
    }
  
  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    fprintf (tree_dump_file, ")\n");
}

/* Compute the exit edges for all the loops.  */

static void 
scan_all_loops_r (struct loop *loop)
{
  if (!loop)
    return;
  
  /* Recurse on the inner loops, then on the next (sibling) loops.  */
  scan_all_loops_r (inner_loop (loop));
  scan_all_loops_r (next_loop (loop));
  
  flow_loop_scan (loop, LOOP_EXIT_EDGES);
}

/* Walk over all the statements, searching for conditional statements.
   
   A better way to determine the conditional expressions that are good
   candidates for elimination would be needed.  For the moment
   systematically search the conditional expressions over the whole
   function.  */

void 
eliminate_redundant_checks (void)
{
  basic_block bb;
  block_stmt_iterator bsi;
  
  bb = BASIC_BLOCK (0);
  if (bb && bb->loop_father)
    {
      scan_all_loops_r (bb->loop_father);
      
      FOR_EACH_BB (bb)
	{
	  struct loop *loop = bb->loop_father;
	  
	  /* Don't try to prove anything about the loop exit
	     conditions: avoid the block that contains the condition
	     that guards the exit of the loop.  */
	  if (!loop_exit_edges (loop)
	      || edge_source (loop_exit_edge (loop, 0)) == bb)
	    continue;
	  
	  for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
	    {
	      tree expr = bsi_stmt (bsi);
	      
	      switch (TREE_CODE (expr))
		{
		case COND_EXPR:
		  try_eliminate_check (expr);
		  break;
		  
		default:
		  break;
		}
	    }
	}
    }
}

Attachment: testsuite.tar.gz
Description: Binary data

Attachment: scev3.diff.gz
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]