[PATCH 3/5] IPA ICF pass

Trevor Saunders tsaunders@mozilla.com
Fri Jun 20 07:32:00 GMT 2014


On Fri, Jun 13, 2014 at 12:44:22PM +0200, mliska wrote:
> Hello,
>    this is core of IPA ICF patchset. It adds new pass and registers all needed stuff related to a newly introduced interprocedural optimization.
> 
> Algorithm description:
>   In LGEN, we visit all read-only variables and functions. For each symbol, a hash value based on e.g. number of arguments,
>   number of BB, GIMPLE CODES is computed (similar hash is computed for read-only variables). This kind of information is streamed
>   for LTO.
> 
>   In WPA, we build congruence classes for all symbols having a same hash value. For functions, these classes are subdivided in WPA by argument type comparison. Each reference (a call or a variable reference) to another semantic item candidate is marked and stored for further congruence class reduction (similar algorithm as Value Numbering:  www.cs.ucr.edu/~gupta/teaching/553-07/Papers/value.pdf).
> 
>   For every congruence class of functions with more than one semantic function, we load function body. Having this information, we can
>   process complete semantic function equality and subdivide such congruence class. Read-only variable class members are also deeply compared.
> 
>   After that, we process Value numbering algorithm to do a final subdivision. Finally, all items belonging to a congruence class with more than one
>   item are merged.
> 
> Martin
> 
> Changelog:
> 
> 2014-06-13  Martin Liska  <mliska@suse.cz>
> 	    Jan Hubicka  <hubicka@ucw.cz>
> 
> 	* Makefile.in: New pass object file added.
> 	* common.opt: New -fipa-icf flag introduced.
> 	* doc/invoke.texi: Documentation enhanced for the pass.
> 	* lto-section-in.c: New LTO section for a summary created by IPA-ICF.
> 	* lto-streamer.h: New section name introduced.
> 	* opts.c: Optimization is added to -O2.
> 	* passes.def: New pass added.
> 	* timevar.def: New time var for IPA-ICF.
> 	* tree-pass.h: Pass construction function.
> 	* ipa-icf.h: New pass header file added.
> 	* ipa-icf.c: New pass source file added.
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 5587b75..4b59418 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1276,6 +1276,7 @@ OBJS = \
>  	ipa-profile.o \
>  	ipa-prop.o \
>  	ipa-pure-const.o \
> +	ipa-icf.o \
>  	ipa-reference.o \
>  	ipa-ref.o \
>  	ipa-utils.o \
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 7f05092..3661dcc 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1409,6 +1409,10 @@ fipa-pure-const
>  Common Report Var(flag_ipa_pure_const) Init(0) Optimization
>  Discover pure and const functions
>  
> +fipa-icf
> +Common Report Var(flag_ipa_icf) Optimization
> +Perform Identical Code Folding for functions and read-only variables
> +
>  fipa-reference
>  Common Report Var(flag_ipa_reference) Init(0) Optimization
>  Discover readonly and non addressable static variables
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 3c02341..b2bbe69 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -377,7 +377,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone @gol
> --fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol
> +-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-icf @gol
>  -fira-algorithm=@var{algorithm} @gol
>  -fira-region=@var{region} -fira-hoist-pressure @gol
>  -fira-loop-pressure -fno-ira-share-save-slots @gol
> @@ -6947,6 +6947,7 @@ also turns on the following optimization flags:
>  -finline-small-functions @gol
>  -findirect-inlining @gol
>  -fipa-sra @gol
> +-fipa-icf @gol
>  -fisolate-erroneous-paths-dereference @gol
>  -foptimize-sibling-calls @gol
>  -fpartial-inlining @gol
> @@ -7862,6 +7863,14 @@ it may significantly increase code size
>  (see @option{--param ipcp-unit-growth=@var{value}}).
>  This flag is enabled by default at @option{-O3}.
>  
> +@item -fipa-icf
> +@opindex fipa-icf
> +Perform Identical Code Folding for functions and read-only variables.
> +Behavior is similar to Gold Linker ICF optimization. Symbols proved
> +as semantically equivalent are redirected to corresponding symbol. The pass
> +sensitively decides for usage of alias, thunk or local redirection.
> +This flag is enabled by default at @option{-O2}.
> +
>  @item -fisolate-erroneous-paths-dereference
>  Detect paths which trigger erroneous or undefined behaviour due to
>  dereferencing a NULL pointer.  Isolate those paths from the main control
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> new file mode 100644
> index 0000000..628d257
> --- /dev/null
> +++ b/gcc/ipa-icf.c
> @@ -0,0 +1,3247 @@
> +/* Interprocedural Identical Code Folding pass
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +
> +   Contributed by Jan Hubicka <hubicka@ucw.cz> and Martin Liska <mliska@suse.cz>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* Interprocedural Identical Code Folding for functions and
> +   read-only variables.
> +
> +   The goal of this transformation is to discover functions and read-only
> +   variables which do have exactly the same semantics.
> +
> +   In case of functions,
> +   we could either create a virtual clone or do a simple function wrapper
> +   that will call equivalent function. If the function is just locally visible,
> +   all function calls can be redirected. For read-only variables, we create
> +   aliases if possible.

In theory you don't actually need the alias, you could change all the
references to point at the merged data right?  Can that automatically
happen for hidden visibility alias?

> +
> +   Optimization pass arranges as follows:
> +   1) All functions and read-only variables are visited and internal
> +      data structure, either sem_function or sem_variables is created.
> +   2) For every symbol from the previoues step, VAR_DECL and FUNCTION_DECL are

previous

> +      saved and matched to corresponding sem_items.
> +   3) These declaration are ignored for equality check and are solved
> +      by Value Numbering algorithm published by Alpert, Zadeck in 1992.
> +   4) We compute hash value for each symbol.
> +   5) Congruence classes are created based on hash value. If hash value are
> +      equal, equals function is called and symbols are deeply compared.
> +      We must prove that all SSA names, declarations and other items
> +      correspond.
> +   6) Value Numbering is executed for these classes. At the end of the process
> +      all symbol members in remaining classes can be mrged.
merged

> +   7) Merge operation creates alias in case of read-only variables. For
> +      callgraph node, we must decide if we can redirect local calls,
> +      create an alias or a thunk.
> +
> +*/
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "basic-block.h"
> +#include "tree-ssa-alias.h"
> +#include "internal-fn.h"
> +#include "gimple-expr.h"
> +#include "is-a.h"
> +#include "gimple.h"
> +#include "expr.h"
> +#include "gimple-iterator.h"
> +#include "gimple-ssa.h"
> +#include "tree-cfg.h"
> +#include "tree-phinodes.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-dfa.h"
> +#include "tree-pass.h"
> +#include "gimple-pretty-print.h"
> +#include "ipa-inline.h"
> +#include "cfgloop.h"
> +#include "except.h"
> +#include "hash-table.h"
> +#include "coverage.h"
> +#include "pointer-set.h"
> +#include "attribs.h"
> +#include "print-tree.h"
> +#include "lto-streamer.h"
> +#include "data-streamer.h"
> +#include "ipa-utils.h"
> +#include "ipa-icf.h"
> +
> +namespace {
> +
> +func_checker::func_checker (): initialized (false)
> +{
> +}

that style seems weird I'd expect

function_checker::function_checker () :
  initialized (false)
  {
  }

  but really why isn't that inline in the class, it seems really short.

> +
> +/* Itializes internal structures according to given number of

initializes

> +   source and target SSA names. The number of source names is SSA_SOURCE,
> +   respectively SSA_TARGET.  */
> +
> +void
> +func_checker::initialize (unsigned ssa_source, unsigned ssa_target)
> +{
> +  release ();
> +  initialized = true;
> +
> +  source_ssa_names.create (ssa_source);
> +  target_ssa_names.create (ssa_target);
> +
> +  for (unsigned int i = 0; i < ssa_source; i++)
> +    source_ssa_names.safe_push (-1);
> +
> +  for (unsigned int i = 0; i < ssa_target; i++)
> +    target_ssa_names.safe_push (-1);
> +
> +  edge_map = new pointer_map <edge> ();
> +
> +  decl_map = new pointer_map <tree> ();
> +}

So, if its infallible why do you need a seperate init function?

> +
> +/* Memory release routine.  */
> +
> +void
> +func_checker::release (void)
> +{
> +  if (!initialized)
> +    return;
> +
> +  delete edge_map;
> +  delete decl_map;
> +  source_ssa_names.release();
> +  target_ssa_names.release();
> +}

obligatory why isn't this a dtor?

> +
> +/* Verifies that trees T1 and T2 do correspond.  */
> +
> +bool
> +func_checker::compare_ssa_name (tree t1, tree t2)
> +{
> +  unsigned i1 = SSA_NAME_VERSION (t1);
> +  unsigned i2 = SSA_NAME_VERSION (t2);
> +
> +  if (source_ssa_names[i1] == -1)
> +    source_ssa_names[i1] = i2;
> +  else if (source_ssa_names[i1] != (int) i2)
> +    return false;
> +
> +  if(target_ssa_names[i2] == -1)
> +    target_ssa_names[i2] = i1;
> +  else if (target_ssa_names[i2] != (int) i1)
> +    return false;
> +
> +  return true;
> +}
> +
> +/* Verification function for edges E1 and E2.  */
> +
> +bool
> +func_checker::compare_edge (edge e1, edge e2)
> +{
> +  if (e1->flags != e2->flags)
> +    return false;
> +
> +  edge *slot = edge_map->contains (e1);
> +  if (slot)
> +    {
> +      SE_EXIT_DEBUG (*slot == e2);
> +    }
> +  else
> +    {
> +      slot = edge_map->insert (e1);
> +      *slot = e2;
> +    }

it'd be more efficient to just call insert and pass &existed to check if
the element was previously there.  It'd always be 1 hash lookup not
sometimes 2.

> +/* Verification function for declaration trees T1 and T2 that

its more comparison than verification right?

> +func_checker::compare_decl (tree t1, tree t2, tree func1, tree func2)
> +{
> +  if (!auto_var_in_fn_p (t1, func1) || !auto_var_in_fn_p (t2, func2))
> +    SE_EXIT_DEBUG (t1 == t2);
> +
> +  if (!types_compatible_p (TREE_TYPE (t1), TREE_TYPE (t2)))
> +    SE_EXIT_FALSE ();
> +
> +  tree *slot = decl_map->contains (t1);
> +  if (slot)
> +    {
> +      SE_EXIT_DEBUG (*slot == t2);
> +    }
> +  else
> +    {
> +      slot = decl_map->insert (t1);
> +      *slot = t2;
> +    }

same

> +congruence_class::congruence_class (unsigned int _id): id(_id)
> +{
> +  members.create (2);

hrm, we should probably add a specialization of auto_vec<T, 0> so you
can use that for things like this and lose the weird .create () stuff.

> +}
> +
> +/* Constructor for key value pair, where _ITEM is key and _INDEX is a target.  */
> +
> +sem_usage_pair::sem_usage_pair (sem_item *_item, unsigned int _index):
> +  item (_item), index (_index)

we've been naming members m_foo which should solve the shadowing problem
here as well as making code more readable.

> +sem_item::~sem_item ()
> +{
> +  if (tree_refs_set)
> +    pointer_set_destroy (tree_refs_set);
> +
> +  for (unsigned i = 0; i < usages.length (); i++)
> +    delete usages[i];

what about the vectors themselves? who owns them? I know this bit of
vec<> is horrible :(

> +/* Compare two types if are same aliases in case of strict aliasing
> +   is enabled.  */

nit, bad grammer

> +bool
> +sem_item::compare_for_aliasing (tree t1, tree t2)
> +{
> +  if (flag_strict_aliasing)
> +    {
> +      alias_set_type s1 = get_deref_alias_set (TREE_TYPE (t1));
> +      alias_set_type s2 = get_deref_alias_set (TREE_TYPE (t2));
> +
> +      return s1 == s2;
> +    }
> +
> +  return true;
> +}
> +
> +/* Semantic function constructor that uses STACK as bitmap memory stack.  */
> +
> +sem_function::sem_function (bitmap_obstack *stack): sem_item (FUNC, stack),
> +  compared_func (NULL)
> +{
> +  arg_types.create (0);
> +}
> +
> +/*  Constructor based on callgraph node _NODE with computed hash _HASH.
> +    Bitmap STACK is used for memory allocation.  */
> +sem_function::sem_function (cgraph_node *node, hashval_t hash,
> +			    bitmap_obstack *stack):
> +  sem_item (FUNC, node, hash, stack), bb_sorted (NULL), compared_func (NULL)
> +{
> +  arg_types.create (0);
> +}
> +
> +sem_function::~sem_function ()
> +{
> +  if (!bb_sorted)
> +    return;
> +
> +  for (unsigned i = 0; i < bb_count; i++)
> +    free (bb_sorted[i]);
> +
> +  free (bb_sizes);
> +  free (bb_sorted);
> +}

who owns arg_types? it doesn't seem to be us

> +sem_function::get_hash (void)
> +{
> +  if(!hash)
> +    {
> +      hash = 177454; /* Random number for function type.  */
> +
> +      hash = iterative_hash_object (arg_count, hash);
> +      hash = iterative_hash_object (bb_count, hash);
> +      hash = iterative_hash_object (edge_count, hash);
> +      hash = iterative_hash_object (cfg_checksum, hash);
> +      hash = iterative_hash_object (gcode_hash, hash);
> +
> +      for (unsigned i = 0; i < bb_count; i++)
> +	hash = iterative_hash_object (hash, get_bb_hash (bb_sorted[i]));
> +
> +      for (unsigned i = 0; i < bb_count; i++)
> +	hash = iterative_hash_object (bb_sizes[i], hash);
> +    }
> +
> +  return hash;
> +}

I have to say I've always wondered if our hash functions try too hard to
be perfect hashes at the expensive of speed, but of course its hard to
know without lots of data.

> +
> +/* Fast equality function based on knowledge known in WPA.  */
> +
> +bool
> +sem_function::equals_wpa (sem_item *item)
> +{
> +  if (item->type != FUNC)
> +    return false;

should this just be an assert? When we're creating the congruence
classes in the first place we can keep functions and variables in
different classes right?

> +/* Returns true if the item equals to ITEM given as arguemnt.  */

argument

> +sem_function::equals_private (sem_item *item)
> +{
> +  if (item->type != FUNC)
> +    return false;

assert?

> +
> +  basic_block bb1, bb2;
> +  edge e1, e2;
> +  edge_iterator ei1, ei2;
> +  int *bb_dict = NULL;
> +  bool result = true;
> +  tree arg1, arg2;
> +
> +  compared_func = static_cast<sem_function *> (item);
> +
> +  gcc_assert (decl != item->decl);
> +
> +  if (arg_count != compared_func->arg_count
> +      || bb_count != compared_func->bb_count
> +      || edge_count != compared_func->edge_count
> +      || cfg_checksum != compared_func->cfg_checksum)
> +    SE_EXIT_FALSE();
> +
> +  if (!equals_wpa (item))

isn't this redundant with a bunch of the checks above it? could you
remove them?

> +    return false;
> +
> +  /* Checking function arguments.  */
> +  tree decl1 = DECL_ATTRIBUTES (decl);
> +  tree decl2 = DECL_ATTRIBUTES (compared_func->decl);
> +
> +  while (decl1)
> +    {
> +      if (decl2 == NULL)
> +	SE_EXIT_FALSE();

are attributes always in the same order?

> +
> +      if (get_attribute_name (decl1) != get_attribute_name (decl2))
> +	SE_EXIT_FALSE();
> +
> +      tree attr_value1 = TREE_VALUE (decl1);
> +      tree attr_value2 = TREE_VALUE (decl2);
> +
> +      if (attr_value1 && attr_value2)
> +	{
> +	  bool ret = compare_operand (TREE_VALUE (attr_value1),
> +				      TREE_VALUE (attr_value2), decl,
> +				      compared_func->decl);
> +	  if (!ret)
> +	    SE_EXIT_FALSE_WITH_MSG ("attribute values are different")
> +	  }
> +      else if (!attr_value1 && !attr_value2)
> +	{}
> +      else
> +	SE_EXIT_FALSE ();
> +
> +      decl1 = TREE_CHAIN (decl1);
> +      decl2 = TREE_CHAIN (decl2);
> +    }
> +
> +  if (decl1 != decl2)
> +    SE_EXIT_FALSE();
> +
> +  checker.initialize (ssa_names_size, compared_func->ssa_names_size);
> +
> +  for (arg1 = DECL_ARGUMENTS (decl), arg2 = DECL_ARGUMENTS (compared_func->decl);
> +       arg1; arg1 = DECL_CHAIN (arg1), arg2 = DECL_CHAIN (arg2))
> +    checker.compare_decl (arg1, arg2, decl, compared_func->decl);
> +
> +  /* Exception handling regions comparison.  */
> +  if (!compare_eh_region (region_tree, compared_func->region_tree, decl,
> +			  compared_func->decl))
> +    SE_EXIT_FALSE();
> +
> +  /* Checking all basic blocks.  */
> +  for (unsigned i = 0; i < bb_count; ++i)
> +    if(!compare_bb (bb_sorted[i], compared_func->bb_sorted[i], decl,
> +		    compared_func->decl))
> +      SE_EXIT_FALSE();
> +
> +  SE_DUMP_MESSAGE ("All BBs are equal\n");
> +
> +  /* Basic block edges check.  */
> +  for (unsigned i = 0; i < bb_count; ++i)
> +    {
> +      bb_dict = XNEWVEC (int, bb_count + 2);
> +      memset (bb_dict, -1, (bb_count + 2) * sizeof (int));
> +
> +      bb1 = bb_sorted[i]->bb;
> +      bb2 = compared_func->bb_sorted[i]->bb;
> +
> +      ei2 = ei_start (bb2->preds);
> +
> +      for (ei1 = ei_start (bb1->preds); ei_cond (ei1, &e1); ei_next (&ei1))
> +	{
> +	  ei_cond (ei2, &e2);
> +
> +	  if (!bb_dict_test (bb_dict, e1->src->index, e2->src->index))
> +	    SE_EXIT_FALSE_WITH_MSG("edge comparison returns false");
> +
> +	  if (!bb_dict_test (bb_dict, e1->dest->index, e2->dest->index))
> +	    SE_EXIT_FALSE_WITH_MSG("BB comparison returns false");
> +
> +	  if (e1->flags != e2->flags)
> +	    SE_EXIT_FALSE_WITH_MSG("flags comparison returns false");

check this condition before the previous two since I guess its faster?

> +sem_function::init_refs_for_tree (tree t)
> +{
> +  switch (TREE_CODE (t))
> +    {
> +    case VAR_DECL:
> +    case FUNCTION_DECL:
> +      tree_refs.safe_push (t);
> +      break;
> +    case MEM_REF:
> +    case ADDR_EXPR:
> +    case OBJ_TYPE_REF:
> +      init_refs_for_tree (TREE_OPERAND (t, 0));
> +      break;
> +    case FIELD_DECL:
> +      init_refs_for_tree (DECL_FCONTEXT (t));
> +      break;
> +    default:
> +      break;
> +    }
> +}
> +
> +/* Initializes references to another sem_item for gimple STMT of type assign.  */
> +
> +void
> +sem_function::init_refs_for_assign (gimple stmt)
> +{
> +  if (gimple_num_ops (stmt) != 2)
> +    return;
> +
> +  tree rhs = gimple_op (stmt, 1);
> +
> +  init_refs_for_tree (rhs);
> +}
> +
> +/* Initializes references to other semantic functions/variables.  */
> +
> +void
> +sem_function::init_refs (void)
> +{
> +  for (unsigned i = 0; i < bb_count; i++)
> +    {
> +      basic_block bb = bb_sorted[i]->bb;
> +
> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
> +	   gsi_next (&gsi))
> +	{
> +	  gimple stmt = gsi_stmt (gsi);
> +	  hashval_t code = (hashval_t) gimple_code (stmt);
> +
> +	  switch (code)
> +	    {
> +	    case GIMPLE_CALL:
> +	      {
> +		tree funcdecl = gimple_call_fndecl (stmt);
> +
> +		/* Function pointer variables are not support yet.  */

supported

> +		if (funcdecl)
> +		  tree_refs.safe_push (funcdecl);
> +
> +		break;
> +	      }
> +	    case GIMPLE_ASSIGN:
> +	      init_refs_for_assign (stmt);
> +	      break;
> +	    default:
> +	      break;
> +	    }
> +	}
> +    }
> +}
> +
> +/* Merges instance with an ALIAS_ITEM, where alias, thunk or redirection can
> +   be applied.  */
> +bool
> +sem_function::merge (sem_item *alias_item)
> +{
> +  gcc_assert (alias_item->type == FUNC);
> +
> +  sem_function *alias_func = static_cast<sem_function *> (alias_item);
> +
> +  struct cgraph_node *original = get_node ();
> +  struct cgraph_node *local_original = original;
> +  struct cgraph_node *alias = alias_func->get_node ();
> +  bool original_address_matters;
> +  bool alias_address_matters;
> +
> +  bool create_thunk = false;
> +  bool create_alias = false;
> +  bool redirect_callers = false;
> +  bool original_discardable = false;
> +
> +  /* Do not attempt to mix functions from different user sections;
> +     we do not know what user intends with those.  */
> +  if (((DECL_SECTION_NAME (original->decl) && !original->implicit_section)
> +       || (DECL_SECTION_NAME (alias->decl) && !alias->implicit_section))
> +      && DECL_SECTION_NAME (original->decl) != DECL_SECTION_NAME (alias->decl))
> +    {
> +      if (dump_file)
> +	fprintf (dump_file,
> +		 "Not unifying; original and alias are in different sections.\n\n");
> +      return false;
> +    }
> +
> +  /* See if original is in a section that can be discarded if the main
> +     symbol is not used.  */
> +  if (DECL_EXTERNAL (original->decl))
> +    original_discardable = true;
> +  if (original->resolution == LDPR_PREEMPTED_REG
> +      || original->resolution == LDPR_PREEMPTED_IR)
> +    original_discardable = true;
> +  if (symtab_can_be_discarded (original))
> +    original_discardable = true;

shouldn't this last check handle the previous ones?

> +
> +  /* See if original and/or alias address can be compared for equality.  */
> +  original_address_matters
> +    = (!DECL_VIRTUAL_P (original->decl)
> +       && (original->externally_visible
> +	   || address_taken_from_non_vtable_p (original)));
> +  alias_address_matters
> +    = (!DECL_VIRTUAL_P (alias->decl)
> +       && (alias->externally_visible
> +	   || address_taken_from_non_vtable_p (alias)));
> +
> +  /* If alias and original can be compared for address equality, we need
> +     to create a thunk.  Also we can not create extra aliases into discardable
> +     section (or we risk link failures when section is discarded).  */
> +  if ((original_address_matters
> +       && alias_address_matters)
> +      || original_discardable)
> +    {
> +      create_thunk = !stdarg_p (TREE_TYPE (alias->decl));
> +      create_alias = false;
> +      /* When both alias and original are not overwritable, we can save
> +         the extra thunk wrapper for direct calls.  */
> +      redirect_callers
> +	= (!original_discardable
> +	   && cgraph_function_body_availability (alias) > AVAIL_OVERWRITABLE
> +	   && cgraph_function_body_availability (original) > AVAIL_OVERWRITABLE);
> +    }
> +  else
> +    {
> +      create_alias = true;
> +      create_thunk = false;
> +      redirect_callers = false;
> +    }
> +
> +  if (create_alias && DECL_COMDAT_GROUP (alias->decl))
> +    {
> +      create_alias = false;
> +      create_thunk = true;
> +    }
> +
> +  /* We want thunk to always jump to the local function body
> +     unless the body is comdat and may be optimized out.  */
> +  if ((create_thunk || redirect_callers)
> +      && (!original_discardable
> +	  || (DECL_COMDAT_GROUP (original->decl)
> +	      && (DECL_COMDAT_GROUP (original->decl)
> +		  == DECL_COMDAT_GROUP (alias->decl)))))
> +    local_original
> +      = cgraph (symtab_nonoverwritable_alias (original));
> +
> +  if (redirect_callers)
> +    {
> +      /* If alias is non-overwritable then
> +         all direct calls are safe to be redirected to the original.  */
> +      bool redirected = false;
> +      while (alias->callers)
> +	{
> +	  struct cgraph_edge *e = alias->callers;
> +	  cgraph_redirect_edge_callee (e, local_original);
> +	  push_cfun (DECL_STRUCT_FUNCTION (e->caller->decl));
> +
> +	  if (e->call_stmt)
> +	    cgraph_redirect_edge_call_stmt_to_callee (e);
> +
> +	  pop_cfun ();
> +	  redirected = true;
> +	}
> +
> +      /* The alias function is removed just if symbol address

s/just//

> +         does not matters.  */

s/matters/matter/

> +sem_function *
> +sem_function::parse (struct cgraph_node *node, bitmap_obstack *stack)
> +{
> +  tree fndecl = node->decl;
> +  struct function *func = DECL_STRUCT_FUNCTION (fndecl);
> +
> +  if (!func || !cgraph_function_with_gimple_body_p (node))
> +    return NULL;
> +
> +  if (lookup_attribute_by_prefix ("omp ", DECL_ATTRIBUTES (node->decl)) != NULL)
> +    return NULL;
> +
> +  sem_function *f = new sem_function (node, 0, stack);
> +
> +  f->init ();
> +
> +  return f;
> +}
> +
> +/* Parses function arguments and result type.  */
> +
> +void
> +sem_function::parse_tree_args (void)
> +{
> +  tree result;
> +  arg_types.create (4);
> +  tree fnargs = DECL_ARGUMENTS (decl);
> +
> +  for (tree parm = fnargs; parm; parm = DECL_CHAIN (parm))
> +    arg_types.safe_push (TYPE_CANONICAL (DECL_ARG_TYPE (parm)));
> +
> +  /* Function result type.  */
> +  result = DECL_RESULT (decl);
> +  result_type = result ? TYPE_CANONICAL (TREE_TYPE (result)) : NULL;
> +
> +  /* During WPA, we can get arguments by following method.  */
> +  if (!fnargs)
> +    {
> +      tree type = TYPE_ARG_TYPES (TREE_TYPE (decl));
> +      for (tree parm = type; parm; parm = TREE_CHAIN (parm))
> +	arg_types.safe_push (TYPE_CANONICAL (TREE_VALUE (parm)));
> +
> +      result_type = TREE_TYPE (TREE_TYPE (decl));
> +    }
> +
> +  arg_count = arg_types.length ();

So, is it really worth the extra memory to save a memory access when you
need the number of arguments? especially considering you'll probably
want the vector elements next.

> +/* For given basic blocks BB1 and BB2 (from functions FUNC1 and FUNC),
> +   true value is returned if phi nodes are sematically

semantically

> +   equivalent in these blocks .  */

perhaps return true if .... ?

> +sem_function::gsi_next_nondebug_stmt (gimple_stmt_iterator &gsi)

 I'd say a pointer makes it clearler the callie will modify the argument.

> +{
> +  gimple s;
> +
> +  s = gsi_stmt (gsi);
> +
> +  while (gimple_code (s) == GIMPLE_DEBUG)
> +    {
> +      gsi_next (&gsi);
> +      gcc_assert (!gsi_end_p (gsi));
> +
> +      s = gsi_stmt (gsi);
> +    }
> +}

really, shouldn't this be part of the gsi interface not something you
hand roll here?

> +
> +/* Iterates GSI statement iterator to the next non-virtual statement.  */
> +
> +void
> +sem_function::gsi_next_nonvirtual_phi (gimple_stmt_iterator &it)

same

> +sem_function::icf_handled_component_p (tree t)
> +{
> +  enum tree_code tc = TREE_CODE (t);
> +
> +  return ((handled_component_p (t) && handled_component_p (t))

it must be too late, or are you calling the same function twice with the
same args?

> +	  || tc == ADDR_EXPR || tc == MEM_REF || tc == REALPART_EXPR
> +	  || tc == IMAGPART_EXPR || tc == OBJ_TYPE_REF);
> +}
> +/* Returns true if the item equals to ITEM given as arguemnt.  */

argument

Its nice to see someone trying this and better it works ;)  I wonder if
in the future it'll be worth looking at congruity of chunks of functions
and then sharing those too.

anyway I've had it for tonight hopefully I can finish looking at this
soon.

Trev

> +sem_item_optimizer::write_summary (void)
> +{
> +  unsigned int count = 0;
> +
> +  struct output_block *ob = create_output_block (LTO_section_ipa_icf);
> +  lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder;
> +  ob->cgraph_node = NULL;
> +
> +  /* Calculate number of symbols to be serialized.  */
> +  for (lto_symtab_encoder_iterator lsei = lsei_start_in_partition (encoder);
> +       !lsei_end_p (lsei);
> +       lsei_next_in_partition (&lsei))
> +    {
> +      struct symtab_node *node = lsei_node (lsei);
> +
> +      if (symtab_node_map.contains (node))
> +	count++;
> +    }
> +
> +  streamer_write_uhwi (ob, count);
> +
> +  /* Process all of the symbols.  */
> +  for (lto_symtab_encoder_iterator lsei = lsei_start_in_partition (encoder);
> +       !lsei_end_p (lsei);
> +       lsei_next_in_partition (&lsei))
> +    {
> +      struct symtab_node *node = lsei_node (lsei);
> +
> +      sem_item **item = symtab_node_map.contains (node);
> +
> +      if (item && *item)
> +	{
> +	  int node_ref = lto_symtab_encoder_encode (encoder, node);
> +	  streamer_write_uhwi_stream (ob->main_stream, node_ref);
> +
> +	  streamer_write_uhwi (ob, (*item)->get_hash ());
> +	}
> +    }
> +
> +  streamer_write_char_stream (ob->main_stream, 0);
> +  produce_asm (ob, NULL);
> +  destroy_output_block (ob);
> +}
> +
> +/* Reads a section from LTO stream file FILE_DATA. Input block for DATA
> +   contains LEN bytes.  */
> +
> +void
> +sem_item_optimizer::read_section (struct lto_file_decl_data *file_data,
> +				  const char *data, size_t len)
> +{
> +  const struct lto_function_header *header =
> +  (const struct lto_function_header *) data;
> +  const int cfg_offset = sizeof (struct lto_function_header);
> +  const int main_offset = cfg_offset + header->cfg_size;
> +  const int string_offset = main_offset + header->main_size;
> +  struct data_in *data_in;
> +  struct lto_input_block ib_main;
> +  unsigned int i;
> +  unsigned int count;
> +
> +  LTO_INIT_INPUT_BLOCK (ib_main, (const char *) data + main_offset, 0,
> +			header->main_size);
> +
> +  data_in =
> +    lto_data_in_create (file_data, (const char *) data + string_offset,
> +			header->string_size, vNULL);
> +
> +  count = streamer_read_uhwi (&ib_main);
> +
> +  for (i = 0; i < count; i++)
> +    {
> +      unsigned int index;
> +      struct symtab_node *node;
> +      lto_symtab_encoder_t encoder;
> +
> +      index = streamer_read_uhwi (&ib_main);
> +      encoder = file_data->symtab_node_encoder;
> +      node = lto_symtab_encoder_deref (encoder, index);
> +
> +      hashval_t hash = streamer_read_uhwi (&ib_main);
> +
> +      gcc_assert (node->definition);
> +
> +      if (dump_file)
> +	fprintf (dump_file, "Symbol added:%s (tree: %p, uid:%u)\n", node->asm_name (),
> +		 (void *) node->decl, node->order);
> +
> +      if (is_a_helper<cgraph_node *>::test (node))
> +	{
> +	  cgraph_node *cnode = cgraph (node);
> +
> +	  items.safe_push (new sem_function (cnode, hash, &bmstack));
> +	}
> +      else
> +	{
> +	  varpool_node *vnode = varpool (node);
> +
> +	  items.safe_push (new sem_variable (vnode, hash, &bmstack));
> +	}
> +    }
> +
> +  lto_free_section_data (file_data, LTO_section_ipa_icf, NULL, data,
> +			 len);
> +  lto_data_in_delete (data_in);
> +}
> +
> +/* Read IPA IPA ICF summary for symbols.  */
> +
> +void
> +sem_item_optimizer::read_summary (void)
> +{
> +  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
> +  struct lto_file_decl_data *file_data;
> +  unsigned int j = 0;
> +
> +  while ((file_data = file_data_vec[j++]))
> +    {
> +      size_t len;
> +      const char *data = lto_get_section_data (file_data,
> +			 LTO_section_ipa_icf, NULL, &len);
> +
> +      if (data)
> +	read_section (file_data, data, len);
> +    }
> +}
> +
> +/* Register callgraph and varpool hooks.  */
> +
> +void
> +sem_item_optimizer::register_hooks (void)
> +{
> +  cgraph_node_hooks = cgraph_add_node_removal_hook (
> +			&sem_item_optimizer::cgraph_removal_hook, this);
> +
> +  varpool_node_hooks = varpool_add_node_removal_hook (
> +			 &sem_item_optimizer::varpool_removal_hook, this);
> +}
> +
> +/* Unregister callgraph and varpool hooks.  */
> +
> +void
> +sem_item_optimizer::unregister_hooks (void)
> +{
> +  if (cgraph_node_hooks)
> +    cgraph_remove_node_removal_hook (cgraph_node_hooks);
> +
> +  if (varpool_node_hooks)
> +    varpool_remove_node_removal_hook (varpool_node_hooks);
> +}
> +
> +/* Adds a CLS to hashtable associated by hash value.  */
> +
> +void
> +sem_item_optimizer::add_class (congruence_class *cls)
> +{
> +  gcc_assert (cls->members.length ());
> +
> +  congruence_class_group_t *group = get_group_by_hash (
> +				      cls->members[0]->get_hash ());
> +  group->classes.safe_push (cls);
> +}
> +
> +/* Gets a congruence class group based on given HASH value.  */
> +
> +congruence_class_group_t *
> +sem_item_optimizer::get_group_by_hash (hashval_t hash)
> +{
> +  congruence_class_group_t *item = XNEW (congruence_class_group_t);
> +  item->hash = hash;
> +
> +  congruence_class_group **slot = classes.find_slot (item, INSERT);
> +
> +  if (*slot)
> +    free (item);
> +  else
> +    {
> +      item->classes.create (1);
> +      *slot = item;
> +    }
> +
> +  return *slot;
> +}
> +
> +/* Callgraph removal hook called for a NODE with a custom DATA.  */
> +
> +void
> +sem_item_optimizer::cgraph_removal_hook (struct cgraph_node *node, void *data)
> +{
> +  sem_item_optimizer *optimizer = (sem_item_optimizer *) data;
> +  optimizer->remove_symtab_node (node);
> +}
> +
> +/* Varpool removal hook called for a NODE with a custom DATA.  */
> +
> +void
> +sem_item_optimizer::varpool_removal_hook (struct varpool_node *node, void *data)
> +{
> +  sem_item_optimizer *optimizer = (sem_item_optimizer *) data;
> +  optimizer->remove_symtab_node (node);
> +}
> +
> +/* Remove symtab NODE triggered by symtab removal hooks.  */
> +
> +void
> +sem_item_optimizer::remove_symtab_node (struct symtab_node *node)
> +{
> +  gcc_assert (!classes.elements());
> +
> +  pointer_set_insert (removed_items_set, node);
> +}
> +
> +/* Removes all callgraph and varpool nodes that are marked by symtab
> +   as deleted.  */
> +
> +void
> +sem_item_optimizer::filter_removed_items (void)
> +{
> +  vec <sem_item *> filtered;
> +  filtered.create (items.length());
> +
> +  for (unsigned int i = 0; i < items.length(); i++)
> +    {
> +      sem_item *item = items[i];
> +
> +      bool no_body_function = false;
> +
> +      if (item->type == FUNC)
> +	{
> +	  struct cgraph_node *cnode = static_cast <sem_function *>(item)->get_node ();
> +
> +	  no_body_function = in_lto_p && (cnode->alias || cnode->body_removed);
> +	}
> +
> +      if(!pointer_set_contains (removed_items_set, items[i]->node)
> +	  && !no_body_function)
> +	filtered.safe_push (items[i]);
> +    }
> +  items.release ();
> +
> +  for (unsigned int i = 0; i < filtered.length(); i++)
> +    items.safe_push (filtered[i]);
> +}
> +
> +/* Optimizer entry point.  */
> +
> +void
> +sem_item_optimizer::execute (void)
> +{
> +  filter_removed_items ();
> +  build_hash_based_classes ();
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Dump after hash based groups\n");
> +  dump_cong_classes ();
> +
> +  for (unsigned int i = 0; i < items.length(); i++)
> +    items[i]->init_wpa ();
> +
> +  subdivide_classes_by_equality (true);
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Dump after WPA based types groups\n");
> +  dump_cong_classes ();
> +
> +  parse_nonsingleton_classes ();
> +  subdivide_classes_by_equality ();
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Dump after full equality comparison of groups\n");
> +
> +  dump_cong_classes ();
> +
> +  unsigned int prev_class_count = classes_count;
> +
> +  process_cong_reduction ();
> +  dump_cong_classes ();
> +  merge_classes (prev_class_count);
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_symtab (dump_file);
> +}
> +
> +/* Function responsible for visiting all potential functions and
> +   read-only variables that can be merged.  */
> +
> +void
> +sem_item_optimizer::parse_funcs_and_vars (void)
> +{
> +  struct cgraph_node *cnode;
> +  sem_item **slot;
> +
> +  FOR_EACH_DEFINED_FUNCTION (cnode)
> +  {
> +    sem_function *f = sem_function::parse (cnode, &bmstack);
> +    if (f)
> +      {
> +	items.safe_push (f);
> +	slot = symtab_node_map.insert (cnode);
> +	*slot = f;
> +
> +	if (dump_file)
> +	  fprintf (dump_file, "Parsed function:%s\n", f->asm_name ());
> +
> +	if (dump_file && (dump_flags & TDF_DETAILS))
> +	  f->dump_to_file (dump_file);
> +      }
> +    else if (dump_file)
> +      fprintf (dump_file, "Not parsed function:%s\n", cnode->asm_name ());
> +  }
> +
> +  varpool_node *vnode;
> +
> +  FOR_EACH_DEFINED_VARIABLE (vnode)
> +  {
> +    sem_variable *v = sem_variable::parse (vnode, &bmstack);
> +
> +    if (v)
> +      {
> +	items.safe_push (v);
> +	slot = symtab_node_map.insert (vnode);
> +	*slot = v;
> +      }
> +  }
> +}
> +
> +/* Makes pairing between a congruence class CLS and semantic ITEM.  */
> +
> +void
> +sem_item_optimizer::add_item_to_class (congruence_class *cls, sem_item *item)
> +{
> +  item->index_in_class = cls->members.length ();
> +  cls->members.safe_push (item);
> +  item->cls = cls;
> +}
> +
> +/* Congruence classes are built by hash value.  */
> +
> +void
> +sem_item_optimizer::build_hash_based_classes (void)
> +{
> +  for (unsigned i = 0; i < items.length (); i++)
> +    {
> +      sem_item *item = items[i];
> +
> +      congruence_class_group_t *group = get_group_by_hash (item->get_hash ());
> +
> +      if (!group->classes.length ())
> +	{
> +	  classes_count++;
> +	  group->classes.safe_push (new congruence_class (class_id++));
> +	}
> +
> +      add_item_to_class (group->classes[0], item);
> +    }
> +}
> +
> +/* Semantic items in classes having more than one element and initialized.
> +   In case of WPA, we load function body.  */
> +
> +void
> +sem_item_optimizer::parse_nonsingleton_classes (void)
> +{
> +  for (hash_table <congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    {
> +      for (unsigned i = 0; i < (*it).classes.length (); i++)
> +	{
> +	  congruence_class *c = (*it).classes [i];
> +
> +	  if (c->members.length() > 1)
> +	    for (unsigned j = 0; j < c->members.length (); j++)
> +	      {
> +		sem_item *item = c->members[j];
> +		sem_item **slot;
> +
> +		slot = decl_map.insert (item->decl);
> +		*slot = item;
> +
> +	      }
> +	}
> +    }
> +
> +  unsigned int init_called_count = 0;
> +
> +  for (hash_table <congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    {
> +      /* We fill in all declarations for sem_items.  */
> +      for (unsigned i = 0; i < (*it).classes.length (); i++)
> +	{
> +	  congruence_class *c = (*it).classes [i];
> +
> +	  if (c->members.length() > 1)
> +	    for (unsigned j = 0; j < c->members.length (); j++)
> +	      {
> +		sem_item *item = c->members[j];
> +
> +		item->init ();
> +		item->init_refs ();
> +		init_called_count++;
> +
> +		for (unsigned j = 0; j < item->tree_refs.length (); j++)
> +		  {
> +		    sem_item **result = decl_map.contains (item->tree_refs[j]);
> +
> +		    if(result)
> +		      {
> +			sem_item *target = *result;
> +			item->refs.safe_push (target);
> +
> +			unsigned index = item->refs.length ();
> +			target->usages.safe_push (new sem_usage_pair(item, index));
> +			bitmap_set_bit (target->usage_index_bitmap, index);
> +			pointer_set_insert (item->tree_refs_set, item->tree_refs[j]);
> +		      }
> +		  }
> +	      }
> +	}
> +    }
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Init called for %u items (%.2f%%).\n", init_called_count,
> +	     100.0f * init_called_count / items.length ());
> +}
> +
> +/* Equality function for semantic items is used to subdivide existing
> +   classes. If IN_WPA, fast equality function is invoked.  */
> +
> +void
> +sem_item_optimizer::subdivide_classes_by_equality (bool in_wpa)
> +{
> +  for (hash_table <congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    {
> +      unsigned int class_count = (*it).classes.length ();
> +
> +      for (unsigned i = 0; i < class_count; i++)
> +	{
> +	  congruence_class *c = (*it).classes [i];
> +
> +	  if (c->members.length() > 1)
> +	    {
> +	      vec <sem_item *> new_vector;
> +	      new_vector.create (c->members.length ());
> +
> +	      sem_item *first = c->members[0];
> +	      new_vector.safe_push (first);
> +
> +	      unsigned class_split_first = (*it).classes.length ();
> +
> +	      for (unsigned j = 1; j < c->members.length (); j++)
> +		{
> +		  sem_item *item = c->members[j];
> +
> +		  bool equals = in_wpa ? first->equals_wpa (item) : first->equals (item);
> +
> +		  if (equals)
> +		    new_vector.safe_push (item);
> +		  else
> +		    {
> +		      bool integrated = false;
> +
> +		      for (unsigned k = class_split_first; k < (*it).classes.length (); k++)
> +			{
> +			  sem_item *x = (*it).classes[k]->members[0];
> +			  bool equals = in_wpa ? x->equals_wpa (item) : x->equals (item);
> +
> +			  if (equals)
> +			    {
> +			      integrated = true;
> +			      add_item_to_class ((*it).classes[k], item);
> +
> +			      break;
> +			    }
> +			}
> +
> +		      if (!integrated)
> +			{
> +			  congruence_class *c = new congruence_class (class_id++);
> +			  classes_count++;
> +			  add_item_to_class (c, item);
> +
> +			  (*it).classes.safe_push (c);
> +			}
> +		    }
> +		}
> +
> +	      // we replace newly created new_vector for the class we've just splitted
> +	      c->members.release ();
> +	      c->members.create (new_vector.length ());
> +
> +	      for (unsigned int j = 0; j < new_vector.length (); j++)
> +		add_item_to_class (c, new_vector[j]);
> +	    }
> +	}
> +    }
> +
> +  verify_classes ();
> +}
> +
> +/* Verify congruence classes if checking is enabled.  */
> +
> +void
> +sem_item_optimizer::verify_classes (void)
> +{
> +#if ENABLE_CHECKING
> +  for (hash_table <congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    {
> +      for (unsigned int i = 0; i < (*it).classes.length (); i++)
> +	{
> +	  congruence_class *cls = (*it).classes[i];
> +
> +	  gcc_checking_assert (cls);
> +	  gcc_checking_assert (cls->members.length () > 0);
> +
> +	  for (unsigned int j = 0; j < cls->members.length (); j++)
> +	    {
> +	      sem_item *item = cls->members[j];
> +
> +	      gcc_checking_assert (item);
> +
> +	      for (unsigned k = 0; k < item->usages.length (); k++)
> +		{
> +		  sem_usage_pair *usage = item->usages[k];
> +		  gcc_checking_assert (usage->item->index_in_class <
> +				       usage->item->cls->members.length ());
> +		}
> +	    }
> +	}
> +    }
> +#endif
> +}
> +
> +/* Disposes split map traverse function. CLS_PTR is pointer to congruence
> +   class, BSLOT is bitmap slot we want to release. DATA is mandatory,
> +   but unused argument.  */
> +
> +bool
> +sem_item_optimizer::release_split_map (__attribute__((__unused__)) const void
> +				       *cls_ptr,
> +				       __attribute__((__unused__)) bitmap *bslot,
> +				       __attribute__((__unused__)) void *data)
> +{
> +  bitmap b = *bslot;
> +
> +  BITMAP_FREE (b);
> +
> +  return true;
> +}
> +
> +/* Process split operation for a class given as pointer CLS_PTR,
> +   where bitmap B splits congruence class members. DATA is used
> +   as argument of split pair.  */
> +
> +bool
> +sem_item_optimizer::traverse_congruence_split (const void *cls_ptr,
> +    bitmap *bslot, void *data)
> +{
> +  const congruence_class *cls = (const congruence_class *) cls_ptr;
> +  bitmap b = *bslot;
> +
> +  traverse_split_pair *pair = (traverse_split_pair *) data;
> +  sem_item_optimizer *optimizer = pair->optimizer;
> +  const congruence_class *splitter_cls = pair->cls;
> +
> +  /* If counted bits are greater than zero and less than the number of members
> +     a group will be splitted.  */
> +  unsigned popcount = bitmap_count_bits (b);
> +
> +  if (popcount > 0 && popcount < cls->members.length ())
> +    {
> +      congruence_class* newclasses[2] = { new congruence_class (class_id++), new congruence_class (class_id++) };
> +
> +      for (unsigned int i = 0; i < cls->members.length (); i++)
> +	{
> +	  int target = bitmap_bit_p (b, i);
> +	  congruence_class *tc = newclasses[target];
> +
> +	  add_item_to_class (tc, cls->members[i]);
> +	}
> +
> +#ifdef ENABLE_CHECKING
> +      for (unsigned int i = 0; i < 2; i++)
> +	gcc_checking_assert (newclasses[i]->members.length ());
> +#endif
> +
> +      if (splitter_cls == cls)
> +	optimizer->splitter_class_removed = true;
> +
> +      /* Remove old class from worklist if presented.  */
> +      bool in_work_list = optimizer->worklist_contains (cls);
> +
> +      if (in_work_list)
> +	optimizer->worklist_remove (cls);
> +
> +      for (hash_table <congruence_class_group_hash>::iterator it =
> +	     optimizer->classes.begin (); it != optimizer->classes.end (); ++it)
> +	{
> +	  for (unsigned int i = 0; i < (*it).classes.length (); i++)
> +	    if ((*it).classes[i] == cls)
> +	      {
> +		(*it).classes.ordered_remove (i);
> +		break;
> +	      }
> +	}
> +      /* New class will be inserted and integrated to work list.  */
> +      for (unsigned int i = 0; i < 2; i++)
> +	optimizer->add_class (newclasses[i]);
> +
> +      /* Two classes replace one, so that increment just by one.  */
> +      optimizer->classes_count++;
> +
> +      /* If OLD class was presented in the worklist, we remove the class
> +         are replace it will both newly created classes.  */
> +      if (in_work_list)
> +	for (unsigned int i = 0; i < 2; i++)
> +	  optimizer->worklist_push (newclasses[i]);
> +      else /* Just smaller class is inserted.  */
> +	{
> +	  unsigned int smaller_index = newclasses[0]->members.length () <
> +				       newclasses[1]->members.length () ?
> +				       0 : 1;
> +	  optimizer->worklist_push (newclasses[smaller_index]);
> +	}
> +
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +	{
> +	  fprintf (dump_file, "  congruence class splitted:\n");
> +	  cls->dump (dump_file, 4);
> +
> +	  fprintf (dump_file, "  newly created groups:\n");
> +	  for (unsigned int i = 0; i < 2; i++)
> +	    newclasses[i]->dump (dump_file, 4);
> +	}
> +
> +      delete cls;
> +    }
> +
> +
> +  return true;
> +}
> +
> +/* Tests if a class CLS used as INDEXth splits any congruence classes.
> +   Bitmap stack BMSTACK is used for bitmap allocation.  */
> +
> +void
> +sem_item_optimizer::do_congruence_step_for_index (congruence_class *cls,
> +    unsigned int index)
> +{
> +  /* Split map reset */
> +  if (split_map != NULL)
> +    delete split_map;
> +
> +  pointer_map <bitmap> *split_map = new pointer_map <bitmap> ();
> +
> +  for (unsigned int i = 0; i < cls->members.length (); i++)
> +    {
> +      sem_item *item = cls->members[i];
> +
> +      /* Iterate all usages that have INDEX as usage of the item.  */
> +      for (unsigned int j = 0; j < item->usages.length (); j++)
> +	{
> +	  sem_usage_pair *usage = item->usages[j];
> +
> +	  if (usage->index != index)
> +	    continue;
> +
> +	  bitmap *slot = split_map->contains (usage->item->cls);
> +
> +	  if(!slot)
> +	    {
> +	      slot = split_map->insert (usage->item->cls);
> +	      *slot = BITMAP_ALLOC (&bmstack);
> +	    }
> +
> +	  bitmap b = *slot;
> +
> +#if ENABLE_CHECKING
> +	  gcc_checking_assert (usage->item->cls);
> +	  gcc_checking_assert (usage->item->index_in_class <
> +			       usage->item->cls->members.length ());
> +#endif
> +
> +	  bitmap_set_bit (b, usage->item->index_in_class);
> +	}
> +    }
> +
> +  traverse_split_pair pair;
> +  pair.optimizer = this;
> +  pair.cls = cls;
> +
> +  splitter_class_removed = false;
> +  split_map->traverse (&sem_item_optimizer::traverse_congruence_split, &pair);
> +
> +  /* Bitmap clean-up.  */
> +  split_map->traverse (&sem_item_optimizer::release_split_map, NULL);
> +}
> +
> +/* Every usage of a congruence class CLS is a candidate that can split the
> +   collection of classes. Bitmap stack BMSTACK is used for bitmap
> +   allocation.  */
> +
> +void
> +sem_item_optimizer::do_congruence_step (congruence_class *cls)
> +{
> +  bitmap_iterator bi;
> +  unsigned int i;
> +
> +  bitmap usage = BITMAP_ALLOC (&bmstack);
> +
> +  for (unsigned int i = 0; i < cls->members.length (); i++)
> +    bitmap_ior_into (usage, cls->members[i]->usage_index_bitmap);
> +
> +  EXECUTE_IF_SET_IN_BITMAP (usage, 0, i, bi)
> +  {
> +    if (dump_file && (dump_flags & TDF_DETAILS))
> +      fprintf (dump_file, "  processing congruece step for class: %u, index: %u\n",
> +	       cls->id, i);
> +
> +    do_congruence_step_for_index (cls, i);
> +
> +    if (splitter_class_removed)
> +      break;
> +  }
> +
> +  BITMAP_FREE (usage);
> +}
> +
> +/* Adds a newly created congruence class CLS to worklist.  */
> +
> +void
> +sem_item_optimizer::worklist_push (congruence_class *cls)
> +{
> +  congruence_class **slot = worklist.find_slot (cls, INSERT);
> +
> +  if (*slot)
> +    return;
> +
> +  *slot = cls;
> +}
> +
> +/* Pops a class from worklist. */
> +
> +congruence_class *
> +sem_item_optimizer::worklist_pop (void)
> +{
> +  gcc_assert (worklist.elements ());
> +
> +  congruence_class *cls = &(*worklist.begin ());
> +  worklist.remove_elt (cls);
> +
> +  return cls;
> +}
> +
> +/* Returns true if a congruence class CLS is presented in worklist.  */
> +
> +bool
> +sem_item_optimizer::worklist_contains (const congruence_class *cls)
> +{
> +  return worklist.find (cls);
> +}
> +
> +/* Removes given congruence class CLS from worklist.  */
> +
> +void
> +sem_item_optimizer::worklist_remove (const congruence_class *cls)
> +{
> +  worklist.remove_elt (cls);
> +}
> +
> +/* Iterative congruence reduction function.  */
> +
> +void
> +sem_item_optimizer::process_cong_reduction (void)
> +{
> +  for (hash_table<congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    for (unsigned i = 0; i < (*it).classes.length (); i++)
> +      if ((*it).classes[i]->is_class_used ())
> +	worklist_push ((*it).classes[i]);
> +
> +  if (dump_file)
> +    fprintf (dump_file, "Worklist has been filled with: %lu\n",
> +	     worklist.elements ());
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    fprintf (dump_file, "Congruence class reduction\n");
> +
> +  while (worklist.elements ())
> +    {
> +      congruence_class *cls = worklist_pop ();
> +      do_congruence_step (cls);
> +    }
> +}
> +
> +/* Debug function prints all informations about congruence classes.  */
> +
> +void
> +sem_item_optimizer::dump_cong_classes (void)
> +{
> +  if (!dump_file)
> +    return;
> +
> +  fprintf (dump_file,
> +	   "Congruence classes: %u (unique hash values: %lu), with total: %u items\n",
> +	   classes_count, classes.elements(), items.length ());
> +
> +  /* Histogram calculation.  */
> +  unsigned int max_index = 0;
> +  unsigned int* histogram = XCNEWVEC (unsigned int, items.length ());
> +
> +  for (hash_table<congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +
> +    for (unsigned i = 0; i < (*it).classes.length (); i++)
> +      {
> +	unsigned int c = (*it).classes[i]->members.length ();
> +	histogram[c]++;
> +
> +	if (c > max_index)
> +	  max_index = c;
> +      }
> +
> +  fprintf (dump_file,
> +	   "Class size histogram [num of members]: number of classe number of classess\n");
> +
> +  for (unsigned int i = 0; i <= max_index; i++)
> +    if (histogram[i])
> +      fprintf (dump_file, "[%u]: %u classes\n", i, histogram[i]);
> +
> +  fprintf (dump_file, "\n\n");
> +
> +
> +  if (dump_flags & TDF_DETAILS)
> +    for (hash_table<congruence_class_group_hash>::iterator it = classes.begin ();
> +	 it != classes.end (); ++it)
> +      {
> +	fprintf (dump_file, "  group: with %u classes:\n", (*it).classes.length ());
> +
> +	for (unsigned i = 0; i < (*it).classes.length (); i++)
> +	  {
> +	    (*it).classes[i]->dump (dump_file, 4);
> +
> +	    if(i < (*it).classes.length () - 1)
> +	      fprintf (dump_file, " ");
> +	  }
> +      }
> +
> +  free (histogram);
> +}
> +
> +/* After reduction is done, we can declare all items in a group
> +   to be equal. PREV_CLASS_COUNT is start number of classes
> +   before reduction.  */
> +
> +void
> +sem_item_optimizer::merge_classes (unsigned int prev_class_count)
> +{
> +  unsigned int item_count = items.length ();
> +  unsigned int class_count = classes_count;
> +  unsigned int equal_items = item_count - class_count;
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "\nItem count: %u\n", item_count);
> +      fprintf (dump_file, "Congruent classes before: %u, after: %u\n",
> +	       prev_class_count, class_count);
> +      fprintf (dump_file, "Average class size before: %.2f, after: %.2f\n",
> +	       1.0f * item_count / prev_class_count,
> +	       1.0f * item_count / class_count);
> +      fprintf (dump_file, "Equal symbols: %u\n", equal_items);
> +      fprintf (dump_file, "Fraction of visited symbols: %.2f%%\n\n",
> +	       100.0f * equal_items / item_count);
> +    }
> +
> +  for (hash_table<congruence_class_group_hash>::iterator it = classes.begin ();
> +       it != classes.end (); ++it)
> +    for (unsigned int i = 0; i < (*it).classes.length (); i++)
> +      {
> +	congruence_class *c = (*it).classes[i];
> +
> +	if (c->members.length () == 1)
> +	  continue;
> +
> +	gcc_assert (c->members.length ());
> +
> +	sem_item *source = c->members[0];
> +
> +	for (unsigned int j = 1; j < c->members.length (); j++)
> +	  {
> +	    sem_item *alias = c->members[j];
> +
> +	    if (dump_file)
> +	      {
> +		fprintf (dump_file, "Semantic equality hit:%s->%s\n",
> +			 source->name (), alias->name ());
> +		fprintf (dump_file, "Assembler symbol names:%s->%s\n",
> +			 source->asm_name (), alias->asm_name ());
> +	      }
> +
> +	    if (dump_file && (dump_flags & TDF_DETAILS))
> +	      {
> +		source->dump_to_file (dump_file);
> +		alias->dump_to_file (dump_file);
> +	      }
> +
> +	    source->merge (alias);
> +	  }
> +      }
> +}
> +
> +/* Dump function prints all class members to a FILE with an INDENT.  */
> +
> +void
> +congruence_class::dump (FILE *file, unsigned int indent) const
> +{
> +  FPRINTF_SPACES (file, indent, "class with id: %u, hash: %u, items: %u\n",
> +		  id, members[0]->get_hash (), members.length ());
> +
> +  FPUTS_SPACES (file, indent + 2, "");
> +  for (unsigned i = 0; i < members.length (); i++)
> +    fprintf (file, "%s(%p/%u)", members[i]->asm_name (), (void *) members[i]->decl,
> +	     members[i]->node->order);
> +
> +  fprintf (file, "\n");
> +}
> +
> +/* Returns true if there's a member that is used from another group.  */
> +
> +bool
> +congruence_class::is_class_used (void)
> +{
> +  for (unsigned int i = 0; i < members.length (); i++)
> +    if (members[i]->usages.length ())
> +      return true;
> +
> +  return false;
> +}
> +
> +/* Initialization and computation of symtab node hash, there data
> +   are propagated later on.  */
> +
> +static sem_item_optimizer optimizer;
> +
> +/* Generate pass summary for IPA ICF pass.  */
> +
> +static void
> +ipa_icf_generate_summary (void)
> +{
> +  optimizer.parse_funcs_and_vars ();
> +}
> +
> +/* Write pass summary for IPA ICF pass.  */
> +
> +static void
> +ipa_icf_write_summary (void)
> +{
> +  optimizer.write_summary ();
> +}
> +
> +/* Read pass summary for IPA ICF pass.  */
> +
> +static void
> +ipa_icf_read_summary (void)
> +{
> +  optimizer.read_summary ();
> +  optimizer.register_hooks ();
> +}
> +
> +/* Semantic equality exection function.  */
> +
> +static unsigned int
> +ipa_icf_driver (void)
> +{
> +  optimizer.execute ();
> +  optimizer.unregister_hooks ();
> +
> +  return 0;
> +}
> +
> +const pass_data pass_data_ipa_icf =
> +{
> +  IPA_PASS,		    /* type */
> +  "icf",		    /* name */
> +  OPTGROUP_IPA,             /* optinfo_flags */
> +  true,                     /* has_execute */
> +  TV_IPA_ICF,		    /* tv_id */
> +  0,                        /* properties_required */
> +  0,                        /* properties_provided */
> +  0,                        /* properties_destroyed */
> +  0,                        /* todo_flags_start */
> +  0,                        /* todo_flags_finish */
> +};
> +
> +class pass_ipa_icf : public ipa_opt_pass_d
> +{
> +public:
> +  pass_ipa_icf (gcc::context *ctxt)
> +    : ipa_opt_pass_d (pass_data_ipa_icf, ctxt,
> +		      ipa_icf_generate_summary, /* generate_summary */
> +		      ipa_icf_write_summary, /* write_summary */
> +		      ipa_icf_read_summary, /* read_summary */
> +		      NULL, /*
> +		      write_optimization_summary */
> +		      NULL, /*
> +		      read_optimization_summary */
> +		      NULL, /* stmt_fixup */
> +		      0, /* function_transform_todo_flags_start */
> +		      NULL, /* function_transform */
> +		      NULL) /* variable_transform */
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *)
> +  {
> +    return flag_ipa_icf;
> +  }
> +
> +  virtual unsigned int execute (function *)
> +  {
> +    return ipa_icf_driver();
> +  }
> +}; // class pass_ipa_icf
> +
> +} // anon namespace
> +
> +ipa_opt_pass_d *
> +make_pass_ipa_icf (gcc::context *ctxt)
> +{
> +  return new pass_ipa_icf (ctxt);
> +}
> diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
> new file mode 100644
> index 0000000..dc5e971
> --- /dev/null
> +++ b/gcc/ipa-icf.h
> @@ -0,0 +1,687 @@
> +/* Interprocedural semantic function equality pass
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +
> +   Contributed by Jan Hubicka <hubicka@ucw.cz> and Martin Liska <mliska@suse.cz>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* Prints string STRING to a FILE with a given number of SPACE_COUNT.  */
> +#define FPUTS_SPACES(file, space_count, string) \
> +  do \
> +  { \
> +    fprintf (file, "%*s" string, space_count, " "); \
> +  } \
> +  while (false);
> +
> +/* fprintf function wrapper that transforms given FORMAT to follow given
> +   number for SPACE_COUNT and call fprintf for a FILE.  */
> +#define FPRINTF_SPACES(file, space_count, format, ...) \
> +  fprintf (file, "%*s" format, space_count, " ", ##__VA_ARGS__);
> +
> +/* Prints a MESSAGE to dump_file if exists.  */
> +#define SE_DUMP_MESSAGE(message) \
> +  do \
> +  { \
> +    if (dump_file && (dump_flags & TDF_DETAILS)) \
> +      fprintf (dump_file, "  debug message: %s (%s:%u)\n", message, __func__, __LINE__); \
> +  } \
> +  while (false);
> +
> +/* Logs a MESSAGE to dump_file if exists and returns false.  */
> +#define SE_EXIT_FALSE_WITH_MSG(message) \
> +  do \
> +  { \
> +    if (dump_file && (dump_flags & TDF_DETAILS)) \
> +      fprintf (dump_file, "  false returned: '%s' (%s:%u)\n", message, __func__, __LINE__); \
> +    return false; \
> +  } \
> +  while (false);
> +
> +/* Return false and log that false value is returned.  */
> +#define SE_EXIT_FALSE() \
> +  SE_EXIT_FALSE_WITH_MSG("")
> +
> +/* Logs return value if RESULT is false.  */
> +#define SE_EXIT_DEBUG(result) \
> +  do \
> +  { \
> +    if (!(result) && dump_file && (dump_flags & TDF_DETAILS)) \
> +      fprintf (dump_file, "  false returned (%s:%u)\n", __func__, __LINE__); \
> +    return result; \
> +  } \
> +  while (false);
> +
> +/* Verbose logging function logging statements S1 and S2 of a CODE.  */
> +#define SE_DIFF_STATEMENT(s1, s2, code) \
> +  do \
> +  { \
> +    if (dump_file && (dump_flags & TDF_DETAILS)) \
> +      { \
> +        fprintf (dump_file, "  different statement for code: %s:\n", code); \
> +        print_gimple_stmt (dump_file, s1, 3, TDF_DETAILS); \
> +        print_gimple_stmt (dump_file, s2, 3, TDF_DETAILS); \
> +      } \
> +    return false; \
> +  } \
> +  while (false);
> +
> +namespace {
> +
> +/* Forward declaration for sem_func class.  */
> +class sem_item;
> +
> +/* A class aggregating all connections and semantic equivalents
> +   for a given pair of semantic function candidates.  */
> +class func_checker
> +{
> +public:
> +  func_checker ();
> +
> +  /* Initializes internal structures according to given number of
> +     source and target SSA names. The number of source names is SSA_SOURCE,
> +     respectively SSA_TARGET.  */
> +  void initialize (unsigned ssa_source, unsigned sss_target);
> +
> +  /* Memory release routine.  */
> +  void release (void);
> +
> +  /* Verifies that trees T1 and T2 do correspond.  */
> +  bool compare_ssa_name (tree t1, tree t2);
> +
> +  /* Verification function for edges E1 and E2.  */
> +  bool compare_edge (edge e1, edge e2);
> +
> +  /* Verification function for declaration trees T1 and T2 that
> +     come from functions FUNC1 and FUNC2.  */
> +  bool compare_decl (tree t1, tree t2, tree func1, tree func2);
> +
> +private:
> +  /* Vector mapping source SSA names to target ones.  */
> +  vec <int> source_ssa_names;
> +
> +  /* Vector mapping target SSA names to source ones.  */
> +  vec <int> target_ssa_names;
> +
> +  /* Source to target edge map.  */
> +  pointer_map <edge> *edge_map;
> +
> +  /* Source to target declaration map.  */
> +  pointer_map <tree> *decl_map;
> +
> +  /* Flag that indicates if the checker is initialize.  */
> +  bool initialized;
> +};
> +
> +/* Congruence class encompasses a collection of either functions or
> +   read-only variables. These items are considered to be equivalent
> +   if not proved the oposite.  */
> +class congruence_class
> +{
> +public:
> +  /* Congruence class constructor for a new class with _ID.  */
> +  congruence_class (unsigned int _id);
> +
> +  /* Dump function prints all class members to a FILE with an INDENT.  */
> +  void dump (FILE *file, unsigned int indent = 0) const;
> +
> +  /* Returns true if there's a member that is used from another group.  */
> +  bool is_class_used (void);
> +
> +  /* Vector of all group members.  */
> +  vec <sem_item *> members;
> +
> +  /* Global unique class identifier.  */
> +  unsigned int id;
> +};
> +
> +/* Semantic item type enum.  */
> +enum sem_item_type
> +{
> +  FUNC,
> +  VAR
> +};
> +
> +/* Semantic item usage pair.  */
> +class sem_usage_pair
> +{
> +public:
> +  /* Constructor for key value pair, where _ITEM is key and _INDEX is a target.  */
> +  sem_usage_pair (sem_item *_item, unsigned int _index);
> +
> +  /* Target semantic item where an item is used.  */
> +  sem_item *item;
> +
> +  /* Index of usage of such an item.  */
> +  unsigned int index;
> +};
> +
> +/* Basic block struct for sematic equality pass.  */
> +typedef struct sem_bb
> +{
> +  /* Basic block the structure belongs to.  */
> +  basic_block bb;
> +
> +  /* Number of non-debug statements in the basic block.  */
> +  unsigned nondbg_stmt_count;
> +
> +  /* Number of edges connected to the block.  */
> +  unsigned edge_count;
> +} sem_bb_t;
> +
> +/* Semantic item is a base class that encapsulates all shared functionality
> +   for both semantic function and variable items.  */
> +class sem_item
> +{
> +public:
> +  /* Semantic item constructor for a node of _TYPE, where STACK is used
> +     for bitmap memory allocation.  */
> +  sem_item (enum sem_item_type _type, bitmap_obstack *stack);
> +
> +  /* Semantic item constructor for a node of _TYPE, where STACK is used
> +     for bitmap memory allocation. The item is based on symtab node _NODE
> +     with computed _HASH.  */
> +  sem_item (enum sem_item_type _type, struct symtab_node *_node, hashval_t _hash,
> +	    bitmap_obstack *stack);
> +
> +  virtual ~sem_item ();
> +
> +  /* Dump function for debugging purpose.  */
> +  DEBUG_FUNCTION void dump (void);
> +
> +  /* Initialize semantic item by info reachable during LTO WPA phase.  */
> +  virtual void init_wpa (void) = 0;
> +
> +  /* Semantic item initialization function.  */
> +  virtual void init (void) = 0;
> +
> +  /* Gets symbol name of the item.  */
> +  virtual const char *name (void) = 0;
> +
> +  /* Gets assembler name of the item.  */
> +  virtual const char *asm_name (void) = 0;
> +
> +  /* Initializes references to other semantic functions/variables.  */
> +  virtual void init_refs () = 0;
> +
> +  /* Fast equality function based on knowledge known in WPA.  */
> +  virtual bool equals_wpa (sem_item *item) = 0;
> +
> +  /* Returns true if the item equals to ITEM given as arguemnt.  */
> +  virtual bool equals (sem_item *item) = 0;
> +
> +  /* References independent hash function.  */
> +  virtual hashval_t get_hash (void) = 0;
> +
> +  /* Merges instance with an ALIAS_ITEM, where alias, thunk or redirection can
> +     be applied.  */
> +  virtual bool merge (sem_item *alias_item) = 0;
> +
> +  /* Dump symbol to FILE.  */
> +  virtual void dump_to_file (FILE *file) = 0;
> +
> +  /* Compare two types if are same aliases in case of strict aliasing
> +     is enabled.  */
> +  static bool compare_for_aliasing (tree t1, tree t2);
> +
> +  /* Item type.  */
> +  enum sem_item_type type;
> +
> +  /* Global unique function index.  */
> +  unsigned int index;
> +
> +  /* Symtab node.  */
> +  struct symtab_node *node;
> +
> +  /* Declaration tree node.  */
> +  tree decl;
> +
> +  /* Semantic references used that generate congruence groups.  */
> +  vec <sem_item *> refs;
> +
> +  /* Pointer to a congruence class the item belongs to.  */
> +  congruence_class *cls;
> +
> +  /* Index of the item in a class belonging to.  */
> +  unsigned int index_in_class;
> +
> +  /* List of semantic items where the instance is used.  */
> +  vec <sem_usage_pair *> usages;
> +
> +  /* A bitmap with indices of all classes referencing this item.  */
> +  bitmap usage_index_bitmap;
> +
> +  /* List of tree references (either FUNC_DECL or VAR_DECL).  */
> +  vec <tree> tree_refs;
> +
> +  /* A set with tree references (either FUNC_DECL or VAR_DECL).  */
> +  pointer_set_t *tree_refs_set;
> +
> +protected:
> +  /* Cached, once calculated hash for the item.  */
> +  hashval_t hash;
> +
> +private:
> +  /* Initialize internal data structures. Bitmap STACK is used for
> +     bitmap memory allocation process.  */
> +  void setup (bitmap_obstack *stack);
> +}; // class sem_item
> +
> +class sem_function: public sem_item
> +{
> +public:
> +  /* Semantic function constructor that uses STACK as bitmap memory stack.  */
> +  sem_function (bitmap_obstack *stack);
> +
> +  /*  Constructor based on callgraph node _NODE with computed hash _HASH.
> +      Bitmap STACK is used for memory allocation.  */
> +  sem_function (cgraph_node *_node, hashval_t _hash, bitmap_obstack *stack);
> +
> +  ~sem_function ();
> +
> +  virtual void init_wpa (void);
> +  virtual void init (void);
> +  virtual const char *name (void);
> +  virtual const char *asm_name (void);
> +  virtual hashval_t get_hash (void);
> +  virtual bool equals_wpa (sem_item *item);
> +  virtual bool equals (sem_item *item);
> +  virtual void init_refs ();
> +  virtual bool merge (sem_item *alias_item);
> +  virtual void dump_to_file (FILE *file);
> +
> +  /* Parses function arguments and result type.  */
> +  void parse_tree_args (void);
> +
> +  /* Returns cgraph_node.  */
> +  struct cgraph_node *get_node (void);
> +
> +  /* For a given call graph NODE, the function constructs new
> +     semantic function item.  */
> +  static sem_function *parse (struct cgraph_node *node, bitmap_obstack *stack);
> +
> +  /* Exception handling region tree.  */
> +  eh_region region_tree;
> +
> +  /* Result type tree node.  */
> +  tree result_type;
> +
> +  /* Array of argument tree types.  */
> +  vec <tree> arg_types;
> +
> +  /* Number of function arguments.  */
> +  unsigned int arg_count;
> +
> +  /* Basic block count.  */
> +  unsigned int bb_count;
> +
> +  /* Total amount of edges in the function.  */
> +  unsigned int edge_count;
> +
> +  /* Array of sizes of all basic blocks.  */
> +  unsigned int *bb_sizes;
> +
> +  /* Control flow graph checksum.  */
> +  hashval_t cfg_checksum;
> +
> +  /* GIMPLE codes hash value.  */
> +  hashval_t gcode_hash;
> +
> +  /* Total number of SSA names used in the function.  */
> +  unsigned ssa_names_size;
> +
> +  /* Array of structures for all basic blocks.  */
> +  sem_bb_t **bb_sorted;
> +
> +private:
> +  /* Calculates hash value based on a BASIC_BLOCK.  */
> +  hashval_t get_bb_hash (const sem_bb_t *basic_block);
> +
> +  /* Basic block equivalence comparison function that returns true if
> +     basic blocks BB1 and BB2 (from functions FUNC1 and FUNC2) correspond.  */
> +  bool compare_bb (sem_bb_t *bb1, sem_bb_t *bb2, tree func1, tree func2);
> +
> +  /* For given basic blocks BB1 and BB2 (from functions FUNC1 and FUNC),
> +     true value is returned if phi nodes are sematically
> +     equivalent in these blocks .  */
> +  bool compare_phi_node (basic_block bb1, basic_block bb2, tree func1,
> +			 tree func2);
> +
> +  /* For given basic blocks BB1 and BB2 (from functions FUNC1 and FUNC),
> +     true value is returned if exception handling regions are equivalent
> +     in these blocks.  */
> +  bool compare_eh_region (eh_region r1, eh_region r2, tree func1, tree func2);
> +
> +  /* Iterates GSI statement iterator to the next non-debug statement.  */
> +  void gsi_next_nondebug_stmt (gimple_stmt_iterator &gsi);
> +
> +  /* Iterates GSI statement iterator to the next non-virtual statement.  */
> +  void gsi_next_nonvirtual_phi (gimple_stmt_iterator &it);
> +
> +  /* Verifies that trees T1 and T2 do correspond.  */
> +  bool compare_function_decl (tree t1, tree t2);
> +
> +  /* Verifies that trees T1 and T2 do correspond.  */
> +  bool compare_variable_decl (tree t1, tree t2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     call statements are semantically equivalent.  */
> +  bool compare_gimple_call (gimple s1, gimple s2,
> +			    tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     assignment statements are semantically equivalent.  */
> +  bool compare_gimple_assign (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     condition statements are semantically equivalent.  */
> +  bool compare_gimple_cond (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     label statements are semantically equivalent.  */
> +  bool compare_gimple_label (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     switch statements are semantically equivalent.  */
> +  bool compare_gimple_switch (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     return statements are semantically equivalent.  */
> +  bool compare_gimple_return (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     goto statements are semantically equivalent.  */
> +  bool compare_gimple_goto (gimple s1, gimple s2, tree func1, tree func2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 (from function FUNC1, resp. FUNC2) that
> +     resx statements are semantically equivalent.  */
> +  bool compare_gimple_resx (gimple s1, gimple s2);
> +
> +  /* Verifies for given GIMPLEs S1 and S2 that ASM statements are equivalent.
> +     For the beginning, the pass only supports equality for
> +     '__asm__ __volatile__ ("", "", "", "memory")'.  */
> +  bool compare_gimple_asm (gimple s1, gimple s2);
> +
> +  /* Verifies that tree labels T1 and T2 correspond in FUNC1 and FUNC2.  */
> +  bool compare_tree_ssa_label (tree t1, tree t2, tree func1, tree func2);
> +
> +  /* Function compares two operands T1 and T2 and returns true if these
> +     two trees from FUNC1 (respectively FUNC2) are semantically equivalent.  */
> +  bool compare_operand (tree t1, tree t2, tree func1, tree func2);
> +
> +  /* If T1 and T2 are SSA names, dictionary comparison is processed. Otherwise,
> +     declaration comparasion is executed.  */
> +  bool compare_ssa_name (tree t1, tree t2, tree func1, tree func2);
> +
> +  /* Basic blocks dictionary BB_DICT returns true if SOURCE index BB
> +     corresponds to TARGET.  */
> +  bool bb_dict_test (int* bb_dict, int source, int target);
> +
> +  /* Iterates all tree types in T1 and T2 and returns true if all types
> +     are compatible.  */
> +  bool compare_type_list (tree t1, tree t2);
> +
> +  /* Processes function equality comparison.  */
> +  bool equals_private (sem_item *item);
> +
> +  /* Initializes references to another sem_item for gimple STMT of type assign.  */
> +  void init_refs_for_assign (gimple stmt);
> +
> +  /* Initializes references to another sem_item for tree T.  */
> +  void init_refs_for_tree (tree t);
> +
> +  /* Returns true if tree T can be compared as a handled component.  */
> +  static bool icf_handled_component_p (tree t);
> +
> +  /* Function checker stores binding between functions.   */
> +  func_checker checker;
> +
> +  /* COMPARED_FUNC is a function that we compare to.  */
> +  sem_function *compared_func;
> +}; // class sem_function
> +
> +class sem_variable: public sem_item
> +{
> +public:
> +  /* Semantic variable constructor that uses STACK as bitmap memory stack.  */
> +  sem_variable (bitmap_obstack *stack);
> +
> +  /*  Constructor based on callgraph node _NODE with computed hash _HASH.
> +      Bitmap STACK is used for memory allocation.  */
> +
> +  sem_variable (varpool_node *_node, hashval_t _hash, bitmap_obstack *stack);
> +
> +  virtual void init_wpa (void);
> +  virtual void init (void);
> +  virtual const char *name (void);
> +  virtual const char *asm_name (void);
> +  virtual void init_refs ();
> +  virtual hashval_t get_hash (void);
> +  virtual bool merge (sem_item *alias_item);
> +  virtual void dump_to_file (FILE *file);
> +  virtual bool equals_wpa (sem_item *item);
> +  virtual bool equals (sem_item *item);
> +
> +  /* Returns varpool_node.  */
> +  struct varpool_node *get_node (void);
> +
> +  /* Parser function that visits a varpool NODE.  */
> +  static sem_variable *parse (struct varpool_node *node, bitmap_obstack *stack);
> +
> +  /* Variable constructor.  */
> +  tree ctor;
> +
> +private:
> +  /* Iterates though a constructor and identifies tree references
> +     we are interested in semantic function equality.  */
> +  void parse_tree_refs (tree t);
> +
> +  /* Compares trees T1 and T2 for semantic equality.  */
> +  static bool equals (tree t1, tree t2);
> +
> +}; // class sem_variable
> +
> +class sem_item_optimizer;
> +
> +/* Congruence class set structure.  */
> +struct congruence_class_var_hash: typed_noop_remove <congruence_class>
> +{
> +  typedef congruence_class value_type;
> +  typedef congruence_class compare_type;
> +  static inline hashval_t hash (const value_type *);
> +  static inline int equal (const value_type *, const compare_type *);
> +};
> +
> +typedef struct congruence_class_group
> +{
> +  hashval_t hash;
> +  vec <congruence_class *> classes;
> +} congruence_class_group_t;
> +
> +/* Congruence class set structure.  */
> +struct congruence_class_group_hash: typed_noop_remove <congruence_class_group_t>
> +{
> +  typedef congruence_class_group value_type;
> +  typedef congruence_class_group compare_type;
> +  static inline hashval_t hash (const value_type *);
> +  static inline int equal (const value_type *, const compare_type *);
> +};
> +
> +struct traverse_split_pair
> +{
> +  sem_item_optimizer *optimizer;
> +  class congruence_class *cls;
> +};
> +
> +/* Semantic item optimizer includes all top-level logic
> +   related to semantic equality comparison.  */
> +class sem_item_optimizer
> +{
> +public:
> +  sem_item_optimizer ();
> +  ~sem_item_optimizer ();
> +
> +  /* Function responsible for visiting all potential functions and
> +     read-only variables that can be merged.  */
> +  void parse_funcs_and_vars (void);
> +
> +  /* Optimizer entry point.  */
> +  void execute (void);
> +
> +  /* Dump function. */
> +  void dump (void);
> +
> +  /* Verify congruence classes if checking is enabled.  */
> +  void verify_classes (void);
> +
> +  /* Write IPA ICF summary for symbols.  */
> +  void write_summary (void);
> +
> +  /* Read IPA IPA ICF summary for symbols.  */
> +  void read_summary (void);
> +
> +  /* Callgraph removal hook called for a NODE with a custom DATA.  */
> +  static void cgraph_removal_hook (struct cgraph_node *node, void *data);
> +
> +  /* Varpool removal hook called for a NODE with a custom DATA.  */
> +  static void varpool_removal_hook (struct varpool_node *node, void *data);
> +
> +  /* Worklist of congruence classes that can potentially
> +     refine classes of congruence.  */
> +  hash_table <congruence_class_var_hash> worklist;
> +
> +  /* Remove symtab NODE triggered by symtab removal hooks.  */
> +  void remove_symtab_node (struct symtab_node *node);
> +
> +  /* Register callgraph and varpool hooks.  */
> +  void register_hooks (void);
> +
> +  /* Unregister callgraph and varpool hooks.  */
> +  void unregister_hooks (void);
> +
> +  /* Adds a CLS to hashtable associated by hash value.  */
> +  void add_class (congruence_class *cls);
> +
> +  /* Gets a congruence class group based on given HASH value.  */
> +  congruence_class_group_t *get_group_by_hash (hashval_t hash);
> +
> +private:
> +
> +  /* Congruence classes are built by hash value.  */
> +  void build_hash_based_classes (void);
> +
> +  /* Semantic items in classes having more than one element and initialized.
> +     In case of WPA, we load function body.  */
> +  void parse_nonsingleton_classes (void);
> +
> +  /* Equality function for semantic items is used to subdivide existing
> +     classes. If IN_WPA, fast equality function is invoked.  */
> +  void subdivide_classes_by_equality (bool in_wpa = false);
> +
> +  /* Debug function prints all informations about congruence classes.  */
> +  void dump_cong_classes (void);
> +
> +  /* Iterative congruence reduction function.  */
> +  void process_cong_reduction (void);
> +
> +  /* After reduction is done, we can declare all items in a group
> +     to be equal. PREV_CLASS_COUNT is start number of classes
> +     before reduction.  */
> +  void merge_classes (unsigned int prev_class_count);
> +
> +  /* Adds a newly created congruence class CLS to worklist.  */
> +  void worklist_push (congruence_class *cls);
> +
> +  /* Pops a class from worklist. */
> +  congruence_class *worklist_pop ();
> +
> +  /* Returns true if a congruence class CLS is presented in worklist.  */
> +  bool worklist_contains (const congruence_class *cls);
> +
> +  /* Removes given congruence class CLS from worklist.  */
> +  void worklist_remove (const congruence_class *cls);
> +
> +  /* Every usage of a congruence class CLS is a candidate that can split the
> +     collection of classes. Bitmap stack BMSTACK is used for bitmap
> +     allocation.  */
> +  void do_congruence_step (congruence_class *cls);
> +
> +  /* Tests if a class CLS used as INDEXth splits any congruence classes.
> +     Bitmap stack BMSTACK is used for bitmap allocation.  */
> +  void do_congruence_step_for_index (congruence_class *cls, unsigned int index);
> +
> +  /* Makes pairing between a congruence class CLS and semantic ITEM.  */
> +  static void add_item_to_class (congruence_class *cls, sem_item *item);
> +
> +  /* Disposes split map traverse function. CLS_PTR is pointer to congruence
> +     class, BSLOT is bitmap slot we want to release. DATA is mandatory,
> +     but unused argument.  */
> +  static bool release_split_map (const void *cls_ptr, bitmap *bslot, void *data);
> +
> +  /* Process split operation for a class given as pointer CLS_PTR,
> +     where bitmap B splits congruence class members. DATA is used
> +     as argument of split pair.  */
> +  static bool traverse_congruence_split (const void *cls_ptr, bitmap *b,
> +					 void *data);
> +
> +  /* Reads a section from LTO stream file FILE_DATA. Input block for DATA
> +     contains LEN bytes.  */
> +  void read_section (struct lto_file_decl_data *file_data, const char *data,
> +		     size_t len);
> +
> +  /* Removes all callgraph and varpool nodes that are marked by symtab
> +     as deleted.  */
> +  void filter_removed_items (void);
> +
> +  /* Vector of semantic items.  */
> +  vec <sem_item *> items;
> +
> +  /* A set containing all items removed by hooks.  */
> +  pointer_set_t *removed_items_set;
> +
> +  /* Hashtable of congruence classes */
> +  hash_table <congruence_class_group_hash> classes;
> +
> +  /* Count of congruence classes.  */
> +  unsigned int classes_count;
> +
> +  /* Map data structure maps trees to semantic items.  */
> +  pointer_map <sem_item *> decl_map;
> +
> +  /* Map data structure maps symtab nodes to semantic items.  */
> +  pointer_map <sem_item *> symtab_node_map;
> +
> +  /* For all congruence classes, we indicate partial mapping
> +     during reduction.  */
> +  pointer_map <bitmap> *split_map;
> +
> +  /* Set to true if a splitter class is removed.  */
> +  bool splitter_class_removed;
> +
> +  /* Global unique class id counter.  */
> +  static unsigned int class_id;
> +
> +  /* Callgraph node removal hook holder.  */
> +  struct cgraph_node_hook_list *cgraph_node_hooks;
> +
> +  /* Varpool node removal hook holder.  */
> +  struct varpool_node_hook_list *varpool_node_hooks;
> +
> +  /* Bitmap stack.  */
> +  bitmap_obstack bmstack;
> +}; // class sem_item_optimizer
> +
> +}
> diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
> index d887763..f9587cf 100644
> --- a/gcc/lto-section-in.c
> +++ b/gcc/lto-section-in.c
> @@ -60,7 +60,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
>    "opts",
>    "cgraphopt",
>    "inline",
> -  "ipcp_trans"
> +  "ipcp_trans",
> +  "icf"
>  };
>  
>  
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index 521d78d..3207d08 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -247,6 +247,7 @@ enum lto_section_type
>    LTO_section_cgraph_opt_sum,
>    LTO_section_inline_summary,
>    LTO_section_ipcp_transform,
> +  LTO_section_ipa_icf,
>    LTO_N_SECTION_TYPES		/* Must be last.  */
>  };
>  
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 2b1280a..b5a58a5 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -496,6 +496,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_2_PLUS, OPT_fvect_cost_model_, NULL, VECT_COST_MODEL_CHEAP },
>      { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
>  
> diff --git a/gcc/passes.def b/gcc/passes.def
> index f9e0b2a..e629a1d 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -105,6 +105,7 @@ along with GCC; see the file COPYING3.  If not see
>    NEXT_PASS (pass_ipa_whole_program_visibility);
>    NEXT_PASS (pass_ipa_profile);
>    NEXT_PASS (pass_ipa_devirt);
> +  NEXT_PASS (pass_ipa_icf);
>    NEXT_PASS (pass_ipa_cp);
>    NEXT_PASS (pass_ipa_cdtor_merge);
>    NEXT_PASS (pass_ipa_inline);
> diff --git a/gcc/timevar.def b/gcc/timevar.def
> index cbb64d5..c1d09eb 100644
> --- a/gcc/timevar.def
> +++ b/gcc/timevar.def
> @@ -89,6 +89,7 @@ DEFTIMEVAR (TV_WHOPR_LTRANS          , "whopr ltrans")
>  DEFTIMEVAR (TV_IPA_REFERENCE         , "ipa reference")
>  DEFTIMEVAR (TV_IPA_PROFILE           , "ipa profile")
>  DEFTIMEVAR (TV_IPA_PURE_CONST        , "ipa pure const")
> +DEFTIMEVAR (TV_IPA_ICF		     , "ipa icf")
>  DEFTIMEVAR (TV_IPA_PTA               , "ipa points-to")
>  DEFTIMEVAR (TV_IPA_SRA               , "ipa SRA")
>  DEFTIMEVAR (TV_IPA_FREE_LANG_DATA    , "ipa free lang data")
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 3888bb6..d55d7b8 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -464,6 +464,7 @@ extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt);
>  extern simple_ipa_opt_pass *make_pass_ipa_free_inline_summary (gcc::context
>  							       *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
> +extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
> -- 
> 1.8.4.5
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20140620/fb5147a7/attachment.sig>


More information about the Gcc-patches mailing list