This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Fwd: [PATCH] Profile feedback: Use Minimum Cost Flow algorithm to smooth inconsistent edge counts


---------- Forwarded message ----------
From: Vinodha Ramasamy <vinodha@google.com>
Date: Thu, Jul 31, 2008 at 4:35 PM
Subject: Re: [PATCH] Profile feedback: Use Minimum Cost Flow algorithm
to smooth inconsistent edge counts
To: Diego Novillo <dnovillo@google.com>
Cc: Seongbae Park <spark@google.com>, Paul Yuan <yingbo.com@gmail.com>


 Hi Diego,

     I've incorporated your review comments. This patch also fixes the
following bugs introduced by the new code:
1. cancel_negative_cycle() in mcf.c - Need to check for overflow.
Added check d[pfedge->src] != CAP_INFINITY.  Also restrict the number
of iterations of cancel_negative_cycle to MAX_ITER to control
compile-time.
2. read_profile_edge_counts() in profile.c - exec_counts_pos should be
initialized outside the loop.

Please find the patch attached to be submitted. Verified that
bootstrap build and gcc regression tests pass.

Best,
    Vinodha


On Mon, Jul 28, 2008 at 11:08 AM, Vinodha Ramasamy <vinodha@google.com> wrote:
>
> Hi Diego,
>
>    Thanks for the review. I'm incorporated the review feedback. Will send out a revised patch when I'm done.
>
> Best,
>     Vinodha
>
>
> On Fri, Jul 25, 2008 at 10:54 AM, Diego Novillo <dnovillo@google.com> wrote:
>>
>> On 7/8/08 7:47 PM, Vinodha Ramasamy wrote:
>>
>>>   if (node->call_site_hash)
>>>     return (struct cgraph_edge *)
>>> -      htab_find_with_hash (node->call_site_hash, call_stmt,
>>> -                          htab_hash_pointer (call_stmt));
>>> +        htab_find_with_hash (node->call_site_hash, call_stmt,
>>> +                            htab_hash_pointer (call_stmt));
>>
>> This hunk doesn't seem to be changing anything.
>>
>>> +      if (flag_profile_correction)
>>> +        {
>>> +         inform ("%HCorrecting inconsistent value profile: "
>>> +                 "%s profiler count is inconsistent",
>>> +                 locus, name);
>>> +         *all = bb_count;
>>> +         if (*count > *all) *count = *all;
>>
>> The assignment inside the 'if' should go in the next line.
>>
>>> +/* References:
>>> +   [1] "Feedback-directed Optimizations in GCC with Estimated Edge Profiles
>>> +        from Hardware Event Sampling", Vinodha Ramasamy, Paul Yuan, Dehao Chen,
>>> +        and Robert Hundt; GCC Summit 2008.
>>> +   [2] "Complementing Missing and Inaccurate Profiling Using a Minimum Cost
>>> +        Circulation Algorithm", Roy Levin, Ilan Newman and Gadi Haber;
>>> +        HiPEAC '08.
>>> +
>>> +*/
>>
>> Closing comment goes on same line as ending period.
>>
>> Could you also add a high-level description of the algorithm?  Nothing
>> elaborate, but enough detail that one can follow it and it's tied to the
>> functions in the file.
>>
>>> +/* CAP_INFINITY: Constant to represent infinite capacity. */
>>> +#define CAP_INFINITY __LONG_LONG_MAX__
>>> +
>>
>> Two spaces after final '.'.  Happens in a few places in the file.
>>
>>> +typedef struct _queue
>>> +{
>>> +  int *queue;
>>> +  int head;
>>> +  int tail;
>>> +  int size;
>>> +} queue_type;
>>> +
>>> +/* Structure used in the maximal flow routines to find augmenting path. */
>>> +typedef struct _augmenting_path
>>
>> I think that identifiers starting with '_' are reserved, but I am not
>> sure about this.  The informal convention we follow is to call structs
>> 'struct NAME_d' and the typedefs 'NAME' or 'NAME_t'.  It's certainly not
>> followed consistently, though.
>>
>>
>>> +/* Dump routines for debugging. */
>>> +/* Dump edge_f[u,v]. */
>>> +static void dump_fixup_edge (FILE *file, fixup_graph_type *fixup_graph,
>>> +                            fixup_edge_p pfedge);
>>> +static void dump_fixup_graph (FILE *file, fixup_graph_type *fixup_graph,
>>> +                             const char *msg);
>>> +static void print_basic_block (FILE *file, fixup_graph_type *fixup_graph,
>>> +                              int n);
>>> +static void print_edge (FILE *file, fixup_graph_type *fixup_graph, int s,
>>> +                       int d);
>>
>> Unless you need static declarations to break circular references, the
>> convention is to define static functions before their callers.  This
>> convention helps with readability because it tends to cluster the big
>> entry points and high-level functionality at the bottom of the file.
>>
>> Also, comments describing what a function does should go in the function
>> definition, not its declaration.
>>
>>> +
>>> +
>>> +/* Function definitions. */
>>> +/* ln() implementation: approximate calculation. */
>>> +
>>> +static double
>>> +mcf_ln (double x)
>>
>> Missing documentation for X.
>>
>>> +/* sqrt() implementation: based on open source QUAKE3 code by Carmack.
>>> +   Original code is under GPL */
>>> +
>>> +
>>> +static double
>>> +mcf_sqrt (double x)
>>
>> Likewise.  Only one blank line after comment.  No need to specify what
>> type of license the original code is under.  Do you have a more specific
>> reference to the original code?
>>
>>> +  return 0.5f * (convertor.floatPart + (x * convertor2.floatPart));
>>> +}
>>> +
>>> +void
>>> +mcf_smooth_cfg (void)
>>> +{
>>
>> Needs documentation.
>>
>>> +/* Common code to add edge shared between add_fixup_edge and add_rfixup_edge */
>>> +
>>> +static fixup_edge_p
>>> +add_edge (fixup_graph_type *fixup_graph, int src, int dest, gcov_type cost)
>>
>> Likewise.  Also comments should end with '.  */'
>>
>> Several other functions are missing documentation or documentation of
>> their arguments.  I've only marked these two.
>>
>>> +
>>> +/* cancel_negative_cycle: Finds a negative cycle in the residual network using
>>> +   the Bellman-Ford algorithm. The flow on the found cycle is reversed by the
>>> +   minimum residual capacity of that cycle. ENTRY and EXIT vertices are not
>>> +   considered.
>>> +
>>> +Parameters:
>>> +   fixup_graph - Residual graph  (input/output)
>>
>> When referring to arguments and local variables, the convention is to capitalize them.
>>
>>> +
>>> +static void
>>> +enqueue (queue_type *queue_list, int x)
>>> +{
>>> +  gcc_assert (queue_list->tail < queue_list->size);
>>> +  queue_list->queue[queue_list->tail] = x;
>>
>> Would it make sense to implement this queue with VEC()?  It may simplify some of the management.
>>
>>> +
>>> +  FOR_EACH_EDGE (e, ei, to_edges)
>>> +    {
>>> +      sum += e->count;
>>> +    }
>>
>> Braces are not needed here.
>>
>>> +  return sum;
>>> +}
>>> +
>>> +
>>> +/* adjust_cfg_counts: Computes the corrected edge and basic block
>>
>> No need to include the function name in the comment.
>>
>>>
>>> -/* Compute the branch probabilities for the various branches.
>>> -   Annotate them accordingly.  */
>>> +/* Return the sum of count of all edges of the given "edges" set */
>>
>> s/"edges"/EDGES/
>>
>>> +
>>> +static bool
>>> +is_edge_inconsistent (VEC(edge,gc) *edges)
>>
>> Needs comment.
>>
>>
>>
>> I have no comments on the algorithm itself other than it reads well and seems easy to follow.
>>
>> The patch looks fine to me with the revisions I suggested.
>>
>>
>> Thanks.  Diego.
>
Index: cgraph.c
===================================================================
--- cgraph.c	(revision 138218)
+++ cgraph.c	(working copy)
@@ -516,7 +516,7 @@ cgraph_edge (struct cgraph_node *node, g
   if (node->call_site_hash)
     return (struct cgraph_edge *)
       htab_find_with_hash (node->call_site_hash, call_stmt,
-			   htab_hash_pointer (call_stmt));
+      	                   htab_hash_pointer (call_stmt));
 
   /* This loop may turn out to be performance problem.  In such case adding
      hashtables into call nodes with very many edges is probably best
@@ -1208,7 +1208,12 @@ cgraph_clone_node (struct cgraph_node *n
   new->master_clone = n->master_clone;
   new->count = count;
   if (n->count)
-    count_scale = new->count * REG_BR_PROB_BASE / n->count;
+    {
+      if (new->count > n->count)
+        count_scale = REG_BR_PROB_BASE;
+      else
+        count_scale = new->count * REG_BR_PROB_BASE / n->count;
+    }
   else
     count_scale = 0;
   if (update_original)
Index: value-prof.c
===================================================================
--- value-prof.c	(revision 138218)
+++ value-prof.c	(working copy)
@@ -453,18 +453,32 @@ free_histograms (void)
    somehow.  */
 
 static bool
-check_counter (gimple stmt, const char *name, gcov_type all, gcov_type bb_count)
+check_counter (gimple stmt, const char * name,
+	       gcov_type *count, gcov_type *all, gcov_type bb_count)
 {
-  if (all != bb_count)
+  if (*all != bb_count || *count > *all)
     {
       location_t locus;
       locus = (stmt != NULL)
-	       ? gimple_location (stmt)
-	       : DECL_SOURCE_LOCATION (current_function_decl);
-      error ("%HCorrupted value profile: %s profiler overall count (%d) "
-	     "does not match BB count (%d)", &locus, name, (int)all,
-	     (int)bb_count);
-      return true;
+              ? gimple_location (stmt)
+              : DECL_SOURCE_LOCATION (current_function_decl);
+      if (flag_profile_correction)
+        {
+	  inform ("%HCorrecting inconsistent value profile: "
+		  "%s profiler overall count (%d) does not match BB count "
+                  "(%d)", &locus, name, (int)all, (int)bb_count);
+	  *all = bb_count;
+	  if (*count > *all)
+            *count = *all;
+	  return false;
+	}
+      else
+	{
+	  error ("%HCorrupted value profile: %s profiler overall count (%d) "
+                 "does not match BB count (%d)", &locus, name, (int)all,
+                 (int)bb_count);
+	  return true;
+	}
     }
 
   return false;
@@ -664,7 +678,7 @@ gimple_divmod_fixed_value_transform (gim
       || !maybe_hot_bb_p (gimple_bb (stmt)))
     return false;
 
-  if (check_counter (stmt, "value", all, gimple_bb (stmt)->count))
+  if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
     return false;
 
   /* Compute probability of taking the optimal path.  */
@@ -831,7 +845,7 @@ gimple_mod_pow2_value_transform (gimple_
   /* Compute probability of taking the optimal path.  */
   all = count + wrong_values;
 
-  if (check_counter (stmt, "pow2", all, gimple_bb (stmt)->count))
+  if (check_counter (stmt, "pow2", &count, &all, gimple_bb (stmt)->count))
     return false;
 
   if (all > 0)
@@ -1006,12 +1020,17 @@ gimple_mod_subtract_transform (gimple_st
   count2 = histogram->hvalue.counters[1];
 
   /* Compute probability of taking the optimal path.  */
-  if (check_counter (stmt, "interval", all, gimple_bb (stmt)->count))
+  if (check_counter (stmt, "interval", &count1, &all, gimple_bb (stmt)->count))
     {
       gimple_remove_histogram_value (cfun, stmt, histogram);
       return false;
     }
 
+  if (flag_profile_correction && count1 + count2 > all)
+      all = count1 + count2;
+
+  gcc_assert (count1 + count2 <= all);
+
   /* We require that we use just subtractions in at least 50% of all
      evaluations.  */
   count = 0;
@@ -1192,7 +1211,7 @@ static bool
 gimple_ic_transform (gimple stmt)
 {
   histogram_value histogram;
-  gcov_type val, count, all;
+  gcov_type val, count, all, bb_all;
   gcov_type prob;
   tree callee;
   gimple modify;
@@ -1218,6 +1237,14 @@ gimple_ic_transform (gimple stmt)
   if (4 * count <= 3 * all)
     return false;
 
+  bb_all = gimple_bb (stmt)->count;
+  /* The order of CHECK_COUNTER calls is important - 
+     since check_counter can correct the third parameter
+     and we want to make count <= all <= bb_all. */
+  if ( check_counter (stmt, "ic", &all, &bb_all, bb_all)
+      || check_counter (stmt, "ic", &count, &all, all))
+    return false;
+
   if (all > 0)
     prob = (count * REG_BR_PROB_BASE + all / 2) / all;
   else
@@ -1411,7 +1438,7 @@ gimple_stringops_transform (gimple_stmt_
      at least 80% of time.  */
   if ((6 * count / 5) < all || !maybe_hot_bb_p (gimple_bb (stmt)))
     return false;
-  if (check_counter (stmt, "value", all, gimple_bb (stmt)->count))
+  if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
     return false;
   if (all > 0)
     prob = (count * REG_BR_PROB_BASE + all / 2) / all;
Index: mcf.c
===================================================================
--- mcf.c	(revision 0)
+++ mcf.c	(revision 0)
@@ -0,0 +1,1401 @@
+/* Routines to implement minimum-cost maximal flow algorithm used to smooth
+   basic block and edge frequency counts.
+   Copyright (C) 2008
+   Free Software Foundation, Inc.
+   Contributed by Paul Yuan (yingbo.com@gmail.com) and
+                  Vinodha Ramasamy (vinodha@google.com).
+
+This file is part of GCC.
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* References:
+   [1] "Feedback-directed Optimizations in GCC with Estimated Edge Profiles
+        from Hardware Event Sampling", Vinodha Ramasamy, Paul Yuan, Dehao Chen,
+        and Robert Hundt; GCC Summit 2008.
+   [2] "Complementing Missing and Inaccurate Profiling Using a Minimum Cost
+        Circulation Algorithm", Roy Levin, Ilan Newman and Gadi Haber;
+        HiPEAC '08.
+
+   Algorithm to smooth basic block and edge counts:
+   1. create_fixup_graph: Create fixup graph by translating function CFG into
+      a graph that satisfies MCF algorithm requirements.
+   2. find_max_flow: Find maximal flow.
+   3. compute_residual_flow: Form residual network.
+   4. Repeat:
+      cancel_negative_cycle: While G contains a negative cost cycle C, reverse
+      the flow on the found cycle by the minimum residual capacity in that
+      cycle.
+   5. Form the minimal cost flow
+      f(u,v) = rf(v, u).
+   6. adjust_cfg_counts: Update initial edge weights with corrected weights.
+      delta(u.v) = f(u,v) -f(v,u).
+      w*(u,v) = w(u,v) + delta(u,v).  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "basic-block.h"
+#include "output.h"
+#include "langhooks.h"
+#include "tree.h"
+#include "gcov-io.h"
+
+#include "profile.h"
+
+/* CAP_INFINITY: Constant to represent infinite capacity.  */
+#define CAP_INFINITY __LONG_LONG_MAX__
+
+/* COST FUNCTION.  */
+#define K_POS(b)        ((b))
+#define K_NEG(b)        (50 * (b))
+#define COST(k, w)      ((k) / mcf_ln ((w) + 2))
+/* Limit the number of iterations for cancel_negative_cycles() to ensure
+   reasonable compile time.  */
+#define MAX_ITER(n, e)  10 + (1000000 / ((n) * (e)))
+typedef enum
+{
+  INVALID_EDGE,
+  VERTEX_SPLIT_EDGE,	    /* Edge to represent vertex with w(e) = w(v).  */
+  REDIRECT_EDGE,	    /* Edge after vertex transformation.  */
+  REVERSE_EDGE,
+  SOURCE_CONNECT_EDGE,	    /* Single edge connecting to single source.  */
+  SINK_CONNECT_EDGE,	    /* Single edge connecting to single sink.  */
+  BALANCE_EDGE,		    /* Edge connecting with source/sink: cp(e) = 0.  */
+  REDIRECT_NORMALIZED_EDGE, /* Normalized edge for a redirect edge.  */
+  REVERSE_NORMALIZED_EDGE   /* Normalized edge for a reverse edge.  */
+} edge_type;
+
+/* Structure to represent an edge in the fixup graph.  */
+typedef struct fixup_edge_d
+{
+  int src;
+  int dest;
+  /* Flag denoting type of edge and attributes for the flow field.  */
+  edge_type type;
+  bool is_rflow_valid;
+  /* Index to the normalization vertex added for this edge.  */
+  int norm_vertex_index;
+  /* Flow for this edge.  */
+  gcov_type flow;
+  /* Residual flow for this edge - used during negative cycle canceling.  */
+  gcov_type rflow;
+  gcov_type weight;
+  gcov_type cost;
+  gcov_type max_capacity;
+} fixup_edge_type;
+
+typedef fixup_edge_type *fixup_edge_p;
+
+DEF_VEC_P (fixup_edge_p);
+DEF_VEC_ALLOC_P (fixup_edge_p, heap);
+
+/* Structure to represent a vertex in the fixup graph.  */
+typedef struct fixup_vertex_d
+{
+  VEC (fixup_edge_p, heap) *succ_edges;
+} fixup_vertex_type;
+
+typedef fixup_vertex_type *fixup_vertex_p;
+
+/* Fixup graph used in the MCF algorithm.  */
+typedef struct fixup_graph_d
+{
+  /* Current number of vertices for the graph.  */
+  int num_vertices;
+  /* Current number of edges for the graph.  */
+  int num_edges;
+  /* Index of new entry vertex.  */
+  int new_entry_index;
+  /* Index of new exit vertex.  */
+  int new_exit_index;
+  /* Fixup vertex list. Adjacency list for fixup graph.  */
+  fixup_vertex_p vertex_list;
+  /* Fixup edge list.  */
+  fixup_edge_p edge_list;
+} fixup_graph_type;
+
+typedef struct queue_d
+{
+  int *queue;
+  int head;
+  int tail;
+  int size;
+} queue_type;
+
+/* Structure used in the maximal flow routines to find augmenting path.  */
+typedef struct augmenting_path_d
+{
+  /* Queue used to hold vertex indices.  */
+  queue_type queue_list;
+  /* Vector to hold chain of pred vertex indices in augmenting path.  */
+  int *bb_pred;
+  /* Vector that indicates if basic block i has been visited.  */
+  int *is_visited;
+} augmenting_path_type;
+
+
+/* Function definitions.  */
+
+/* Dump routines to aid debugging.  */
+
+/* Print basic block with index N for FIXUP_GRAPH in n' and n'' format.  */
+
+static void
+print_basic_block (FILE *file, fixup_graph_type *fixup_graph, int n)
+{
+  if (n == ENTRY_BLOCK)
+    fputs ("ENTRY", file);
+  else if (n == ENTRY_BLOCK + 1)
+    fputs ("ENTRY''", file);
+  else if (n == 2 * EXIT_BLOCK)
+    fputs ("EXIT", file);
+  else if (n == 2 * EXIT_BLOCK + 1)
+    fputs ("EXIT''", file);
+  else if (n == fixup_graph->new_exit_index)
+    fputs ("NEW_EXIT", file);
+  else if (n == fixup_graph->new_entry_index)
+    fputs ("NEW_ENTRY", file);
+  else
+    {
+      fprintf (file, "%d", n / 2);
+      if (n % 2)
+	fputs ("''", file);
+      else
+	fputs ("'", file);
+    }
+}
+
+
+/* Print edge S->D for given fixup_graph with n' and n'' format.
+   PARAMETERS:
+   S is the index of the source vertex of the edge (input) and
+   D is the index of the destination vertex of the edge (input) for the given
+   fixup_graph (input).  */
+
+static void
+print_edge (FILE *file, fixup_graph_type *fixup_graph, int s, int d)
+{
+  print_basic_block (file, fixup_graph, s);
+  fputs ("->", file);
+  print_basic_block (file, fixup_graph, d);
+}
+
+
+/* Dump out the attributes of a given edge FEDGE in the fixup_graph to a
+   file.  */
+static void
+dump_fixup_edge (FILE *file, fixup_graph_type *fixup_graph, fixup_edge_p fedge)
+{
+  if (!fedge)
+    {
+      fputs ("NULL fixup graph edge.\n", file);
+      return;
+    }
+
+  print_edge (file, fixup_graph, fedge->src, fedge->dest);
+  fputs (": ", file);
+
+  if (fedge->type)
+    {
+      fprintf (file, "flow/capacity=" HOST_WIDEST_INT_PRINT_DEC "/",
+	       fedge->flow);
+      if (fedge->max_capacity == CAP_INFINITY)
+	fputs ("+oo,", file);
+      else
+	fprintf (file, "" HOST_WIDEST_INT_PRINT_DEC ",", fedge->max_capacity);
+    }
+
+  if (fedge->is_rflow_valid)
+    {
+      if (fedge->rflow == CAP_INFINITY)
+	fputs (" rflow=+oo.", file);
+      else
+	fprintf (file, " rflow=" HOST_WIDEST_INT_PRINT_DEC ",", fedge->rflow);
+    }
+
+  fprintf (file, " cost=" HOST_WIDEST_INT_PRINT_DEC ".", fedge->cost);
+
+  fprintf (file, "\t(%d->%d)", fedge->src, fedge->dest);
+
+  if (fedge->type)
+    {
+      switch (fedge->type)
+	{
+	case VERTEX_SPLIT_EDGE:
+	  fputs (" @VERTEX_SPLIT_EDGE", file);
+	  break;
+
+	case REDIRECT_EDGE:
+	  fputs (" @REDIRECT_EDGE", file);
+	  break;
+
+	case SOURCE_CONNECT_EDGE:
+	  fputs (" @SOURCE_CONNECT_EDGE", file);
+	  break;
+
+	case SINK_CONNECT_EDGE:
+	  fputs (" @SINK_CONNECT_EDGE", file);
+	  break;
+
+	case REVERSE_EDGE:
+	  fputs (" @REVERSE_EDGE", file);
+	  break;
+
+	case BALANCE_EDGE:
+	  fputs (" @BALANCE_EDGE", file);
+	  break;
+
+	case REDIRECT_NORMALIZED_EDGE:
+	case REVERSE_NORMALIZED_EDGE:
+	  fputs ("  @NORMALIZED_EDGE", file);
+	  break;
+
+	default:
+	  fputs (" @INVALID_EDGE", file);
+	  break;
+	}
+    }
+  fputs ("\n", file);
+}
+
+
+/* Print out the edges and vertices of the given FIXUP_GRAPH, into the dump
+   file. The input string MSG is printed out as a heading.  */
+
+static void
+dump_fixup_graph (FILE *file, fixup_graph_type *fixup_graph, const char *msg)
+{
+  int i, j;
+  int fnum_vertices, fnum_edges;
+
+  fixup_vertex_p fvertex_list, pfvertex;
+  fixup_edge_p pfedge;
+
+  gcc_assert (fixup_graph);
+  fvertex_list = fixup_graph->vertex_list;
+  fnum_vertices = fixup_graph->num_vertices;
+  fnum_edges = fixup_graph->num_edges;
+
+  fprintf (file, "\nDump fixup graph for %s(): %s.\n",
+	   lang_hooks.decl_printable_name (current_function_decl, 2), msg);
+  fprintf (file,
+	   "There are %d vertices and %d edges. new_exit_index is %d.\n\n",
+	   fnum_vertices, fnum_edges, fixup_graph->new_exit_index);
+
+  for (i = 0; i < fnum_vertices; i++)
+    {
+      pfvertex = fvertex_list + i;
+      fprintf (file, "vertex_list[%d]: %d succ fixup edges.\n",
+	       i, VEC_length (fixup_edge_p, pfvertex->succ_edges));
+
+      for (j = 0; VEC_iterate (fixup_edge_p, pfvertex->succ_edges, j, pfedge);
+	   j++)
+	{
+	  /* Distinguish forward edges and backward edges in the residual flow
+             network.  */
+	  if (pfedge->type)
+	    fputs ("(f) ", file);
+	  else if (pfedge->is_rflow_valid)
+	    fputs ("(b) ", file);
+	  dump_fixup_edge (file, fixup_graph, pfedge);
+	}
+    }
+
+  fputs ("\n", file);
+}
+
+
+/* Utility routines.  */
+/* ln() implementation: approximate calculation. Returns ln of X.  */
+
+static double
+mcf_ln (double x)
+{
+#define E       2.71828
+  int l = 1;
+  double m = E;
+
+  gcc_assert (x >= 0);
+
+  while (m < x)
+    {
+      m *= E;
+      l++;
+    }
+
+  return l;
+}
+
+
+/* sqrt() implementation: based on open source QUAKE3 code (magic sqrt
+   implementation) by John Carmack.  Returns sqrt of X.  */
+
+static double
+mcf_sqrt (double x)
+{
+#define MAGIC_CONST1    0x1fbcf800
+#define MAGIC_CONST2    0x5f3759df
+  union {
+    int intPart;
+    float floatPart;
+  } convertor, convertor2;
+
+  gcc_assert (x >= 0);
+
+  convertor.floatPart = x;
+  convertor2.floatPart = x;
+  convertor.intPart = MAGIC_CONST1 + (convertor.intPart >> 1);
+  convertor2.intPart = MAGIC_CONST2 - (convertor2.intPart >> 1);
+
+  return 0.5f * (convertor.floatPart + (x * convertor2.floatPart));
+}
+
+
+/* Common code shared between add_fixup_edge and add_rfixup_edge. Adds an edge
+   (SRC->DEST) to the edge_list maintained in FIXUP_GRAPH with cost of the edge
+   added set to COST.  */
+
+static fixup_edge_p
+add_edge (fixup_graph_type *fixup_graph, int src, int dest, gcov_type cost)
+{
+  fixup_vertex_p curr_vertex = fixup_graph->vertex_list + src;
+  fixup_edge_p curr_edge = fixup_graph->edge_list + fixup_graph->num_edges;
+  curr_edge->src = src;
+  curr_edge->dest = dest;
+  curr_edge->cost = cost;
+  fixup_graph->num_edges++;
+  if (dump_file)
+    dump_fixup_edge (dump_file, fixup_graph, curr_edge);
+  VEC_safe_push (fixup_edge_p, heap, curr_vertex->succ_edges, curr_edge);
+  return curr_edge;
+}
+
+
+/* Add a fixup edge (src->dest) with attributes TYPE, WEIGHT, COST and
+   MAX_CAPACITY to the edge_list in the fixup graph.  */
+
+static void
+add_fixup_edge (fixup_graph_type *fixup_graph, int src, int dest, int type,
+		gcov_type weight, gcov_type cost, gcov_type max_capacity)
+{
+  fixup_edge_p curr_edge = add_edge(fixup_graph, src, dest, cost);
+  curr_edge->type = type;
+  curr_edge->weight = weight;
+  curr_edge->max_capacity = max_capacity;
+}
+
+
+/* Add a residual edge (SRC->DEST) with attributes RFLOW and COST
+   to the fixup graph.  */
+
+static void
+add_rfixup_edge (fixup_graph_type *fixup_graph, int src, int dest,
+		 gcov_type rflow, gcov_type cost)
+{
+  fixup_edge_p curr_edge = add_edge (fixup_graph, src, dest, cost);
+  curr_edge->rflow = rflow;
+  curr_edge->is_rflow_valid = true;
+  /* This edge is not a valid edge - merely used to hold residual flow.  */
+  curr_edge->type = INVALID_EDGE;
+}
+
+
+/* Return the pointer to fixup edge SRC->DEST or NULL if edge does not
+   exist in the FIXUP_GRAPH.  */
+
+static fixup_edge_p
+find_fixup_edge (fixup_graph_type *fixup_graph, int src, int dest)
+{
+  int j;
+  fixup_edge_p pfedge;
+  fixup_vertex_p pfvertex;
+
+  gcc_assert (src < fixup_graph->num_vertices);
+
+  pfvertex = fixup_graph->vertex_list + src;
+
+  for (j = 0; VEC_iterate (fixup_edge_p, pfvertex->succ_edges, j, pfedge);
+       j++)
+    if (pfedge->dest == dest)
+      return pfedge;
+
+  return NULL;
+}
+
+
+/* Cleanup routine to free structures in FIXUP_GRAPH.  */
+
+static void
+delete_fixup_graph (fixup_graph_type *fixup_graph)
+{
+  int i;
+  int fnum_vertices = fixup_graph->num_vertices;
+  fixup_vertex_p pfvertex = fixup_graph->vertex_list;
+
+  for (i = 0; i < fnum_vertices; i++, pfvertex++)
+    VEC_free (fixup_edge_p, heap, pfvertex->succ_edges);
+
+  free (fixup_graph->vertex_list);
+  free (fixup_graph->edge_list);
+}
+
+
+/* Creates a fixup graph FIXUP_GRAPH from the function CFG.  */
+
+static void
+create_fixup_graph (fixup_graph_type *fixup_graph)
+{
+  double sqrt_avg_vertex_weight = 0;
+  double total_vertex_weight = 0;
+  double k_pos = 0;
+  double k_neg = 0;
+  /* Vector to hold D(v) = sum_out_edges(v) - sum_in_edges(v).  */
+  gcov_type *diff_out_in = NULL;
+  gcov_type supply_value = 1, demand_value = 0;
+  gcov_type fcost = 0;
+  int new_entry_index = 0, new_exit_index = 0;
+  int i = 0, j = 0;
+  int new_index = 0;
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  fixup_edge_p pfedge, r_pfedge;
+  fixup_edge_p fedge_list;
+  int fnum_edges;
+
+  /* Each basic_block will be split into 2 during vertex transformation.  */
+  int fnum_vertices_after_transform =  2 * n_basic_blocks;
+  int fnum_edges_after_transform = n_edges + n_basic_blocks;
+
+  /* Count the new SOURCE and EXIT vertices to be added.  */
+  int fmax_num_vertices =
+    fnum_vertices_after_transform + n_edges + n_basic_blocks + 2;
+
+  /* In create_fixup_graph: Each basic block and edge can be split into 3
+     edges. Number of balance edges = n_basic_blocks. So after
+     create_fixup_graph:
+     max_edges = 4 * n_basic_blocks + 3 * n_edges
+     Accounting for residual flow edges
+     max_edges = 2 * (4 * n_basic_blocks + 3 * n_edges)
+     = 8 * n_basic_blocks + 6 * n_edges
+     < 8 * n_basic_blocks + 8 * n_edges.  */
+  int fmax_num_edges = 8 * (n_basic_blocks + n_edges);
+
+  /* Initial num of vertices in the fixup graph.  */
+  fixup_graph->num_vertices = n_basic_blocks;
+
+  /* Fixup graph vertex list.  */
+  fixup_graph->vertex_list =
+    (fixup_vertex_p) xcalloc (fmax_num_vertices, sizeof (fixup_vertex_type));
+
+  /* Fixup graph edge list.  */
+  fixup_graph->edge_list =
+    (fixup_edge_p) xcalloc (fmax_num_edges, sizeof (fixup_edge_type));
+
+  diff_out_in =
+    (gcov_type *) xcalloc (1 + fnum_vertices_after_transform,
+			   sizeof (gcov_type));
+
+  /* Compute constants b, k_pos, k_neg used in the cost function calculation.
+     b = sqrt(avg_vertex_weight(cfg)); k_pos = b; k_neg = 50b.  */
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
+    total_vertex_weight += bb->count;
+
+  sqrt_avg_vertex_weight = mcf_sqrt (total_vertex_weight / n_basic_blocks);
+
+  k_pos = K_POS (sqrt_avg_vertex_weight);
+  k_neg = K_NEG (sqrt_avg_vertex_weight);
+
+  /* 1. Vertex Transformation: Split each vertex v into two vertices v' and v'',
+     connected by an edge e from v' to v''. w(e) = w(v).  */
+
+  if (dump_file)
+    fprintf (dump_file, "\nVertex transformation:\n");
+
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
+  {
+    /* v'->v'': index1->(index1+1).  */
+    i = 2 * bb->index;
+    fcost = (gcov_type) COST (k_pos, bb->count);
+    add_fixup_edge (fixup_graph, i, i + 1, VERTEX_SPLIT_EDGE, bb->count,
+                    fcost, CAP_INFINITY);
+    fixup_graph->num_vertices++;
+
+    FOR_EACH_EDGE (e, ei, bb->succs)
+    {
+      /* Edges with ignore attribute set should be treated like they don't
+         exist.  */
+      if (EDGE_INFO (e) && EDGE_INFO (e)->ignore)
+        continue;
+      j = 2 * e->dest->index;
+      fcost = (gcov_type) COST (k_pos, e->count);
+      add_fixup_edge (fixup_graph, i + 1, j, REDIRECT_EDGE, e->count, fcost,
+                      CAP_INFINITY);
+    }
+  }
+
+  /* After vertex transformation.  */
+  gcc_assert (fixup_graph->num_vertices == fnum_vertices_after_transform);
+  /* Redirect edges are not added for edges with ignore attribute.  */
+  gcc_assert (fixup_graph->num_edges <= fnum_edges_after_transform);
+
+  fnum_edges_after_transform = fixup_graph->num_edges;
+
+  /* 2. Initialize D(v).  */
+  for (i = 0; i < fnum_edges_after_transform; i++)
+    {
+      pfedge = fixup_graph->edge_list + i;
+      diff_out_in[pfedge->src] += pfedge->weight;
+      diff_out_in[pfedge->dest] -= pfedge->weight;
+    }
+
+  /* Entry block - vertex indices 0, 1; EXIT block - vertex indices 2, 3.  */
+  for (i = 0; i <= 3; i++)
+    diff_out_in[i] = 0;
+
+  /* 3. Add reverse edges: needed to decrease counts during smoothing.  */
+  if (dump_file)
+    fprintf (dump_file, "\nReverse edges:\n");
+  for (i = 0; i < fnum_edges_after_transform; i++)
+    {
+      pfedge = fixup_graph->edge_list + i;
+      if ((pfedge->src == 0) || (pfedge->src == 2))
+        continue;
+      r_pfedge = find_fixup_edge (fixup_graph, pfedge->dest, pfedge->src);
+      if (!r_pfedge && pfedge->weight)
+	{
+	  /* Skip adding reverse edges for edges with w(e) = 0, as its maximum
+	     capacity is 0.  */
+	  fcost = (gcov_type) COST (k_neg, pfedge->weight);
+	  add_fixup_edge (fixup_graph, pfedge->dest, pfedge->src,
+			  REVERSE_EDGE, 0, fcost, pfedge->weight);
+	}
+    }
+
+  /* 4. Create single source and sink. Connect new source vertex s' to function
+     entry block. Connect sink vertex t' to function exit.  */
+  if (dump_file)
+    fprintf (dump_file, "\ns'->S, T->t':\n");
+
+  new_entry_index = fixup_graph->new_entry_index = fixup_graph->num_vertices;
+  fixup_graph->num_vertices++;
+  /* Set supply_value to 1 to avoid zero count function ENTRY.  */
+  add_fixup_edge (fixup_graph, new_entry_index, ENTRY_BLOCK, SOURCE_CONNECT_EDGE,
+		  1 /* supply_value */, 0, 1 /* supply_value */);
+
+  /* Create new exit with EXIT_BLOCK as single pred.  */
+  new_exit_index = fixup_graph->new_exit_index = fixup_graph->num_vertices;
+  fixup_graph->num_vertices++;
+  add_fixup_edge (fixup_graph, 2 * EXIT_BLOCK + 1, new_exit_index,
+                  SINK_CONNECT_EDGE,
+                  0 /* demand_value */, 0, 0 /* demand_value */);
+
+  /* Connect vertices with unbalanced D(v) to source/sink.  */
+  if (dump_file)
+    fprintf (dump_file, "\nD(v) balance:\n");
+  /* Skip vertices for ENTRY (0, 1) and EXIT (2,3) blocks, so start with i = 4.
+     diff_out_in[v''] will be 0, so skip v'' vertices, hence i += 2.  */
+  for (i = 4; i < new_entry_index; i += 2)
+    {
+      if (diff_out_in[i] > 0)
+	{
+	  add_fixup_edge (fixup_graph, i, new_exit_index, BALANCE_EDGE, 0, 0,
+			  diff_out_in[i]);
+	  demand_value += diff_out_in[i];
+	}
+      else if (diff_out_in[i] < 0)
+	{
+	  add_fixup_edge (fixup_graph, new_entry_index, i, BALANCE_EDGE, 0, 0,
+			  -diff_out_in[i]);
+	  supply_value -= diff_out_in[i];
+	}
+    }
+
+  /* Set supply = demand.  */
+  if (dump_file)
+    {
+      fprintf (dump_file, "\nAdjust supply and demand:\n");
+      fprintf (dump_file, "supply_value=" HOST_WIDEST_INT_PRINT_DEC "\n",
+	       supply_value);
+      fprintf (dump_file, "demand_value=" HOST_WIDEST_INT_PRINT_DEC "\n",
+	       demand_value);
+    }
+
+  if (demand_value > supply_value)
+    {
+      pfedge = find_fixup_edge (fixup_graph, new_entry_index, ENTRY_BLOCK);
+      pfedge->max_capacity += (demand_value - supply_value);
+    }
+  else
+    {
+      pfedge = find_fixup_edge (fixup_graph, 2 * EXIT_BLOCK + 1, new_exit_index);
+      pfedge->max_capacity += (supply_value - demand_value);
+    }
+
+  /* 6. Normalize edges: remove anti-parallel edges. Anti-parallel edges are
+     created by the vertex transformation step from self-edges in the original
+     CFG and by the reverse edges added earlier.  */
+  if (dump_file)
+    fprintf (dump_file, "\nNormalize edges:\n");
+
+  fnum_edges = fixup_graph->num_edges;
+  fedge_list = fixup_graph->edge_list;
+
+  for (i = 0; i < fnum_edges; i++)
+    {
+      pfedge = fedge_list + i;
+      r_pfedge = find_fixup_edge (fixup_graph, pfedge->dest, pfedge->src);
+      if (((pfedge->type == VERTEX_SPLIT_EDGE)
+	   || (pfedge->type == REDIRECT_EDGE)) && r_pfedge)
+	{
+	  new_index = fixup_graph->num_vertices;
+	  fixup_graph->num_vertices++;
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "\nAnti-parallel edge:\n");
+	      dump_fixup_edge (dump_file, fixup_graph, pfedge);
+	      dump_fixup_edge (dump_file, fixup_graph, r_pfedge);
+	      fprintf (dump_file, "New vertex is %d.\n", new_index);
+	      fprintf (dump_file, "------------------\n");
+	    }
+
+	  pfedge->cost /= 2;
+	  pfedge->norm_vertex_index = new_index;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "After normalization:\n");
+	      dump_fixup_edge (dump_file, fixup_graph, pfedge);
+	    }
+
+	  /* Add a new fixup edge: new_index->src.  */
+	  add_fixup_edge (fixup_graph, new_index, pfedge->src,
+			  REVERSE_NORMALIZED_EDGE, 0, r_pfedge->cost,
+			  r_pfedge->max_capacity);
+	  gcc_assert (fixup_graph->num_vertices <= fmax_num_vertices);
+
+	  /* Edge: r_pfedge->src -> r_pfedge->dest
+             ==> r_pfedge->src -> new_index.  */
+	  r_pfedge->dest = new_index;
+	  r_pfedge->type = REVERSE_NORMALIZED_EDGE;
+	  r_pfedge->cost = pfedge->cost;
+	  r_pfedge->max_capacity = pfedge->max_capacity;
+	  if (dump_file)
+	    dump_fixup_edge (dump_file, fixup_graph, r_pfedge);
+	}
+    }
+
+  if (dump_file)
+    dump_fixup_graph (dump_file, fixup_graph, "After create_fixup_graph()");
+
+  /* Cleanup.  */
+  free (diff_out_in);
+}
+
+
+/* Allocates space for the structures in AUGMENTING_PATH.  The space needed is
+   proportional to the number of nodes in the graph, which is given by
+   GRAPH_SIZE.  */
+
+static void
+init_augmenting_path (augmenting_path_type *augmenting_path, int graph_size)
+{
+  augmenting_path->queue_list.queue = (int *)
+    xcalloc (graph_size + 2, sizeof (int));
+  augmenting_path->queue_list.size = graph_size + 2;
+  augmenting_path->bb_pred = (int *) xcalloc (graph_size, sizeof (int));
+  augmenting_path->is_visited = (int *) xcalloc (graph_size, sizeof (int));
+}
+
+/* Free the structures in AUGMENTING_PATH.  */
+static void
+free_augmenting_path (augmenting_path_type *augmenting_path)
+{
+  free (augmenting_path->queue_list.queue);
+  free (augmenting_path->bb_pred);
+  free (augmenting_path->is_visited);
+}
+
+
+/* Queue routines. Assumes queue will never overflow.  */
+
+static void
+init_queue (queue_type *queue_list)
+{
+  gcc_assert (queue_list);
+  queue_list->head = 0;
+  queue_list->tail = 0;
+}
+
+/* Return true if QUEUE_LIST is empty.  */
+static bool
+is_empty (queue_type *queue_list)
+{
+  return (queue_list->head == queue_list->tail);
+}
+
+/* Insert element X into QUEUE_LIST.  */
+static void
+enqueue (queue_type *queue_list, int x)
+{
+  gcc_assert (queue_list->tail < queue_list->size);
+  queue_list->queue[queue_list->tail] = x;
+  (queue_list->tail)++;
+}
+
+/* Return the first element in QUEUE_LIST.  */
+static int
+dequeue (queue_type *queue_list)
+{
+  int x;
+  gcc_assert (queue_list->head >= 0);
+  x = queue_list->queue[queue_list->head];
+  (queue_list->head)++;
+  return x;
+}
+
+
+/* Finds a negative cycle in the residual network using
+   the Bellman-Ford algorithm. The flow on the found cycle is reversed by the
+   minimum residual capacity of that cycle. ENTRY and EXIT vertices are not
+   considered.
+
+Parameters:
+   FIXUP_GRAPH - Residual graph  (input/output)
+   The following are allocated/freed by the caller:
+   PI - Vector to hold predecessors in path  (pi = pred index)
+   D - D[I] holds minimum cost of path from i to sink
+   CYCLE - Vector to hold the minimum cost cycle
+
+Return:
+   true if a negative cycle was found, false otherwise.  */
+
+static bool
+cancel_negative_cycle (fixup_graph_type *fixup_graph,
+		       int *pi, gcov_type *d, int *cycle)
+{
+  int i, j, k;
+  int fnum_vertices, fnum_edges;
+  fixup_edge_p fedge_list, pfedge, r_pfedge;
+  bool found_cycle = false;
+  int cycle_start = 0, cycle_end = 0;
+  gcov_type sum_cost = 0, cycle_flow = 0;
+  int new_entry_index;
+  bool propagated = false;
+
+  gcc_assert (fixup_graph);
+  fnum_vertices = fixup_graph->num_vertices;
+  fnum_edges = fixup_graph->num_edges;
+  fedge_list = fixup_graph->edge_list;
+  new_entry_index = fixup_graph->new_entry_index;
+
+  /* Initialize.  */
+  /* Skip ENTRY.  */
+  for (i = 1; i < fnum_vertices; i++)
+    {
+      d[i] = CAP_INFINITY;
+      pi[i] = -1;
+      cycle[i] = -1;
+    }
+  d[ENTRY_BLOCK] = 0;
+
+  /* Relax.  */
+  for (k = 1; k < fnum_vertices; k++)
+  {
+    propagated = false;
+    for (i = 0; i < fnum_edges; i++)
+      {
+	pfedge = fedge_list + i;
+	if (pfedge->src == new_entry_index)
+	  continue;
+	if (pfedge->is_rflow_valid && pfedge->rflow
+            && d[pfedge->src] != CAP_INFINITY
+	    && (d[pfedge->dest] > d[pfedge->src] + pfedge->cost))
+	  {
+	    d[pfedge->dest] = d[pfedge->src] + pfedge->cost;
+	    pi[pfedge->dest] = pfedge->src;
+            propagated = true;
+	  }
+      }
+    if (!propagated)
+      break;
+  }
+
+  if (!propagated)
+  /* No negative cycles exist.  */
+    return 0;
+
+  /* Detect.  */
+  for (i = 0; i < fnum_edges; i++)
+    {
+      pfedge = fedge_list + i;
+      if (pfedge->src == new_entry_index)
+	continue;
+      if (pfedge->is_rflow_valid && pfedge->rflow
+          && d[pfedge->src] != CAP_INFINITY
+	  && (d[pfedge->dest] > d[pfedge->src] + pfedge->cost))
+	{
+	  found_cycle = true;
+	  break;
+	}
+    }
+
+  if (!found_cycle)
+    return 0;
+
+  /* Augment the cycle with the cycle's minimum residual capacity.  */
+  found_cycle = false;
+  cycle[0] = pfedge->dest;
+  j = pfedge->dest;
+
+  for (i = 1; i < fnum_vertices; i++)
+    {
+      j = pi[j];
+      cycle[i] = j;
+      for (k = 0; k < i; k++)
+	{
+	  if (cycle[k] == j)
+	    {
+	      /* cycle[k] -> ... -> cycle[i].  */
+	      cycle_start = k;
+	      cycle_end = i;
+	      found_cycle = true;
+	      break;
+	    }
+	}
+      if (found_cycle)
+	break;
+    }
+
+  gcc_assert (cycle[cycle_start] == cycle[cycle_end]);
+  if (dump_file)
+    fprintf (dump_file, "\nNegative cycle length is %d:\n",
+	     cycle_end - cycle_start);
+
+  sum_cost = 0;
+  cycle_flow = CAP_INFINITY;
+  for (k = cycle_start; k < cycle_end; k++)
+    {
+      pfedge = find_fixup_edge (fixup_graph, cycle[k + 1], cycle[k]);
+      cycle_flow = MIN (cycle_flow, pfedge->rflow);
+      sum_cost += pfedge->cost;
+      if (dump_file)
+	fprintf (dump_file, "%d ", cycle[k]);
+    }
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "%d", cycle[k]);
+      fprintf (dump_file,
+	       ": (" HOST_WIDEST_INT_PRINT_DEC ", " HOST_WIDEST_INT_PRINT_DEC
+	       ")\n", sum_cost, cycle_flow);
+      fprintf (dump_file,
+	       "Augment cycle with " HOST_WIDEST_INT_PRINT_DEC "\n",
+	       cycle_flow);
+    }
+
+  for (k = cycle_start; k < cycle_end; k++)
+    {
+      pfedge = find_fixup_edge (fixup_graph, cycle[k + 1], cycle[k]);
+      r_pfedge = find_fixup_edge (fixup_graph, cycle[k], cycle[k + 1]);
+      pfedge->rflow -= cycle_flow;
+      if (pfedge->type)
+	pfedge->flow += cycle_flow;
+      r_pfedge->rflow += cycle_flow;
+      if (r_pfedge->type)
+	r_pfedge->flow -= cycle_flow;
+    }
+
+  return true;
+}
+
+
+/* Computes the residual flow for FIXUP_GRAPH by setting the rflow field of
+   the edges. ENTRY and EXIT vertices should not be considered.  */
+
+static void
+compute_residual_flow (fixup_graph_type *fixup_graph)
+{
+  int i;
+  int fnum_edges;
+  fixup_edge_p fedge_list, pfedge;
+
+  gcc_assert (fixup_graph);
+
+  if (dump_file)
+    fputs ("\ncompute_residual_flow():\n", dump_file);
+
+  fnum_edges = fixup_graph->num_edges;
+  fedge_list = fixup_graph->edge_list;
+
+  for (i = 0; i < fnum_edges; i++)
+    {
+      pfedge = fedge_list + i;
+      pfedge->rflow = pfedge->max_capacity - pfedge->flow;
+      pfedge->is_rflow_valid = true;
+      add_rfixup_edge (fixup_graph, pfedge->dest, pfedge->src, pfedge->flow,
+		       -pfedge->cost);
+    }
+}
+
+
+/* Uses Edmonds-Karp algorithm - BFS to find augmenting path from SOURCE to
+   SINK. The fields in the edge vector in the FIXUP_GRAPH are not modified by
+   this routine. The vector bb_pred in the AUGMENTING_PATH structure is updated
+   to reflect the path found.
+   Returns: 0 if no augmenting path is found, 1 otherwise.  */
+
+static int
+find_augmenting_path (fixup_graph_type *fixup_graph,
+		      augmenting_path_type *augmenting_path, int source,
+		      int sink)
+{
+  int u = 0;
+  int i;
+  fixup_vertex_p fvertex_list, pfvertex;
+  fixup_edge_p pfedge;
+  int *bb_pred, *is_visited;
+  queue_type *queue_list;
+
+  gcc_assert (augmenting_path);
+  bb_pred = augmenting_path->bb_pred;
+  gcc_assert (bb_pred);
+  is_visited = augmenting_path->is_visited;
+  gcc_assert (is_visited);
+  queue_list = &(augmenting_path->queue_list);
+
+  gcc_assert (fixup_graph);
+
+  fvertex_list = fixup_graph->vertex_list;
+
+  for (u = 0; u < fixup_graph->num_vertices; u++)
+    is_visited[u] = 0;
+
+  init_queue (queue_list);
+  enqueue (queue_list, source);
+  bb_pred[source] = -1;
+
+  while (!is_empty (queue_list))
+    {
+      u = dequeue (queue_list);
+      is_visited[u] = 1;
+      pfvertex = fvertex_list + u;
+      for (i = 0; VEC_iterate (fixup_edge_p, pfvertex->succ_edges, i, pfedge);
+	   i++)
+	{
+	  int dest = pfedge->dest;
+	  if ((pfedge->rflow > 0) && (is_visited[dest] == 0))
+	    {
+	      enqueue (queue_list, dest);
+	      bb_pred[dest] = u;
+	      is_visited[dest] = 1;
+	      if (dest == sink)
+		return 1;
+	    }
+	}
+    }
+
+  return 0;
+}
+
+
+/* Routine to find the maximal flow:
+   Algorithm:
+   1. Initialize flow to 0
+   2. Find an augmenting path form source to sink.
+   3. Send flow equal to the path's residual capacity along the edges of this path.
+   4. Repeat steps 2 and 3 until no new augmenting path is found.
+   
+Parameters:
+SOURCE: index of source vertex (input)
+SINK: index of sink vertex    (input)
+FIXUP_GRAPH: adjacency matrix representing the graph. The flow of the edges will be
+             set to have a valid maximal flow by this routine. (input)
+Return: Maximum flow possible.  */
+
+static gcov_type
+find_max_flow (fixup_graph_type *fixup_graph, int source, int sink)
+{
+  int fnum_edges;
+  augmenting_path_type augmenting_path;
+  int *bb_pred;
+  gcov_type max_flow = 0;
+  int i, u;
+  fixup_edge_p fedge_list, pfedge, r_pfedge;
+
+  gcc_assert (fixup_graph);
+
+  fnum_edges = fixup_graph->num_edges;
+  fedge_list = fixup_graph->edge_list;
+
+  /* Initialize flow to 0.  */
+  for (i = 0; i < fnum_edges; i++)
+    {
+      pfedge = fedge_list + i;
+      pfedge->flow = 0;
+    }
+
+  compute_residual_flow (fixup_graph);
+
+  init_augmenting_path (&augmenting_path, fixup_graph->num_vertices);
+
+  bb_pred = augmenting_path.bb_pred;
+  while (find_augmenting_path (fixup_graph, &augmenting_path, source, sink))
+    {
+      /* Determine the amount by which we can increment the flow.  */
+      gcov_type increment = CAP_INFINITY;
+      for (u = sink; u != source; u = bb_pred[u])
+	{
+	  pfedge = find_fixup_edge (fixup_graph, bb_pred[u], u);
+	  increment = MIN (increment, pfedge->rflow);
+	}
+      max_flow += increment;
+
+      /* Now increment the flow. EXIT vertex index is 1.  */
+      for (u = sink; u != source; u = bb_pred[u])
+	{
+	  pfedge = find_fixup_edge (fixup_graph, bb_pred[u], u);
+	  r_pfedge = find_fixup_edge (fixup_graph, u, bb_pred[u]);
+	  if (pfedge->type)
+	    {
+	      /* forward edge.  */
+	      pfedge->flow += increment;
+	      pfedge->rflow -= increment;
+	      r_pfedge->rflow += increment;
+	    }
+	  else
+	    {
+	      /* backward edge.  */
+	      gcc_assert (r_pfedge->type);
+	      r_pfedge->rflow += increment;
+	      r_pfedge->flow -= increment;
+	      pfedge->rflow -= increment;
+	    }
+	}
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "\nDump augmenting path:\n");
+	  for (u = sink; u != source; u = bb_pred[u])
+	    {
+	      print_basic_block (dump_file, fixup_graph, u);
+	      fprintf (dump_file, "<-");
+	    }
+	  fprintf (dump_file,
+		   "ENTRY  (path_capacity=" HOST_WIDEST_INT_PRINT_DEC ")\n",
+		   increment);
+	  fprintf (dump_file,
+		   "Network flow is " HOST_WIDEST_INT_PRINT_DEC ".\n",
+		   max_flow);
+	}
+    }
+
+  free_augmenting_path (&augmenting_path);
+  if (dump_file)
+    dump_fixup_graph (dump_file, fixup_graph, "After find_max_flow()");
+  return max_flow;
+}
+
+
+/* Computes the corrected edge and basic block weights using FIXUP_GRAPH
+   after applying the find_minimum_cost_flow() routine.  */
+
+static void
+adjust_cfg_counts (fixup_graph_type *fixup_graph)
+{
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+  int i, j;
+  fixup_edge_p pfedge, pfedge_n;
+
+  gcc_assert (fixup_graph);
+
+  if (dump_file)
+    fprintf (dump_file, "\nadjust_cfg_counts():\n");
+
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
+    {
+      i = 2 * bb->index;
+
+      /* Fixup BB.  */
+      if (dump_file)
+        fprintf (dump_file,
+                 "BB%d: " HOST_WIDEST_INT_PRINT_DEC "", bb->index, bb->count);
+
+      pfedge = find_fixup_edge (fixup_graph, i, i + 1);
+      if (pfedge->flow)
+        {
+          bb->count += pfedge->flow;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, " + " HOST_WIDEST_INT_PRINT_DEC "(",
+	               pfedge->flow);
+	      print_edge (dump_file, fixup_graph, i, i + 1);
+	      fprintf (dump_file, ")");
+	    }
+        }
+
+      pfedge_n =
+        find_fixup_edge (fixup_graph, i + 1, pfedge->norm_vertex_index);
+      /* Deduct flow from normalized reverse edge.  */
+      if (pfedge->norm_vertex_index && pfedge_n->flow)
+        {
+          bb->count -= pfedge_n->flow;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, " - " HOST_WIDEST_INT_PRINT_DEC "(",
+		       pfedge_n->flow);
+	      print_edge (dump_file, fixup_graph, i + 1,
+			  pfedge->norm_vertex_index);
+	      fprintf (dump_file, ")");
+	    }
+        }
+      if (dump_file)
+        fprintf (dump_file, " = " HOST_WIDEST_INT_PRINT_DEC "\n", bb->count);
+
+      /* Fixup edge.  */
+      FOR_EACH_EDGE (e, ei, bb->succs)
+        {
+          /* Treat edges with ignore attribute set as if they don't exist.  */
+          if (EDGE_INFO (e) && EDGE_INFO (e)->ignore)
+	    continue;
+
+          j = 2 * e->dest->index;
+          if (dump_file)
+	    fprintf (dump_file, "%d->%d: " HOST_WIDEST_INT_PRINT_DEC "",
+		     bb->index, e->dest->index, e->count);
+
+          pfedge = find_fixup_edge (fixup_graph, i + 1, j);
+
+          if (bb->index != e->dest->index)
+	    {
+	      /* Non-self edge.  */
+	      if (pfedge->flow)
+	        {
+	          e->count += pfedge->flow;
+	          if (dump_file)
+		    {
+		      fprintf (dump_file, " + " HOST_WIDEST_INT_PRINT_DEC "(",
+			       pfedge->flow);
+		      print_edge (dump_file, fixup_graph, i + 1, j);
+		      fprintf (dump_file, ")");
+		    }
+	        }
+
+	      pfedge_n =
+	        find_fixup_edge (fixup_graph, j, pfedge->norm_vertex_index);
+	      /* Deduct flow from normalized reverse edge.  */
+	      if (pfedge->norm_vertex_index && pfedge_n->flow)
+	        {
+	          e->count -= pfedge_n->flow;
+	          if (dump_file)
+		    {
+		      fprintf (dump_file, " - " HOST_WIDEST_INT_PRINT_DEC "(",
+			       pfedge_n->flow);
+		      print_edge (dump_file, fixup_graph, j,
+			          pfedge->norm_vertex_index);
+		      fprintf (dump_file, ")");
+		    }
+	        }
+	    }
+          else
+	    {
+	      /* Handle self edges. Self edge is split with a normalization
+                 vertex. Here i=j.  */
+	      pfedge = find_fixup_edge (fixup_graph, j, i + 1);
+	      pfedge_n =
+	        find_fixup_edge (fixup_graph, i + 1, pfedge->norm_vertex_index);
+	      e->count += pfedge_n->flow;
+	      bb->count += pfedge_n->flow;
+	      if (dump_file)
+	        {
+	          fprintf (dump_file, "(self edge)");
+	          fprintf (dump_file, " + " HOST_WIDEST_INT_PRINT_DEC "(",
+		           pfedge_n->flow);
+	          print_edge (dump_file, fixup_graph, i + 1,
+			      pfedge->norm_vertex_index);
+	          fprintf (dump_file, ")");
+	        }
+	    }
+
+          if (bb->count)
+	    e->probability = REG_BR_PROB_BASE * e->count / bb->count;
+          if (dump_file)
+	    fprintf (dump_file, " = " HOST_WIDEST_INT_PRINT_DEC "\t(%.1f%%)\n",
+		     e->count, e->probability * 100.0 / REG_BR_PROB_BASE);
+        }
+    } 
+
+  ENTRY_BLOCK_PTR->count = sum_edge_counts (ENTRY_BLOCK_PTR->succs); 
+  EXIT_BLOCK_PTR->count = sum_edge_counts (EXIT_BLOCK_PTR->preds);
+
+  /* Compute edge probabilities.  */
+  FOR_ALL_BB (bb)
+    {
+      if (bb->count)
+        {
+          FOR_EACH_EDGE (e, ei, bb->succs)
+            e->probability = REG_BR_PROB_BASE * e->count / bb->count;
+        }
+      else
+        {
+          int total = 0;
+          FOR_EACH_EDGE (e, ei, bb->succs)
+            if (!(e->flags & (EDGE_COMPLEX | EDGE_FAKE)))
+              total++;
+          if (total)
+            {
+              FOR_EACH_EDGE (e, ei, bb->succs)
+                {
+                  if (!(e->flags & (EDGE_COMPLEX | EDGE_FAKE)))
+                    e->probability = REG_BR_PROB_BASE / total;
+                  else
+                    e->probability = 0;
+                }
+            }
+          else
+            {
+              total += EDGE_COUNT (bb->succs);
+              FOR_EACH_EDGE (e, ei, bb->succs)
+                  e->probability = REG_BR_PROB_BASE / total;
+            }
+        }
+    }
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "\nCheck %s() CFG flow conservation:\n",
+           lang_hooks.decl_printable_name (current_function_decl, 2));
+      FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+        {
+          if ((bb->count != sum_edge_counts (bb->preds))
+               || (bb->count != sum_edge_counts (bb->succs)))
+            {
+              fprintf (dump_file,
+                       "BB%d(" HOST_WIDEST_INT_PRINT_DEC ")  **INVALID**: ",
+                       bb->index, bb->count);
+              fprintf (stderr,
+                       "******** BB%d(" HOST_WIDEST_INT_PRINT_DEC
+                       ")  **INVALID**: \n", bb->index, bb->count);
+              fprintf (dump_file, "in_edges=" HOST_WIDEST_INT_PRINT_DEC " ",
+                       sum_edge_counts (bb->preds));
+              fprintf (dump_file, "out_edges=" HOST_WIDEST_INT_PRINT_DEC "\n",
+                       sum_edge_counts (bb->succs));
+            }
+         }
+    }
+}
+
+
+/* Implements the negative cycle canceling algorithm to compute a minimum cost
+   flow.
+Algorithm:
+1. Find maximal flow.
+2. Form residual network
+3. Repeat:
+  While G contains a negative cost cycle C, reverse the flow on the found cycle
+  by the minimum residual capacity in that cycle.
+4. Form the minimal cost flow
+  f(u,v) = rf(v, u)
+Input:
+  FIXUP_GRAPH - Initial fixup graph.
+  The flow field is modified to represent the minimum cost flow.  */
+
+static void
+find_minimum_cost_flow (fixup_graph_type *fixup_graph)
+{
+  /* Holds the index of predecessor in path.  */
+  int *pred;
+  /* Used to hold the minimum cost cycle.  */
+  int *cycle;
+  /* Used to record the number of iterations of cancel_negative_cycle.  */
+  int iteration;
+  /* Vector d[i] holds the minimum cost of path from i to sink.  */
+  gcov_type *d;
+  int fnum_vertices;
+  int new_exit_index;
+  int new_entry_index;
+
+  gcc_assert (fixup_graph);
+  fnum_vertices = fixup_graph->num_vertices;
+  new_exit_index = fixup_graph->new_exit_index;
+  new_entry_index = fixup_graph->new_entry_index;
+
+  find_max_flow (fixup_graph, new_entry_index, new_exit_index);
+
+  /* Initialize the structures for find_negative_cycle().  */
+  pred = (int *) xcalloc (fnum_vertices, sizeof (int));
+  d = (gcov_type *) xcalloc (fnum_vertices, sizeof (gcov_type));
+  cycle = (int *) xcalloc (fnum_vertices, sizeof (int));
+
+  /* Repeatedly find and cancel negative cost cycles, until
+     no more negative cycles exist. This also updates the flow field
+     to represent the minimum cost flow so far.  */
+  iteration = 0;
+  while (cancel_negative_cycle (fixup_graph, pred, d, cycle))
+    {
+      iteration++;
+      if (iteration > MAX_ITER (fixup_graph->num_vertices,
+                                fixup_graph->num_edges))
+        break;
+    }
+
+  if (dump_file)
+    dump_fixup_graph (dump_file, fixup_graph,
+		      "After find_minimum_cost_flow()");
+
+  /* Cleanup structures.  */
+  free (pred);
+  free (d);
+  free (cycle);
+}
+
+
+/* Compute the sum of the edge counts in TO_EDGES.  */
+
+gcov_type
+sum_edge_counts (VEC (edge, gc) *to_edges)
+{
+  gcov_type sum = 0;
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, to_edges)
+    {
+      if (EDGE_INFO (e) && EDGE_INFO (e)->ignore)
+        continue;
+      sum += e->count;
+    }
+  return sum;
+}
+
+
+/* Main routine. Smoothes the intial assigned basic block and edge counts using
+   a minimum cost flow algorithm, to ensure that the flow consistency rule is
+   obeyed: sum of outgoing edges = sum of incoming edges for each basic
+   block.  */
+
+void
+mcf_smooth_cfg (void)
+{
+  fixup_graph_type fixup_graph;
+  memset (&fixup_graph, 0, sizeof (fixup_graph));
+  create_fixup_graph (&fixup_graph);
+  find_minimum_cost_flow (&fixup_graph);
+  adjust_cfg_counts (&fixup_graph);
+  delete_fixup_graph (&fixup_graph);
+}
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 138218)
+++ ChangeLog	(working copy)
@@ -1,3 +1,23 @@
+2008-07-29 Paul Yuan  <yingbo.com@gmail.com>
+	   Vinodha Ramasamy <vinodha@google.com>
+	
+	* cgraph.c (cgraph_edge): Handle inconsistent counts when setting
+	count_scale.
+        * value-prof.c (check_counter): Fix the counter if 
+        flag_profile_correction is true.
+        (tree_divmod_fixed_value_transform, tree_mod_pow2_value_transform,
+        tree_mod_subtract_transform):
+        Follow check_counter parameter change.
+	* common.opt (fprofile-correction): New option.
+        * mcf.c: New file.
+        * profile.c (edge_info, EDGE_INFO): Moved to new file profile.h.
+        (sum_edge_counts, is_edge_inconsistent, correct_negative_edge_counts,
+	is_inconsistent, set_bb_counts, read_profile_edge_counts): New
+	functions.
+	(compute_branch_probabilities): Refactored. Invokes mcf_smooth_cfg if
+	flag_profile_correction is set.
+
+=======
 2008-07-28  Richard Guenther  <rguenther@suse.de>
 
 	PR tree-optimization/36957
Index: profile.c
===================================================================
--- profile.c	(revision 138218)
+++ profile.c	(working copy)
@@ -69,21 +69,11 @@ along with GCC; see the file COPYING3.  
 #include "cfgloop.h"
 #include "tree-pass.h"
 
+#include "profile.h"
+
 /* Hooks for profiling.  */
 static struct profile_hooks* profile_hooks;
 
-/* Additional information about the edges we need.  */
-struct edge_info {
-  unsigned int count_valid : 1;
-
-  /* Is on the spanning tree.  */
-  unsigned int on_tree : 1;
-
-  /* Pretend this edge does not exist (it is abnormal and we've
-     inserted a fake to compensate).  */
-  unsigned int ignore : 1;
-};
-
 struct bb_info {
   unsigned int count_valid : 1;
 
@@ -92,7 +82,6 @@ struct bb_info {
   gcov_type pred_count;
 };
 
-#define EDGE_INFO(e)  ((struct edge_info *) (e)->aux)
 #define BB_INFO(b)  ((struct bb_info *) (b)->aux)
 
 
@@ -124,7 +113,6 @@ static gcov_type * get_exec_counts (void
 static basic_block find_group (basic_block);
 static void union_groups (basic_block, basic_block);
 
-
 /* Add edge instrumentation code to the entire insn chain.
 
    F is the first insn of the chain.
@@ -278,64 +266,84 @@ get_exec_counts (void)
 
   return counts;
 }
-
 
-/* Compute the branch probabilities for the various branches.
-   Annotate them accordingly.  */
+
+static bool
+is_edge_inconsistent (VEC(edge,gc) *edges)
+{
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, edges)
+    {
+      if (!EDGE_INFO (e)->ignore)
+        {
+          if (e->count < 0)
+            return true;
+        }
+    }
+  return false;
+}
 
 static void
-compute_branch_probabilities (void)
+correct_negative_edge_counts (void)
 {
   basic_block bb;
-  int i;
-  int num_edges = 0;
-  int changes;
-  int passes;
-  int hist_br_prob[20];
-  int num_never_executed;
-  int num_branches;
-  gcov_type *exec_counts = get_exec_counts ();
-  int exec_counts_pos = 0;
+  edge e;
+  edge_iterator ei;
 
-  /* Very simple sanity checks so we catch bugs in our profiling code.  */
-  if (profile_info)
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
     {
-      if (profile_info->run_max * profile_info->runs < profile_info->sum_max)
-	{
-	  error ("corrupted profile info: run_max * runs < sum_max");
-	  exec_counts = NULL;
-	}
+      FOR_EACH_EDGE (e, ei, bb->succs)
+        {
+           if (e->count < 0)
+             e->count = 0;
+        }
+    }
+}
 
-      if (profile_info->sum_all < profile_info->sum_max)
-	{
-	  error ("corrupted profile info: sum_all is smaller than sum_max");
-	  exec_counts = NULL;
-	}
+/* Check consistency.
+   Return true if inconsistency is found.  */
+static bool
+is_inconsistent (void)
+{
+  basic_block bb;
+  FOR_EACH_BB (bb)
+    {
+      if (is_edge_inconsistent (bb->preds))
+        return true;
+      if (is_edge_inconsistent (bb->succs))
+        return true;
+      if ( bb->count != sum_edge_counts (bb->preds)
+         || (bb->count != sum_edge_counts (bb->succs) &&
+             !(find_edge (bb, EXIT_BLOCK_PTR) != NULL &&
+               block_ends_with_call_p (bb))))
+        return true;
     }
 
-  /* Attach extra info block to each bb.  */
+  return false;
+}
 
-  alloc_aux_for_blocks (sizeof (struct bb_info));
+/* Set each basic block count to the sum of its outgoing edge counts */
+static void
+set_bb_counts (void)
+{
+  basic_block bb;
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
     {
-      edge e;
-      edge_iterator ei;
-
-      FOR_EACH_EDGE (e, ei, bb->succs)
-	if (!EDGE_INFO (e)->ignore)
-	  BB_INFO (bb)->succ_count++;
-      FOR_EACH_EDGE (e, ei, bb->preds)
-	if (!EDGE_INFO (e)->ignore)
-	  BB_INFO (bb)->pred_count++;
+      bb->count = sum_edge_counts (bb->succs);
+      gcc_assert (bb->count >= 0);
     }
+}
 
-  /* Avoid predicting entry on exit nodes.  */
-  BB_INFO (EXIT_BLOCK_PTR)->succ_count = 2;
-  BB_INFO (ENTRY_BLOCK_PTR)->pred_count = 2;
-
+/* Reads profile data and returns total number of edge counts read */
+static int
+read_profile_edge_counts (gcov_type *exec_counts)
+{
+  basic_block bb;
+  int num_edges = 0;
+  int exec_counts_pos = 0;
   /* For each edge not on the spanning tree, set its execution count from
      the .da file.  */
-
   /* The first count in the .da file is the number of times that the function
      was entered.  This is the exec_count for block zero.  */
 
@@ -373,6 +381,63 @@ compute_branch_probabilities (void)
 	  }
     }
 
+    return num_edges;
+}
+
+/* Compute the branch probabilities for the various branches.
+   Annotate them accordingly.  */
+
+static void
+compute_branch_probabilities (void)
+{
+  basic_block bb;
+  int i;
+  int num_edges = 0;
+  int changes;
+  int passes;
+  int hist_br_prob[20];
+  int num_never_executed;
+  int num_branches;
+  gcov_type *exec_counts = get_exec_counts ();
+  int inconsistent = 0;
+
+  /* Very simple sanity checks so we catch bugs in our profiling code.  */
+  if (profile_info)
+    {
+      if (profile_info->run_max * profile_info->runs < profile_info->sum_max)
+        {
+          error ("corrupted profile info: run_max * runs < sum_max");
+          exec_counts = NULL;
+        }
+
+      if (profile_info->sum_all < profile_info->sum_max)
+        {
+          error ("corrupted profile info: sum_all is smaller than sum_max");
+          exec_counts = NULL;
+        }
+    }
+
+  /* Attach extra info block to each bb.  */
+  alloc_aux_for_blocks (sizeof (struct bb_info));
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
+    {
+      edge e;
+      edge_iterator ei;
+
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (!EDGE_INFO (e)->ignore)
+	  BB_INFO (bb)->succ_count++;
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (!EDGE_INFO (e)->ignore)
+	  BB_INFO (bb)->pred_count++;
+    }
+
+  /* Avoid predicting entry on exit nodes.  */
+  BB_INFO (EXIT_BLOCK_PTR)->succ_count = 2;
+  BB_INFO (ENTRY_BLOCK_PTR)->pred_count = 2;
+
+  num_edges = read_profile_edge_counts (exec_counts);
+
   if (dump_file)
     fprintf (dump_file, "\n%d edge counts read\n", num_edges);
 
@@ -502,6 +567,31 @@ compute_branch_probabilities (void)
       gcc_assert (!BB_INFO (bb)->succ_count && !BB_INFO (bb)->pred_count);
     }
 
+  /* Check for inconsistent basic block counts */
+  inconsistent = is_inconsistent ();
+
+  if (inconsistent)
+   {
+     if (flag_profile_correction)
+       {
+         /* Inconsistency detected. Make it flow-consistent. */
+         static int informed = 0;
+         if (informed == 0)
+           {
+             informed = 1;
+             inform ("correcting inconsistent profile data");
+           }
+         correct_negative_edge_counts ();
+         /* Set bb counts to the sum of the outgoing edge counts */
+         set_bb_counts ();
+         if (dump_file)
+           fprintf (dump_file, "\nCalling mcf_smooth_cfg\n");
+         mcf_smooth_cfg ();
+       }
+     else
+       error ("corrupted profile info: profile data is not flow-consistent");
+   }
+
   /* For every edge, calculate its branch probability and add a reg_note
      to the branch insn to indicate this.  */
 
Index: profile.h
===================================================================
--- profile.h	(revision 0)
+++ profile.h	(revision 0)
@@ -0,0 +1,47 @@
+/* Header file for minimum-cost maximal flow routines used to smooth basic
+   block and edge frequency counts.
+   Copyright (C) 2008
+   Free Software Foundation, Inc.
+   Contributed by Paul Yuan (yingbo.com@gmail.com)
+       and Vinodha Ramasamy (vinodha@google.com).
+
+This file is part of GCC.
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef PROFILE_H
+#define PROFILE_H
+
+/* Additional information about edges. */
+struct edge_info
+{
+  unsigned int count_valid:1;
+
+  /* Is on the spanning tree.  */
+  unsigned int on_tree:1;
+
+  /* Pretend this edge does not exist (it is abnormal and we've
+     inserted a fake to compensate).  */
+  unsigned int ignore:1;
+};
+
+#define EDGE_INFO(e)  ((struct edge_info *) (e)->aux)
+
+/* Smoothes the initial assigned basic block and edge counts using
+   a minimum cost flow algorithm. */
+extern void mcf_smooth_cfg (void);
+
+extern gcov_type sum_edge_counts (VEC (edge, gc) *edges);
+
+#endif /* PROFILE_H */
Index: common.opt
===================================================================
--- common.opt	(revision 138218)
+++ common.opt	(working copy)
@@ -817,6 +817,10 @@ Common Joined RejectNegative
 Set the top-level directory for storing the profile data.
 The default is 'pwd'.
 
+fprofile-correction
+Common Report Var(flag_profile_correction)
+Enable correction of flow inconsistent profile data input
+
 fprofile-generate
 Common
 Enable common options for generating profile info for profile feedback directed optimizations
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 138218)
+++ Makefile.in	(working copy)
@@ -1122,6 +1122,7 @@ OBJS-common = \
 	loop-unroll.o \
 	loop-unswitch.o \
 	lower-subreg.o \
+	mcf.o \
 	mode-switching.o \
 	modulo-sched.o \
 	omega.o \
@@ -2725,7 +2726,9 @@ var-tracking.o : var-tracking.c $(CONFIG
 profile.o : profile.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(TREE_H) $(FLAGS_H) output.h $(REGS_H) $(EXPR_H) $(FUNCTION_H) \
    $(TOPLEV_H) $(COVERAGE_H) $(TREE_FLOW_H) value-prof.h cfghooks.h \
-   $(CFGLOOP_H) $(TIMEVAR_H) tree-pass.h
+   $(CFGLOOP_H) $(TIMEVAR_H) tree-pass.h profile.h
+mcf.o : mcf.c profile.h $(CONFIG_H) $(SYSTEM_H) $(TM_H) coretypes.h \
+   $(BASIC_BLOCK_H) output.h langhooks.h $(GCOV_IO_H) $(TREE_H) 
 tree-profile.o : tree-profile.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
    $(TM_H) $(RTL_H) $(TREE_H) $(FLAGS_H) output.h $(REGS_H) $(EXPR_H) \
    $(FUNCTION_H) $(TOPLEV_H) $(COVERAGE_H) $(TREE_H) value-prof.h $(TREE_DUMP_H) \
@@ -3221,7 +3224,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/inp
   $(srcdir)/emit-rtl.c $(srcdir)/except.c $(srcdir)/explow.c $(srcdir)/expr.c \
   $(srcdir)/function.c $(srcdir)/except.h \
   $(srcdir)/gcse.c $(srcdir)/integrate.c $(srcdir)/lists.c $(srcdir)/optabs.c \
-  $(srcdir)/profile.c $(srcdir)/regclass.c \
+  $(srcdir)/profile.c $(srcdir)/regclass.c $(srcdir)/mcf.c \
   $(srcdir)/reg-stack.c $(srcdir)/cfglayout.c $(srcdir)/cfglayout.h \
   $(srcdir)/sdbout.c $(srcdir)/stor-layout.c \
   $(srcdir)/stringpool.c $(srcdir)/tree.c $(srcdir)/varasm.c \

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]