This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[rtlopt] Loop histograms


Hello,

this patch adds possibility to measure histograms of loop iterations
and use them to decide loop peeling. I am commiting it to rtlopt-branch.

Zdenek

Changelog:
	* Makefile.in (profile.o): Add hashtab.h, cfgloop.h and params.h
	dependency.
	(cfgloop.o): Add flags.h dependency.
	(loop-init.o): Add profile.h dependency.
	* basic_block.h (struct edge_def): Add loop_histogram field.
	(dfs_enumerate_from): Moved from cfgloop.h.
	* cfg.c (dump_edge_info): Dump info about histograms.
	* cfgcleanup.c (try_simplify_condjump, try_forward_edges, merge_blocks):
	Keep histograms alive.
	* cfglayout.c (cleanup_unconditional_jumps, cfg_layout_duplicate_bb):
	Keep histograms alive.
	* cfgloop.c: Include flags.h
	(flow_loop_dump): Dump histograms.
	(flow_loop_free): Free histogram.
	(redirect_edge_with_latch_update, make_forwarder_block): Keep histograms
	alive.
	(flow_loops_find): Modified.
	(verify_loop_structure): Removed argument flags (use loops->state instead).
	Verify loop histograms.
	(move_histograms_to_loops, copy_histogram, add_histogram, free_histogram):
	New functions.
	(loop_latch_edge, loop_preheader_edge): Declare parameter as const.
	* cfgloop.h (struct loop): Add histogram field.
	(struct loop_histogram): New.
	(LOOPS_HAVE_PREHEADERS, LOOPS_HAVE_SIMPLE_LATCHES,
	LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS, LOOPS_HAVE_HISTOGRAMS_ON_EDGES):
	New flags for loops->state.
	(struct loops): Add state field.
	(VLS_EXPECT_PREHEADERS, VLS_EXPECT_SIMPLE_LATCHES,
	VLS_EXPECT_MARKED_IRREDUCIBLE_LOOPS, VLS_FOR_LOOP): Removed.
	(dfs_enumerate_from): Declaration moved to basic_block.h.
	(loop_preheader_edge, loop_latch_edge, verify_loop_structure): Declaration
	changed.
	(move_histograms_to_loops, loop_histogram, add_histogram, free_histogram):
	Declare.
	(DLTHE_PROB_UPDATING, DLTHE_USE_HISTOGRAM_PROB, DLTHE_USE_WONT_EXIT):
	New flags.
	(DLTHE_FLAG_ALL): Removed.
	* cfgloopanal.c (mark_irreducible_loops): Set loops->state.
	* cfgloopmanip.c (duplicate_loop, duplicate_subloops, copy_loops_to):
	Update histograms.
	(scale_loop_histograms): New.
	(duplicate_loop_to_header_edge): Use histograms for updating counts.
	(create_preheader): Update loop information correctly.
	(create_preheaders, force_single_succ_latches): Set loops->state.
	(loop_split_edge_with): Keep histograms alive.
	* cfgrtl.c (merge_blocks_nomove): Keep histograms alive.
	* flags.h (flag_loop_histograms): Declare.
	* gcov-io.h (GCOV_TAG_LOOP_HISTOGRAMS, GCOV_SUMMARY_LENGTH): New.
	(struct function_info): Add n_loop_histogram_counters field.
	(struct gcov_info): Add histogram_counts and n_histogram_counts fields.
	* libgcc2.c (gcov_exit, __gcov_flush): Write out histograms.
	* loop-init.c: Include profile.h.
	(loop_optimizer_init): Move histograms to loops.
	* loop-unroll.c (unroll_and_peel_loops): Modified.
	(peel_loop_completely, unroll_loop_constant_iterations,
	unroll_loop_constant_iterations, unroll_loop_runtime_iterations,
	peel_loop_simple, unroll_loop_stupid): Update/use histograms.
	(unroll_or_peel_loop): Use histograms to decide peeling.
	* loop-unswitch.c (unswitch_loops, unswitch_loop): Modified.
	* params.def (PARAM_HISTOGRAM_PEEL_RATIO): New.
	* profile.c: Include cfgloop.h, params.h and hashtab.h.
	(struct function_list): Add histogram_counters field.
	(loop_histograms_label): New.
	(gen_loop_profiler, instrument_loops, compute_loop_histograms,
	htab_counts_index_hash, htab_counts_index_eq, htab_counts_index_del,
	cleanup_counts_index, index_counts_file, get_histogram_counts): New.
	(get_exec_counts, instrument_edges, init_branch_prob,
	create_profiler): Modified.
	(branch_prob): Read loop histograms and add code to generate them.
	* profile.h (struct profile_info): Add count_histogram_counters,
	count_histogram_counters_now and have_loop_histograms fields.
	* toplev.c (flag_loop_histograms): New.
	* tracer.c (find_trace): Prevent loop peeling.
	* doc/invoke.texi: Document -floop-histograms.

Index: Makefile.in
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.937.2.5
diff -c -3 -p -r1.937.2.5 Makefile.in
*** Makefile.in	30 Oct 2002 22:08:10 -0000	1.937.2.5
--- Makefile.in	8 Nov 2002 12:14:13 -0000
*************** conflict.o : conflict.c $(CONFIG_H) $(SY
*** 1548,1554 ****
  profile.o : profile.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(TREE_H) flags.h \
     insn-config.h output.h $(REGS_H) $(EXPR_H) function.h \
     gcov-io.h gcov-iov.h toplev.h $(GGC_H) hard-reg-set.h $(BASIC_BLOCK_H) \
!    $(TARGET_H) langhooks.h profile.h libfuncs.h gt-profile.h
  loop.o : loop.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) flags.h $(LOOP_H) \
     insn-config.h $(REGS_H) hard-reg-set.h $(RECOG_H) $(EXPR_H) \
     real.h $(PREDICT_H) $(BASIC_BLOCK_H) function.h \
--- 1548,1555 ----
  profile.o : profile.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(TREE_H) flags.h \
     insn-config.h output.h $(REGS_H) $(EXPR_H) function.h \
     gcov-io.h gcov-iov.h toplev.h $(GGC_H) hard-reg-set.h $(BASIC_BLOCK_H) \
!    $(TARGET_H) langhooks.h profile.h libfuncs.h gt-profile.h cfgloop.h \
!    params.h $(HASHTAB_H)
  loop.o : loop.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) flags.h $(LOOP_H) \
     insn-config.h $(REGS_H) hard-reg-set.h $(RECOG_H) $(EXPR_H) \
     real.h $(PREDICT_H) $(BASIC_BLOCK_H) function.h \
*************** cfgcleanup.o : cfgcleanup.c $(CONFIG_H) 
*** 1576,1588 ****
     $(BASIC_BLOCK_H) hard-reg-set.h output.h flags.h $(RECOG_H) toplev.h \
     $(GGC_H) insn-config.h cselib.h $(TARGET_H) $(TM_P_H)
  cfgloop.o : cfgloop.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
!    $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h
  cfgloopanal.o : cfgloopanal.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h $(EXPR_H)
  cfgloopmanip.o : cfgloopmanip.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h output.h
  loop-init.o : loop-init.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
!    $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h 
  loop-unswitch.o : loop-unswitch.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h params.h \
     output.h $(expr.h)
--- 1577,1589 ----
     $(BASIC_BLOCK_H) hard-reg-set.h output.h flags.h $(RECOG_H) toplev.h \
     $(GGC_H) insn-config.h cselib.h $(TARGET_H) $(TM_P_H)
  cfgloop.o : cfgloop.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
!    $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h flags.h
  cfgloopanal.o : cfgloopanal.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h $(EXPR_H)
  cfgloopmanip.o : cfgloopmanip.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h output.h
  loop-init.o : loop-init.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
!    $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h profile.h
  loop-unswitch.o : loop-unswitch.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
     $(BASIC_BLOCK_H) hard-reg-set.h cfgloop.h cfglayout.h params.h \
     output.h $(expr.h)
Index: basic-block.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/basic-block.h,v
retrieving revision 1.158.2.2
diff -c -3 -p -r1.158.2.2 basic-block.h
*** basic-block.h	19 Oct 2002 15:20:23 -0000	1.158.2.2
--- basic-block.h	8 Nov 2002 12:14:13 -0000
*************** typedef struct edge_def {
*** 130,135 ****
--- 130,138 ----
    /* Instructions queued on the edge.  */
    rtx insns;
  
+   /* Histogram of loop whose latch is this edge.  */
+   struct loop_histogram *loop_histogram;
+ 
    /* Auxiliary info specific to a pass.  */
    void *aux;
  
*************** typedef struct edge_def {
*** 155,160 ****
--- 158,164 ----
  /* Declared in cfgloop.h.  */
  struct loop;
  struct loops;
+ struct loop_histogram;
  
  /* A basic block is a sequence of instructions with only entry and
     only one exit.  If any one of the instructions are executed, they
*************** extern void tidy_fallthru_edges		PARAMS 
*** 364,369 ****
--- 368,376 ----
  extern void flow_reverse_top_sort_order_compute	PARAMS ((int *));
  extern int flow_depth_first_order_compute	PARAMS ((int *, int *));
  extern void flow_preorder_transversal_compute	PARAMS ((int *));
+ extern int dfs_enumerate_from		PARAMS ((basic_block, int,
+ 						bool (*)(basic_block, void *),
+ 						basic_block *, int, void *));
  extern void dump_edge_info		PARAMS ((FILE *, edge, int));
  extern void clear_edges			PARAMS ((void));
  extern void mark_critical_edges		PARAMS ((void));
Index: cfg.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cfg.c,v
retrieving revision 1.35
diff -c -3 -p -r1.35 cfg.c
*** cfg.c	10 Aug 2002 18:00:54 -0000	1.35
--- cfg.c	8 Nov 2002 12:14:13 -0000
*************** dump_edge_info (file, e, do_succ)
*** 633,638 ****
--- 633,639 ----
       int do_succ;
  {
    basic_block side = (do_succ ? e->dest : e->src);
+   int comma = 0;
  
    if (side == ENTRY_BLOCK_PTR)
      fputs (" ENTRY", file);
*************** dump_edge_info (file, e, do_succ)
*** 650,660 ****
        fprintf (file, HOST_WIDEST_INT_PRINT_DEC, e->count);
      }
  
!   if (e->flags)
      {
        static const char * const bitnames[]
  	= {"fallthru", "ab", "abcall", "eh", "fake", "dfs_back", "can_fallthru"};
-       int comma = 0;
        int i, flags = e->flags;
  
        fputs (" (", file);
--- 651,660 ----
        fprintf (file, HOST_WIDEST_INT_PRINT_DEC, e->count);
      }
  
!   if (e->flags || e->loop_histogram)
      {
        static const char * const bitnames[]
  	= {"fallthru", "ab", "abcall", "eh", "fake", "dfs_back", "can_fallthru"};
        int i, flags = e->flags;
  
        fputs (" (", file);
*************** dump_edge_info (file, e, do_succ)
*** 671,676 ****
--- 671,683 ----
  	      fprintf (file, "%d", i);
  	    comma = 1;
  	  }
+   
+       if (e->loop_histogram)
+ 	{
+ 	  if (comma)
+ 	    fputc (',', file);
+ 	  fputs ("carries loop histogram", file);
+ 	}
  
        fputc (')', file);
      }
Index: cfgcleanup.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cfgcleanup.c,v
retrieving revision 1.63.2.1
diff -c -3 -p -r1.63.2.1 cfgcleanup.c
*** cfgcleanup.c	16 Oct 2002 16:07:08 -0000	1.63.2.1
--- cfgcleanup.c	8 Nov 2002 12:14:13 -0000
*************** try_simplify_condjump (cbranch_block)
*** 124,129 ****
--- 124,130 ----
    basic_block jump_block, jump_dest_block, cbranch_dest_block;
    edge cbranch_jump_edge, cbranch_fallthru_edge;
    rtx cbranch_insn;
+   struct loop_histogram *histogram;
  
    /* Verify that there are exactly two successors.  */
    if (!cbranch_block->succ
*************** try_simplify_condjump (cbranch_block)
*** 168,179 ****
--- 169,182 ----
    /* Success.  Update the CFG to match.  Note that after this point
       the edge variable names appear backwards; the redirection is done
       this way to preserve edge profile data.  */
+   histogram = jump_block->succ->loop_histogram;
    cbranch_jump_edge = redirect_edge_succ_nodup (cbranch_jump_edge,
  						cbranch_dest_block);
    cbranch_fallthru_edge = redirect_edge_succ_nodup (cbranch_fallthru_edge,
  						    jump_dest_block);
    cbranch_jump_edge->flags |= EDGE_FALLTHRU;
    cbranch_fallthru_edge->flags &= ~EDGE_FALLTHRU;
+   cbranch_fallthru_edge->loop_histogram = histogram;
    update_br_prob_note (cbranch_block);
  
    /* Delete the block with the unconditional jump, and clean up the mess.  */
*************** try_forward_edges (mode, b)
*** 448,453 ****
--- 451,460 ----
  	      /* Bypass trivial infinite loops.  */
  	      if (target == target->succ->dest)
  		counter = n_basic_blocks;
+ 	      /* Do not try to thread over loop latches when we have loop histograms
+     		 to keep (to avoid creating loops with shared headers).  */
+ 	      if (target->succ->loop_histogram)
+ 		break;
  	      new_target = target->succ->dest;
  	    }
  
*************** try_forward_edges (mode, b)
*** 456,463 ****
--- 463,475 ----
  	  else if (mode & CLEANUP_THREADING)
  	    {
  	      edge t = thread_jump (mode, e, target);
+ 
  	      if (t)
  		{
+ 		  /* Do not try to thread over loop latches when we have loop histograms
+ 		     to keep (to avoid creating loops with shared headers).  */
+ 		  if (t->loop_histogram)
+ 		    break;
  		  if (!threaded_edges)
  		    threaded_edges = xmalloc (sizeof (*threaded_edges)
  					      * n_basic_blocks);
*************** try_forward_edges (mode, b)
*** 546,566 ****
  	  int edge_probability = e->probability;
  	  int edge_frequency;
  	  int n = 0;
  
  	  /* Don't force if target is exit block.  */
  	  if (threaded && target != EXIT_BLOCK_PTR)
  	    {
! 	      notice_new_block (redirect_edge_and_branch_force (e, target));
  	      if (rtl_dump_file)
  		fprintf (rtl_dump_file, "Conditionals threaded.\n");
  	    }
! 	  else if (!redirect_edge_and_branch (e, target))
  	    {
! 	      if (rtl_dump_file)
! 		fprintf (rtl_dump_file,
! 			 "Forwarding edge %i->%i to %i failed.\n",
! 			 b->index, e->dest->index, target->index);
! 	      continue;
  	    }
  
  	  /* We successfully forwarded the edge.  Now update profile
--- 558,583 ----
  	  int edge_probability = e->probability;
  	  int edge_frequency;
  	  int n = 0;
+ 	  basic_block jump;
  
  	  /* Don't force if target is exit block.  */
  	  if (threaded && target != EXIT_BLOCK_PTR)
  	    {
! 	      jump = redirect_edge_and_branch_force (e, target);
! 	      notice_new_block (jump);
  	      if (rtl_dump_file)
  		fprintf (rtl_dump_file, "Conditionals threaded.\n");
  	    }
! 	  else
  	    {
! 	      if (!redirect_edge_and_branch (e, target))
! 		{
! 		  if (rtl_dump_file)
! 		    fprintf (rtl_dump_file,
! 			     "Forwarding edge %i->%i to %i failed.\n",
! 			     b->index, e->dest->index, target->index);
! 		  continue;
! 		}
  	    }
  
  	  /* We successfully forwarded the edge.  Now update profile
*************** merge_blocks (e, b, c, mode)
*** 793,798 ****
--- 810,820 ----
       basic_block b, c;
       int mode;
  {
+   /* Forbid creation of shared headers in case we have loop histograms attached.  */
+   if (e->loop_histogram
+       && (!e->src->pred || e->src->pred->pred_next || e->src->pred->loop_histogram))
+     return false;
+ 
    /* If C has a tail recursion label, do not merge.  There is no
       edge recorded from the call_placeholder back to this label, as
       that would make optimize_sibling_and_tail_recursive_calls more
Index: cfglayout.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cfglayout.c,v
retrieving revision 1.21.4.2
diff -c -3 -p -r1.21.4.2 cfglayout.c
*** cfglayout.c	19 Oct 2002 15:20:24 -0000	1.21.4.2
--- cfglayout.c	8 Nov 2002 12:14:13 -0000
*************** cleanup_unconditional_jumps (loops)
*** 620,625 ****
--- 620,627 ----
       struct loops *loops;
  {
    basic_block bb;
+   struct loop_histogram *histogram;
+   edge e;
  
    FOR_EACH_BB (bb)
      {
*************** cleanup_unconditional_jumps (loops)
*** 639,644 ****
--- 641,647 ----
  		fprintf (rtl_dump_file, "Removing forwarder BB %i\n",
  			 bb->index);
  
+ 	      histogram = bb->succ->loop_histogram;
  	      if (loops)
  		{
  		  /* bb cannot be loop header, as it only has one entry
*************** cleanup_unconditional_jumps (loops)
*** 658,664 ****
  		  delete_from_dominance_info (loops->cfg.dom, bb);
  		}
  
! 	      redirect_edge_succ_nodup (bb->pred, bb->succ->dest);
  	      flow_delete_block (bb);
  	      bb = prev;
  	    }
--- 661,668 ----
  		  delete_from_dominance_info (loops->cfg.dom, bb);
  		}
  
! 	      e = redirect_edge_succ_nodup (bb->pred, bb->succ->dest);
! 	      e->loop_histogram = histogram;
  	      flow_delete_block (bb);
  	      bb = prev;
  	    }
*************** cfg_layout_duplicate_bb (bb, e)
*** 950,955 ****
--- 954,965 ----
      {
        n = make_edge (new_bb, s->dest, s->flags);
        n->probability = s->probability;
+       /* Duplicate histograms.  */
+       if (s->loop_histogram)
+ 	{
+ 	  n->loop_histogram = copy_histogram (s->loop_histogram, new_count * 10000 / bb->count);
+ 	  add_histogram (s->loop_histogram, n->loop_histogram, -10000);
+ 	}
        if (new_count)
  	/* Take care for overflows!  */
  	n->count = s->count * (new_count * 10000 / bb->count) / 10000;
Index: cfgloop.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cfgloop.c,v
retrieving revision 1.15.4.3
diff -c -3 -p -r1.15.4.3 cfgloop.c
*** cfgloop.c	30 Oct 2002 22:08:12 -0000	1.15.4.3
--- cfgloop.c	8 Nov 2002 12:14:13 -0000
*************** Software Foundation, 59 Temple Place - S
*** 25,30 ****
--- 25,31 ----
  #include "basic-block.h"
  #include "toplev.h"
  #include "cfgloop.h"
+ #include "flags.h"
  
  /* Ratio of frequencies of edges so that one of more latch edges is
     considered to belong to inner loop with same header.  */
*************** flow_loop_dump (loop, file, loop_dump_au
*** 113,119 ****
--- 114,122 ----
       int verbose;
  {
    basic_block *bbs;
+   edge e;
    int i;
+   struct loop_histogram *histogram = NULL;
  
    if (! loop || ! loop->header)
      return;
*************** flow_loop_dump (loop, file, loop_dump_au
*** 128,133 ****
--- 131,160 ----
  	   loop->depth, loop->level,
  	   (long) (loop->outer ? loop->outer->num : -1));
  
+   if (loop->num)
+     {
+       e = loop_latch_edge (loop);
+       if (e->loop_histogram)
+ 	histogram = e->loop_histogram;
+       if (loop->histogram)
+ 	histogram = loop->histogram;
+     }
+   if (histogram)
+     {
+       fprintf (file, ";; histogram of iterations:");
+       for (i = 0; i < histogram->steps; i++)
+ 	{
+ 	  fprintf (file, " %d : ", i);
+ 	  fprintf (file, HOST_WIDEST_INT_PRINT_DEC,
+ 		   (HOST_WIDEST_INT) histogram->counts[i]);
+ 	  fprintf (file, ";");
+ 	}
+       fprintf (file, " more ");
+       fprintf (file, HOST_WIDEST_INT_PRINT_DEC,
+ 	       (HOST_WIDEST_INT) histogram->more);
+       fprintf (file, "\n");
+     }
+ 
    if (loop->pre_header_edges)
      flow_edge_list_print (";;  pre-header edges", loop->pre_header_edges,
  			  loop->num_pre_header_edges, file);
*************** flow_loop_free (loop)
*** 194,199 ****
--- 221,228 ----
      free (loop->exit_edges);
    if (loop->pred)
      free (loop->pred);
+   if (loop->histogram)
+     free_histogram (loop->histogram);
    free (loop);
  }
  
*************** redirect_edge_with_latch_update (e, to)
*** 584,589 ****
--- 613,620 ----
        alloc_aux_for_edge (jump->pred, sizeof (int));
        LATCH_EDGE (jump->succ) = LATCH_EDGE (e);
        LATCH_EDGE (jump->pred) = 0;
+       jump->succ->loop_histogram = e->loop_histogram;
+       e->loop_histogram = NULL;
      }
  }
  
*************** make_forwarder_block (bb, redirect_latch
*** 605,610 ****
--- 636,642 ----
    edge e, next_e, fallthru;
    basic_block dummy;
    rtx insn;
+   struct loop_histogram *histogram = NULL;
  
    insn = PREV_INSN (first_insn_after_basic_block_note (bb));
  
*************** make_forwarder_block (bb, redirect_latch
*** 620,625 ****
--- 652,676 ----
    HEADER_BLOCK (dummy) = 0;
    HEADER_BLOCK (bb) = 1;
  
+   /* Merge histograms.  */
+   if (conn_latch)
+     {
+       histogram = NULL;
+       for (e = dummy->pred; e; e = e->pred_next)
+ 	if (e->loop_histogram && e != except)
+ 	  {
+ 	    if (histogram)
+ 	      {
+ 		add_histogram (histogram, e->loop_histogram, REG_BR_PROB_BASE);
+ 		free_histogram (e->loop_histogram);
+ 	      }
+ 	    else
+ 	      histogram = e->loop_histogram;
+ 	    e->loop_histogram = NULL;
+ 	  }
+       fallthru->loop_histogram = histogram;
+     }
+ 
    /* Redirect back edges we want to keep.  */
    for (e = dummy->pred; e; e = next_e)
      {
*************** flow_loops_find (loops, flags)
*** 928,936 ****
        loops->cfg.dom = NULL;
        free_dominance_info (dom);
      }
  #ifdef ENABLE_CHECKING
    verify_flow_info ();
!   verify_loop_structure (loops, 0);
  #endif
  
    return loops->num;
--- 979,989 ----
        loops->cfg.dom = NULL;
        free_dominance_info (dom);
      }
+ 
+   loops->state = 0;
  #ifdef ENABLE_CHECKING
    verify_flow_info ();
!   verify_loop_structure (loops);
  #endif
  
    return loops->num;
*************** cancel_loop_tree (loops, loop)
*** 1119,1135 ****
       -- loop header have just single entry edge and single latch edge
       -- loop latches have only single successor that is header of their loop
       -- irreducible loops are correctly marked
    */
  void
! verify_loop_structure (loops, flags)
       struct loops *loops;
-      int flags;
  {
    int *sizes, i, j;
    sbitmap irreds;
    basic_block *bbs, bb;
    struct loop *loop;
    int err = 0;
  
    /* Check sizes.  */
    sizes = xcalloc (loops->num, sizeof (int));
--- 1172,1189 ----
       -- loop header have just single entry edge and single latch edge
       -- loop latches have only single successor that is header of their loop
       -- irreducible loops are correctly marked
+      -- loop histograms are in loop headers
    */
  void
! verify_loop_structure (loops)
       struct loops *loops;
  {
    int *sizes, i, j;
    sbitmap irreds;
    basic_block *bbs, bb;
    struct loop *loop;
    int err = 0;
+   edge e;
  
    /* Check sizes.  */
    sizes = xcalloc (loops->num, sizeof (int));
*************** verify_loop_structure (loops, flags)
*** 1179,1192 ****
        if (!loop)
  	continue;
  
!       if ((flags & VLS_EXPECT_PREHEADERS)
  	  && (!loop->header->pred->pred_next
  	      || loop->header->pred->pred_next->pred_next))
  	{
  	  error ("Loop %d's header does not have exactly 2 entries.", i);
  	  err = 1;
  	}
!       if (flags & VLS_EXPECT_SIMPLE_LATCHES)
  	{
  	  if (!loop->latch->succ
  	      || loop->latch->succ->succ_next)
--- 1233,1246 ----
        if (!loop)
  	continue;
  
!       if ((loops->state & LOOPS_HAVE_PREHEADERS)
  	  && (!loop->header->pred->pred_next
  	      || loop->header->pred->pred_next->pred_next))
  	{
  	  error ("Loop %d's header does not have exactly 2 entries.", i);
  	  err = 1;
  	}
!       if (loops->state & LOOPS_HAVE_SIMPLE_LATCHES)
  	{
  	  if (!loop->latch->succ
  	      || loop->latch->succ->succ_next)
*************** verify_loop_structure (loops, flags)
*** 1213,1219 ****
      }
  
    /* Check irreducible loops.  */
!   if (flags & VLS_EXPECT_MARKED_IRREDUCIBLE_LOOPS)
      {
        /* Record old info.  */
        irreds = sbitmap_alloc (last_basic_block);
--- 1267,1273 ----
      }
  
    /* Check irreducible loops.  */
!   if (loops->state & LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS)
      {
        /* Record old info.  */
        irreds = sbitmap_alloc (last_basic_block);
*************** verify_loop_structure (loops, flags)
*** 1245,1258 ****
        free (irreds);
      }
  
    if (err)
      abort ();
  }
  
  /* Returns latch edge of LOOP.  */
  edge
  loop_latch_edge (loop)
!      struct loop *loop;
  {
    edge e;
  
--- 1299,1401 ----
        free (irreds);
      }
  
+   /* Check loop histograms.  */
+   if (loops->state & LOOPS_HAVE_HISTOGRAMS_ON_EDGES)
+     {
+       FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
+ 	{
+ 	  for (e = bb->succ; e; e = e->succ_next)
+ 	    if (e->loop_histogram
+ 		&& (e->dest->loop_father->header != e->dest
+ 		    || e->dest->loop_father->latch != bb))
+ 	      {
+ 		error ("Lost histogram on edge from %d to %d.", e->src->index, e->dest->index);
+ 		err = 1;
+ 	      }
+ 	}
+     }
+ 
    if (err)
      abort ();
  }
  
+ /* Move histograms for LOOPS from latch edges to loop datastructure.  */
+ void
+ move_histograms_to_loops (loops)
+      struct loops *loops;
+ {
+   int i;
+ 
+   for (i = 1; i < loops->num; i++)
+     {
+       struct loop *loop = loops->parray[i];
+       edge e;
+ 
+       if (!loop)
+ 	continue;
+ 
+       e = loop_latch_edge (loop);
+ 
+       loop->histogram = e->loop_histogram;
+       e->loop_histogram = NULL;
+     }
+   loops->state &= ~LOOPS_HAVE_HISTOGRAMS_ON_EDGES;
+ }
+ 
+ /* Copies loop HISTOGRAM, scaling it by PROB.  */
+ struct loop_histogram *
+ copy_histogram (histogram, prob)
+      struct loop_histogram *histogram;
+      int prob;
+ {
+   int i, steps;
+   struct loop_histogram *retval;
+ 
+   retval = xmalloc (sizeof (struct loop_histogram));
+   steps = retval->steps = histogram->steps;
+   retval->counts = xmalloc (sizeof (gcov_type) * steps);
+   for (i = 0; i < steps; i++)
+     retval->counts[i] = histogram->counts[i] * prob /REG_BR_PROB_BASE;
+   retval->more = histogram->more * prob /REG_BR_PROB_BASE;
+ 
+   return retval;
+ }
+ 
+ /* Add HISTOGRAM to TARGET, scaling it by PROB.  */
+ void
+ add_histogram (target, histogram, prob)
+      struct loop_histogram *target;
+      struct loop_histogram *histogram;
+      int prob;
+ {
+   int i;
+ 
+   if (target->steps > histogram->steps)
+     {
+       for (i = histogram->steps; i < target->steps; i++)
+ 	target->more += target->counts[i];
+       target->steps = histogram->steps;
+     }
+   for (i = 0; i < target->steps; i++)
+     target->counts[i] += histogram->counts[i] * prob /REG_BR_PROB_BASE;
+   for (; i < histogram->steps; i++)
+     target->more += histogram->counts[i] * prob /REG_BR_PROB_BASE;
+   target->more += histogram->more * prob /REG_BR_PROB_BASE;
+ }
+ 
+ /* Free the HISTOGRAM.  */
+ void
+ free_histogram (histogram)
+      struct loop_histogram *histogram;
+ {
+   free (histogram->counts);
+   free (histogram);
+ }
+ 
  /* Returns latch edge of LOOP.  */
  edge
  loop_latch_edge (loop)
!      const struct loop *loop;
  {
    edge e;
  
*************** loop_latch_edge (loop)
*** 1265,1271 ****
  /* Returns preheader edge of LOOP.  */
  edge
  loop_preheader_edge (loop)
!      struct loop *loop;
  {
    edge e;
  
--- 1408,1414 ----
  /* Returns preheader edge of LOOP.  */
  edge
  loop_preheader_edge (loop)
!      const struct loop *loop;
  {
    edge e;
  
Index: cfgloop.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/cfgloop.h,v
retrieving revision 1.1.2.2
diff -c -3 -p -r1.1.2.2 cfgloop.h
*** cfgloop.h	19 Oct 2002 16:58:07 -0000	1.1.2.2
--- cfgloop.h	8 Nov 2002 12:14:13 -0000
*************** struct loop
*** 34,39 ****
--- 34,42 ----
    /* Basic block of loop pre-header or NULL if it does not exist.  */
    basic_block pre_header;
  
+   /* Histogram for a loop.  */
+   struct loop_histogram *histogram;
+ 
    /* Array of edges along the pre-header extended basic block trace.
       The source of the first edge is the root node of pre-header
       extended basic block, if it exists.  */
*************** struct loop
*** 144,149 ****
--- 147,168 ----
    int exit_count;
  };
  
+ /* Histogram of a loop.  */
+ struct loop_histogram
+ {
+   int steps;
+   gcov_type *counts;
+   gcov_type more;
+ };
+ 
+ /* Flags for state of loop structure.  */
+ enum
+ {
+   LOOPS_HAVE_PREHEADERS = 1,
+   LOOPS_HAVE_SIMPLE_LATCHES = 2,
+   LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS = 4,
+   LOOPS_HAVE_HISTOGRAMS_ON_EDGES = 8
+ };
  
  /* Structure to hold CFG information about natural loops within a function.  */
  struct loops
*************** struct loops
*** 182,187 ****
--- 201,209 ----
  
    /* Headers shared by multiple loops that should be merged.  */
    sbitmap shared_headers;
+ 
+   /* State of loops.  */
+   int state;
  };
  
  /* Description of loop for simple loop unrolling.  */
*************** struct loop_desc
*** 213,228 ****
  #define LOOP_EDGES		(LOOP_ENTRY_EDGES | LOOP_EXIT_EDGES)
  #define LOOP_ALL	       15	/* All of the above  */
  
- /* Flags for verify_loop_structure.  */
- enum
- {
-   VLS_EXPECT_PREHEADERS = 1,
-   VLS_EXPECT_SIMPLE_LATCHES = 2,
-   VLS_EXPECT_MARKED_IRREDUCIBLE_LOOPS = 4,
-   VLS_FOR_LOOP = VLS_EXPECT_PREHEADERS | VLS_EXPECT_SIMPLE_LATCHES
- 		     | VLS_EXPECT_MARKED_IRREDUCIBLE_LOOPS
- };
- 
  /* Loop recognition.  */
  extern int flow_loops_find		PARAMS ((struct loops *, int flags));
  extern int flow_loops_update		PARAMS ((struct loops *, int flags));
--- 235,240 ----
*************** extern int num_loop_insns		PARAMS ((stru
*** 250,261 ****
  
  /* Loops & cfg manipulation.  */
  extern basic_block *get_loop_body	PARAMS ((const struct loop *));
- extern int dfs_enumerate_from		PARAMS ((basic_block, int,
- 						bool (*)(basic_block, void *),
- 						basic_block *, int, void *));
  
! extern edge loop_preheader_edge		PARAMS ((struct loop *));
! extern edge loop_latch_edge		PARAMS ((struct loop *));
  
  extern void add_bb_to_loop		PARAMS ((basic_block, struct loop *));
  extern void remove_bb_from_loops	PARAMS ((basic_block));
--- 262,270 ----
  
  /* Loops & cfg manipulation.  */
  extern basic_block *get_loop_body	PARAMS ((const struct loop *));
  
! extern edge loop_preheader_edge		PARAMS ((const struct loop *));
! extern edge loop_latch_edge		PARAMS ((const struct loop *));
  
  extern void add_bb_to_loop		PARAMS ((basic_block, struct loop *));
  extern void remove_bb_from_loops	PARAMS ((basic_block));
*************** enum
*** 275,281 ****
  extern void create_preheaders		PARAMS ((struct loops *, int));
  extern void force_single_succ_latches	PARAMS ((struct loops *));
  
! extern void verify_loop_structure	PARAMS ((struct loops *, int));
  
  /* Loop analysis.  */
  extern bool simple_loop_p		PARAMS ((struct loops *, struct loop *,
--- 284,295 ----
  extern void create_preheaders		PARAMS ((struct loops *, int));
  extern void force_single_succ_latches	PARAMS ((struct loops *));
  
! extern void move_histograms_to_loops	PARAMS ((struct loops *));
! extern struct loop_histogram *copy_histogram PARAMS ((struct loop_histogram *, int));
! extern void add_histogram		PARAMS ((struct loop_histogram *, struct loop_histogram *, int));
! extern void free_histogram		PARAMS ((struct loop_histogram *));
! 
! extern void verify_loop_structure	PARAMS ((struct loops *));
  
  /* Loop analysis.  */
  extern bool simple_loop_p		PARAMS ((struct loops *, struct loop *,
*************** extern int expected_loop_iterations	PARA
*** 289,295 ****
  extern bool can_duplicate_loop_p	PARAMS ((struct loop *loop));
  
  #define DLTHE_FLAG_UPDATE_FREQ		1
! #define DLTHE_FLAG_ALL			(DLTHE_FLAG_UPDATE_FREQ)
  extern int duplicate_loop_to_header_edge PARAMS ((struct loop *, edge,
  						struct loops *, int,
  						sbitmap, edge, edge *,
--- 303,311 ----
  extern bool can_duplicate_loop_p	PARAMS ((struct loop *loop));
  
  #define DLTHE_FLAG_UPDATE_FREQ		1
! #define DLTHE_PROB_UPDATING(X)		(X & 6)
! #define DLTHE_USE_HISTOGRAM_PROB	0
! #define DLTHE_USE_WONT_EXIT		2
  extern int duplicate_loop_to_header_edge PARAMS ((struct loop *, edge,
  						struct loops *, int,
  						sbitmap, edge, edge *,
Index: cfgloopanal.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/cfgloopanal.c,v
retrieving revision 1.1.4.3
diff -c -3 -p -r1.1.4.3 cfgloopanal.c
*** cfgloopanal.c	27 Oct 2002 20:27:07 -0000	1.1.4.3
--- cfgloopanal.c	8 Nov 2002 12:14:13 -0000
*************** next:;
*** 958,963 ****
--- 958,964 ----
      free (edges[i]);
    free (edges);
    free (n_edges);
+   loops->state |= LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS;
  }
  
  /* Counts number of insns inside LOOP.  */
Index: cfgloopmanip.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/cfgloopmanip.c,v
retrieving revision 1.1.2.3
diff -c -3 -p -r1.1.2.3 cfgloopmanip.c
*** cfgloopmanip.c	3 Nov 2002 01:02:36 -0000	1.1.2.3
--- cfgloopmanip.c	8 Nov 2002 12:14:13 -0000
*************** Software Foundation, 59 Temple Place - S
*** 28,38 ****
  #include "output.h"
  
  static struct loop * duplicate_loop	PARAMS ((struct loops *,
! 						struct loop *, struct loop *));
  static void duplicate_subloops		PARAMS ((struct loops *, struct loop *,
! 						struct loop *));
  static void copy_loops_to		PARAMS ((struct loops *, struct loop **,
! 						int, struct loop *));
  static void loop_redirect_edge		PARAMS ((edge, basic_block));
  static bool loop_delete_branch_edge	PARAMS ((edge));
  static void copy_bbs			PARAMS ((basic_block *, int, edge,
--- 28,39 ----
  #include "output.h"
  
  static struct loop * duplicate_loop	PARAMS ((struct loops *,
! 						struct loop *, struct loop *, int));
! static void scale_loop_histograms	PARAMS ((struct loop *, int));
  static void duplicate_subloops		PARAMS ((struct loops *, struct loop *,
! 						struct loop *, int));
  static void copy_loops_to		PARAMS ((struct loops *, struct loop **,
! 						int, struct loop *, int));
  static void loop_redirect_edge		PARAMS ((edge, basic_block));
  static bool loop_delete_branch_edge	PARAMS ((edge));
  static void copy_bbs			PARAMS ((basic_block *, int, edge,
*************** fix_loop_placements (loop)
*** 589,594 ****
--- 590,609 ----
      }
  }
  
+ /* Scales histograms of LOOP and subloops by PROB.  */
+ static void
+ scale_loop_histograms (loop, prob)
+      struct loop *loop;
+      int prob;
+ {
+   struct loop *aloop;
+ 
+   for (aloop = loop->inner; aloop; aloop = aloop->next)
+     scale_loop_histograms (aloop, prob);
+   if (loop->histogram)
+     add_histogram (loop->histogram, loop->histogram, prob - REG_BR_PROB_BASE);
+ }
+ 
  /* Creates place for a new LOOP.  */
  static void
  place_new_loop (loops, loop)
*************** place_new_loop (loops, loop)
*** 602,613 ****
    loop->num = loops->num++;
  }
  
! /* Copies structure of LOOP into TARGET.  */
  static struct loop *
! duplicate_loop (loops, loop, target)
       struct loops *loops;
       struct loop *loop;
       struct loop *target;
  {
    struct loop *cloop;
    cloop = xcalloc (1, sizeof (struct loop));
--- 617,629 ----
    loop->num = loops->num++;
  }
  
! /* Copies structure of LOOP into TARGET, scaling histogram by PROB.  */
  static struct loop *
! duplicate_loop (loops, loop, target, prob)
       struct loops *loops;
       struct loop *loop;
       struct loop *target;
+      int prob;
  {
    struct loop *cloop;
    cloop = xcalloc (1, sizeof (struct loop));
*************** duplicate_loop (loops, loop, target)
*** 619,661 ****
    /* Set it as copy of loop.  */
    loop->copy = cloop;
  
    /* Add it to target.  */
    flow_loop_tree_node_add (target, cloop);
  
    return cloop;
  }
  
! /* Copies structure of subloops of LOOP into TARGET.  */
  static void 
! duplicate_subloops (loops, loop, target)
       struct loops *loops;
       struct loop *loop;
       struct loop *target;
  {
    struct loop *aloop, *cloop;
  
    for (aloop = loop->inner; aloop; aloop = aloop->next)
      {
!       cloop = duplicate_loop (loops, aloop, target);
!       duplicate_subloops (loops, aloop, cloop);
      }
  }
  
  /*  Copies structure of COPIED_LOOPS into TARGET.  */
  static void 
! copy_loops_to (loops, copied_loops, n, target)
       struct loops *loops;
       struct loop **copied_loops;
       int n;
       struct loop *target;
  {
    struct loop *aloop;
    int i;
  
    for (i = 0; i < n; i++)
      {
!       aloop = duplicate_loop (loops, copied_loops[i], target);
!       duplicate_subloops (loops, copied_loops[i], aloop);
      }
  }
  
--- 635,683 ----
    /* Set it as copy of loop.  */
    loop->copy = cloop;
  
+   /* Scale histogram.  */
+   if (loop->histogram)
+     cloop->histogram = copy_histogram (loop->histogram, prob);
+ 
    /* Add it to target.  */
    flow_loop_tree_node_add (target, cloop);
  
    return cloop;
  }
  
! /* Copies structure of subloops of LOOP into TARGET; histograms are scaled by PROB.  */
  static void 
! duplicate_subloops (loops, loop, target, prob)
       struct loops *loops;
       struct loop *loop;
       struct loop *target;
+      int prob;
  {
    struct loop *aloop, *cloop;
  
    for (aloop = loop->inner; aloop; aloop = aloop->next)
      {
!       cloop = duplicate_loop (loops, aloop, target, prob);
!       duplicate_subloops (loops, aloop, cloop, prob);
      }
  }
  
  /*  Copies structure of COPIED_LOOPS into TARGET.  */
  static void 
! copy_loops_to (loops, copied_loops, n, target, prob)
       struct loops *loops;
       struct loop **copied_loops;
       int n;
       struct loop *target;
+      int prob;
  {
    struct loop *aloop;
    int i;
  
    for (i = 0; i < n; i++)
      {
!       aloop = duplicate_loop (loops, copied_loops[i], target, prob);
!       duplicate_subloops (loops, copied_loops[i], aloop, prob);
      }
  }
  
*************** record_exit_edges (orig, bbs, nbbs, to_r
*** 901,908 ****
     to original LOOP body) into TO_REMOVE array.  Returns false if duplication
     is impossible.  */
  int
! duplicate_loop_to_header_edge (loop, e, loops, ndupl,
! 			       wont_exit, orig, to_remove, n_to_remove, flags)
       struct loop *loop;
       edge e;
       struct loops *loops;
--- 923,930 ----
     to original LOOP body) into TO_REMOVE array.  Returns false if duplication
     is impossible.  */
  int
! duplicate_loop_to_header_edge (loop, e, loops, ndupl, wont_exit, orig,
! 			       to_remove, n_to_remove, flags)
       struct loop *loop;
       edge e;
       struct loops *loops;
*************** duplicate_loop_to_header_edge (loop, e, 
*** 922,930 ****
    edge ae, latch_edge, he;
    int i, j, n;
    int is_latch = (latch == e->src);
!   int scale_act, scale_step, scale_main, p, freq_in, freq_le;
!   int prob_pass_thru;
    int add_irreducible_flag;
  
    if (e->dest != loop->header)
      abort ();
--- 944,953 ----
    edge ae, latch_edge, he;
    int i, j, n;
    int is_latch = (latch == e->src);
!   int scale_act, *scale_step, scale_main, p, freq_in, freq_le, freq_out_orig, hsteps;
!   int prob_pass_thru, prob_pass_wont_exit, prob_pass_main;
    int add_irreducible_flag;
+   gcov_type all_counters, iterations;
  
    if (e->dest != loop->header)
      abort ();
*************** duplicate_loop_to_header_edge (loop, e, 
*** 958,1004 ****
    /* Find edge from latch.  */
    latch_edge = loop_latch_edge (loop);
  
!   /* For updating frequencies.  */
!   freq_in = header->frequency;
!   freq_le = EDGE_FREQUENCY (latch_edge);
!   if (freq_in == 0)
!     freq_in = 1;
!   if (freq_in < freq_le)
!     freq_in = freq_le;
  
!   prob_pass_thru = RDIV (REG_BR_PROB_BASE * freq_le, freq_in);
!   scale_step = prob_pass_thru;
  
!   if (is_latch)
!     {
!       /* Should work well unless something inside depends on parity of
! 	 iteration counter.  */
!       p = REG_BR_PROB_BASE;
!       scale_main = 0;
!       for (i = 0; i < ndupl + 1; i++)
  	{
! 	  scale_main += p;
! 	  p = RDIV (p * scale_step, REG_BR_PROB_BASE);
  	}
-       scale_main = RDIV (REG_BR_PROB_BASE * REG_BR_PROB_BASE, scale_main);
-       scale_act = RDIV (scale_main * scale_step, REG_BR_PROB_BASE);
-     }
-   else
-     {
-       /* This is wrong; first iteration of cycle is certainly somewhat
- 	 special.  But I cannot do anything with it. */
  
!       scale_main = REG_BR_PROB_BASE;
        for (i = 0; i < ndupl; i++)
! 	scale_main = RDIV (scale_main * scale_step, REG_BR_PROB_BASE);
!       scale_act = REG_BR_PROB_BASE - scale_step;
!     }
!   if (scale_main < 0 || scale_main > REG_BR_PROB_BASE ||
!       scale_act < 0  || scale_act > REG_BR_PROB_BASE ||
!       scale_step < 0 || scale_step > REG_BR_PROB_BASE)
!     {
!       /* Something is wrong.  */
!       abort ();
      }
  
    /* Loop the new bbs will belong to.  */
--- 981,1083 ----
    /* Find edge from latch.  */
    latch_edge = loop_latch_edge (loop);
  
!   if (flags & DLTHE_FLAG_UPDATE_FREQ)
!     {
!       /* For updating frequencies.  */
!       freq_in = header->frequency;
!       freq_le = EDGE_FREQUENCY (latch_edge);
!       if (freq_in == 0)
! 	freq_in = 1;
!       if (freq_in < freq_le)
! 	freq_in = freq_le;
!       freq_out_orig = orig ? EDGE_FREQUENCY (orig) : freq_in - freq_le;
!       if (freq_out_orig > freq_in - freq_le)
! 	freq_out_orig = freq_in - freq_le;
!       prob_pass_thru = RDIV (REG_BR_PROB_BASE * freq_le, freq_in);
!       prob_pass_wont_exit = RDIV (REG_BR_PROB_BASE * (freq_le + freq_out_orig), freq_in);
  
!       scale_step = xmalloc (ndupl * sizeof (int));
  
!       switch (DLTHE_PROB_UPDATING (flags))
  	{
! 	  case DLTHE_USE_HISTOGRAM_PROB:
! 	    if (is_latch || !loop->histogram)
! 	      abort ();
! 	    iterations = latch_edge->count;
! 	    all_counters = loop->histogram->more;
! 	    for (i = 0; i < loop->histogram->steps; i++)
! 	      all_counters += loop->histogram->counts[i];
! 	    hsteps = ndupl;
! 	    if (hsteps > loop->histogram->steps)
! 	      hsteps = loop->histogram->steps;
! 	    for (i = 0; i < hsteps; i++)
! 	      {
! 		if (all_counters)
! 		  scale_step[i] = (all_counters - loop->histogram->counts[i]) * REG_BR_PROB_BASE / all_counters;
! 		else
! 		  scale_step[i] = 0;
! 		all_counters -= loop->histogram->counts[i];
! 		iterations -= i * loop->histogram->counts[i];
! 	      }
! 	    if (iterations < 0)
! 	      iterations = 0;
! 	    if (iterations)
! 	      p = iterations * REG_BR_PROB_BASE / (iterations + all_counters);
! 	    else
! 	      p = 0;
! 	    for (; i < ndupl; i++)
! 	      scale_step[i] = p;
! 
! 	    /* Update histogram.  */
! 	    if (ndupl >= loop->histogram->steps)
! 	      {
! 		free (loop->histogram->counts);
! 		free (loop->histogram);
! 		loop->histogram = NULL;
! 	      }
! 	    else
! 	      {
! 		for (i = ndupl; i < loop->histogram->steps; i++)
! 		  loop->histogram->counts[i - ndupl] = loop->histogram->counts[i];
! 		loop->histogram->steps -= ndupl;
! 	      }
! 	    break;
! 
! 	  case DLTHE_USE_WONT_EXIT:
! 	    for (i = 1; i <= ndupl; i++)
! 	      scale_step[i - 1] = TEST_BIT (wont_exit, i) ? prob_pass_wont_exit : prob_pass_thru;
! 	    break;
! 
! 	  default:
! 	    abort ();
  	}
  
!       if (is_latch)
! 	{
! 	  prob_pass_main = TEST_BIT (wont_exit, 0) ? prob_pass_wont_exit : prob_pass_thru;
! 	  p = prob_pass_main;
! 	  scale_main = REG_BR_PROB_BASE;
! 	  for (i = 0; i < ndupl; i++)
! 	    {
! 	      scale_main += p;
! 	      p = RDIV (p * scale_step[i], REG_BR_PROB_BASE);
! 	    }
! 	  scale_main = RDIV (REG_BR_PROB_BASE * REG_BR_PROB_BASE, scale_main);
! 	  scale_act = RDIV (scale_main * prob_pass_main, REG_BR_PROB_BASE);
! 	}
!       else
! 	{
! 	  scale_main = REG_BR_PROB_BASE;
! 	  for (i = 0; i < ndupl; i++)
! 	    scale_main = RDIV (scale_main * scale_step[i], REG_BR_PROB_BASE);
! 	  scale_act = REG_BR_PROB_BASE - prob_pass_thru;
! 	}
        for (i = 0; i < ndupl; i++)
! 	if (scale_step[i] < 0 || scale_step[i] > REG_BR_PROB_BASE)
! 	  abort ();
!       if (scale_main < 0 || scale_main > REG_BR_PROB_BASE
! 	  || scale_act < 0  || scale_act > REG_BR_PROB_BASE)
! 	abort ();
      }
  
    /* Loop the new bbs will belong to.  */
*************** duplicate_loop_to_header_edge (loop, e, 
*** 1031,1037 ****
    for (j = 0; j < ndupl; j++)
      {
        /* Copy loops.  */
!       copy_loops_to (loops, orig_loops, n_orig_loops, target);
  
        /* Copy bbs.  */
        copy_bbs (bbs, n, e, latch_edge, &new_bbs, loops, &e, &he, add_irreducible_flag);
--- 1110,1116 ----
    for (j = 0; j < ndupl; j++)
      {
        /* Copy loops.  */
!       copy_loops_to (loops, orig_loops, n_orig_loops, target, scale_act);
  
        /* Copy bbs.  */
        copy_bbs (bbs, n, e, latch_edge, &new_bbs, loops, &e, &he, add_irreducible_flag);
*************** duplicate_loop_to_header_edge (loop, e, 
*** 1064,1070 ****
      	    ae->count = RDIV (new_bb->count * ae->probability,
  			      REG_BR_PROB_BASE);
  	}
!       scale_act = RDIV (scale_act * scale_step, REG_BR_PROB_BASE);
  
        if (!first_active_latch)
  	{
--- 1143,1150 ----
      	    ae->count = RDIV (new_bb->count * ae->probability,
  			      REG_BR_PROB_BASE);
  	}
!       if (flags & DLTHE_FLAG_UPDATE_FREQ)
! 	scale_act = RDIV (scale_act * scale_step[j], REG_BR_PROB_BASE);
  
        if (!first_active_latch)
  	{
*************** duplicate_loop_to_header_edge (loop, e, 
*** 1086,1093 ****
  	       latch_edge = latch_edge->pred_next);
  	}
      }
-   free (orig_loops);
- 
    /* Now handle original loop.  */
    
    /* Update edge counts.  */
--- 1166,1171 ----
*************** duplicate_loop_to_header_edge (loop, e, 
*** 1101,1107 ****
--- 1179,1189 ----
  	  for (ae = bb->succ; ae; ae = ae->succ_next)
  	    ae->count = RDIV (bb->count * ae->probability, REG_BR_PROB_BASE);
  	}
+       for (i = 0; i < n_orig_loops; i++)
+ 	scale_loop_histograms (orig_loops[i], scale_main);
+       free (scale_step);
      }
+   free (orig_loops);
  
    /* Update dominators of other blocks if affected.  */
    for (i = 0; i < n; i++)
*************** create_preheader (loop, dom, flags)
*** 1211,1216 ****
--- 1293,1299 ----
  	  add_to_dominance_info (dom, jump);
  	  set_immediate_dominator (dom, jump, src);
  	  add_bb_to_loop (jump, loop);
+ 	  loop->latch = jump;
  	}
      }
  
*************** create_preheaders (loops, flags)
*** 1236,1241 ****
--- 1319,1325 ----
    int i;
    for (i = 1; i < loops->num; i++)
      create_preheader (loops->parray[i], loops->cfg.dom, flags);
+   loops->state |= LOOPS_HAVE_PREHEADERS;
  }
  
  /* Forces all loop latches to have only single successor.  */
*************** force_single_succ_latches (loops)
*** 1256,1261 ****
--- 1340,1346 ----
        for (e = loop->header->pred; e->src != loop->latch; e = e->pred_next);
  	loop_split_edge_with (e, NULL_RTX, loops);
      }
+   loops->state |= LOOPS_HAVE_SIMPLE_LATCHES;
  }
  
  /* A quite stupid function to put INSNS on E. They are supposed to form
*************** loop_split_edge_with (e, insns, loops)
*** 1293,1298 ****
--- 1378,1386 ----
    new_e = make_edge (new_bb, dest, EDGE_FALLTHRU);
    new_e->probability = REG_BR_PROB_BASE;
    new_e->count = e->count;
+ 
+   new_e->loop_histogram = e->loop_histogram;
+   e->loop_histogram = NULL;
  
    new_bb->count = e->count;
  
Index: cfgrtl.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cfgrtl.c,v
retrieving revision 1.58.4.1
diff -c -3 -p -r1.58.4.1 cfgrtl.c
*** cfgrtl.c	16 Oct 2002 16:07:08 -0000	1.58.4.1
--- cfgrtl.c	8 Nov 2002 12:14:13 -0000
*************** merge_blocks_nomove (a, b)
*** 544,549 ****
--- 544,550 ----
    rtx del_first = NULL_RTX, del_last = NULL_RTX;
    int b_empty = 0;
    edge e;
+   struct loop_histogram *histogram = NULL;
  
    /* If there was a CODE_LABEL beginning B, delete it.  */
    if (GET_CODE (b_head) == CODE_LABEL)
*************** merge_blocks_nomove (a, b)
*** 607,613 ****
       be merging a TEST block with THEN and ELSE successors.  Free the
       whole lot of them and hope the caller knows what they're doing.  */
    while (a->succ)
!     remove_edge (a->succ);
  
    /* Adjust the edges out of B for the new owner.  */
    for (e = b->succ; e; e = e->succ_next)
--- 608,622 ----
       be merging a TEST block with THEN and ELSE successors.  Free the
       whole lot of them and hope the caller knows what they're doing.  */
    while (a->succ)
!     {
!       if (a->succ->loop_histogram)
! 	{
! 	  if (histogram)
! 	    abort ();
! 	  histogram = a->succ->loop_histogram;
! 	}
!       remove_edge (a->succ);
!     }
  
    /* Adjust the edges out of B for the new owner.  */
    for (e = b->succ; e; e = e->succ_next)
*************** merge_blocks_nomove (a, b)
*** 639,644 ****
--- 648,660 ----
      }
  
    a->end = a_end;
+ 
+   if (histogram)
+     {
+       if (!a->pred || a->pred->pred_next || a->pred->loop_histogram)
+ 	abort ();
+       a->pred->loop_histogram = histogram;
+     }
  }
  
  /* Return the label in the head of basic block BLOCK.  Create one if it doesn't
Index: flags.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/flags.h,v
retrieving revision 1.88.2.2
diff -c -3 -p -r1.88.2.2 flags.h
*** flags.h	23 Oct 2002 15:48:19 -0000	1.88.2.2
--- flags.h	8 Nov 2002 12:14:14 -0000
*************** extern int profile_flag;
*** 197,202 ****
--- 197,206 ----
  
  extern int profile_arc_flag;
  
+ /* Nonzero if generating/using loop histograms.  */
+ 
+ extern int flag_loop_histograms;
+ 
  /* Nonzero if generating info for gcov to calculate line test coverage.  */
  
  extern int flag_test_coverage;
Index: gcov-io.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gcov-io.h,v
retrieving revision 1.16.6.1
diff -c -3 -p -r1.16.6.1 gcov-io.h
*** gcov-io.h	16 Oct 2002 16:07:19 -0000	1.16.6.1
--- gcov-io.h	8 Nov 2002 12:14:14 -0000
*************** typedef long long gcov_type;
*** 175,180 ****
--- 175,181 ----
  #define GCOV_TAG_ARCS		 ((unsigned)0x01430000)
  #define GCOV_TAG_LINES		 ((unsigned)0x01450000)
  #define GCOV_TAG_ARC_COUNTS  	 ((unsigned)0x01a10000)
+ #define GCOV_TAG_LOOP_HISTOGRAMS ((unsigned)0x01a30000)
  #define GCOV_TAG_OBJECT_SUMMARY  ((unsigned)0xa1000000)
  #define GCOV_TAG_PROGRAM_SUMMARY ((unsigned)0xa3000000)
  #define GCOV_TAG_PLACEHOLDER_SUMMARY ((unsigned)0xa5000000)
*************** struct function_info
*** 226,231 ****
--- 227,234 ----
    const char *name;	        /* (mangled) name of function */
    unsigned checksum;		/* function checksum */
    unsigned n_arc_counts;	/* number of instrumented arcs */
+   unsigned n_loop_histogram_counters;
+   				/* number of histogram counters */
  };
  
  /* Information about a single object file.  */
*************** struct gcov_info
*** 242,247 ****
--- 245,252 ----
  
    gcov_type *arc_counts;	/* table of arc counts */
    unsigned n_arc_counts;	/* number of arc counts */
+   gcov_type *histogram_counts;	/* table of loop histogram counters */
+   unsigned n_histogram_counts;	/* number of histogram counts */
  };
  
  /* Register a new object file module.  */
*************** gcov_write_length (file, place)
*** 453,458 ****
--- 458,464 ----
    return result;
  }
  
+ #define GCOV_SUMMARY_LENGTH 44
  static int
  gcov_read_summary (da_file, summary)
       FILE *da_file;
Index: libgcc2.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/libgcc2.c,v
retrieving revision 1.147.2.2
diff -c -3 -p -r1.147.2.2 libgcc2.c
*** libgcc2.c	30 Oct 2002 22:08:17 -0000	1.147.2.2
--- libgcc2.c	8 Nov 2002 12:14:14 -0000
*************** gcov_exit (void)
*** 1311,1318 ****
        int merging = 0;
        long base;
        const struct function_info *fn_info;
!       gcov_type *count_ptr;
        gcov_type object_max_one = 0;
  
        ptr->wkspc = 0;
        if (!ptr->filename)
--- 1311,1320 ----
        int merging = 0;
        long base;
        const struct function_info *fn_info;
!       gcov_type *count_ptr, *histograms_ptr;
        gcov_type object_max_one = 0;
+       gcov_type count;
+       unsigned tag, length, flength, checksum;
  
        ptr->wkspc = 0;
        if (!ptr->filename)
*************** gcov_exit (void)
*** 1354,1360 ****
        if (merging)
  	{
  	  /* Merge data from file.  */
- 	  unsigned tag, length;
  	      
  	  if (gcov_read_unsigned (da_file, &tag) || tag != GCOV_DATA_MAGIC)
  	    {
--- 1356,1361 ----
*************** gcov_exit (void)
*** 1373,1378 ****
--- 1374,1380 ----
  	  
  	  /* Merge execution counts for each function.  */
  	  count_ptr = ptr->arc_counts;
+ 	  histograms_ptr = ptr->histogram_counts;
  	  for (ix = ptr->n_functions, fn_info = ptr->functions;
  	       ix--; fn_info++)
  	    {
*************** gcov_exit (void)
*** 1393,1409 ****
  			   ptr->filename, fn_info->name);
  		  goto read_fatal;
  		}
! 	      {
! 		unsigned flength, checksum;
! 		
! 		if (gcov_read_unsigned (da_file, &flength)
! 		    || gcov_skip_string (da_file, flength)
! 		    || gcov_read_unsigned (da_file, &checksum))
! 		  goto read_error;
! 		if (flength != strlen (fn_info->name)
! 		    || checksum != fn_info->checksum)
! 		  goto read_mismatch;
! 	      }
  	      /* Check arc counts */
  	      if (gcov_read_unsigned (da_file, &tag)
  		  || gcov_read_unsigned (da_file, &length))
--- 1395,1409 ----
  			   ptr->filename, fn_info->name);
  		  goto read_fatal;
  		}
! 
! 	      if (gcov_read_unsigned (da_file, &flength)
! 		  || gcov_skip_string (da_file, flength)
! 		  || gcov_read_unsigned (da_file, &checksum))
! 		goto read_error;
! 	      if (flength != strlen (fn_info->name)
! 		  || checksum != fn_info->checksum)
! 		goto read_mismatch;
! 
  	      /* Check arc counts */
  	      if (gcov_read_unsigned (da_file, &tag)
  		  || gcov_read_unsigned (da_file, &length))
*************** gcov_exit (void)
*** 1411,1425 ****
  	      if (tag != GCOV_TAG_ARC_COUNTS
  		  || length / 8 != fn_info->n_arc_counts)
  		goto read_mismatch;
- 	      {
- 		gcov_type count;
  		
! 		for (jx = fn_info->n_arc_counts; jx--; count_ptr++)
! 		  if (gcov_read_counter (da_file, &count))
! 		    goto read_error;
! 		  else
! 		    *count_ptr += count;
! 	      }
  	    }
  
  	  /* Check object summary */
--- 1411,1436 ----
  	      if (tag != GCOV_TAG_ARC_COUNTS
  		  || length / 8 != fn_info->n_arc_counts)
  		goto read_mismatch;
  		
!       	      for (jx = fn_info->n_arc_counts; jx--; count_ptr++)
! 		if (gcov_read_counter (da_file, &count))
! 		  goto read_error;
! 		else
! 		  *count_ptr += count;
! 
! 	      /* Loop histograms. */ 
! 	      if (gcov_read_unsigned (da_file, &tag)
! 		  || gcov_read_unsigned (da_file, &length))
! 		goto read_error;
! 	      if (tag != GCOV_TAG_LOOP_HISTOGRAMS
! 		  || length / 8 != fn_info->n_loop_histogram_counters)
! 		goto read_mismatch;
! 		
!       	      for (jx = fn_info->n_loop_histogram_counters; jx--; histograms_ptr++)
! 		if (gcov_read_counter (da_file, &count))
! 		  goto read_error;
! 		else
! 		  *histograms_ptr += count;
  	    }
  
  	  /* Check object summary */
*************** gcov_exit (void)
*** 1499,1504 ****
--- 1510,1516 ----
        
        /* Write execution counts for each function.  */
        count_ptr = ptr->arc_counts;
+       histograms_ptr = ptr->histogram_counts;
        for (ix = ptr->n_functions, fn_info = ptr->functions; ix--; fn_info++)
  	{
  	  /* Announce function. */
*************** gcov_exit (void)
*** 1529,1534 ****
--- 1541,1562 ----
  	    }
  	  if (gcov_write_length (da_file, base))
  	    goto write_error;
+ 
+ 	  /* loop histograms.  */
+ 	  if (gcov_write_unsigned (da_file, GCOV_TAG_LOOP_HISTOGRAMS)
+ 	      || !(base = gcov_reserve_length (da_file)))
+ 	    goto write_error;
+ 	  
+ 	  for (jx = fn_info->n_loop_histogram_counters; jx--;)
+ 	    {
+ 	      gcov_type count = *histograms_ptr++;
+ 	      
+ 	      object.arc_sum += count;
+ 	      if (gcov_write_counter (da_file, count))
+ 		goto write_error;
+ 	    }
+ 	  if (gcov_write_length (da_file, base))
+ 	    goto write_error;
  	}
  
        /* Object file summary. */
*************** __gcov_flush (void)
*** 1668,1673 ****
--- 1696,1703 ----
        
        for (i = ptr->n_arc_counts; i--;)
  	ptr->arc_counts[i] = 0;
+       for (i = ptr->n_histogram_counts; i--;)
+ 	ptr->histogram_counts[i] = 0;
      }
  }
  
Index: loop-init.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/loop-init.c,v
retrieving revision 1.1.2.1
diff -c -3 -p -r1.1.2.1 loop-init.c
*** loop-init.c	19 Oct 2002 15:20:23 -0000	1.1.2.1
--- loop-init.c	8 Nov 2002 12:14:14 -0000
*************** Software Foundation, 59 Temple Place - S
*** 26,31 ****
--- 26,32 ----
  #include "basic-block.h"
  #include "cfgloop.h"
  #include "cfglayout.h"
+ #include "profile.h"
  
  /* Initialize loop optimizer.  */
  
*************** loop_optimizer_init (dumpfile)
*** 64,75 ****
    /* Mark irreducible loops.  */
    mark_irreducible_loops (loops);
  
    /* Dump loops.  */
    flow_loops_dump (loops, dumpfile, NULL, 1);
  
  #ifdef ENABLE_CHECKING
    verify_dominators (loops->cfg.dom);
!   verify_loop_structure (loops, VLS_FOR_LOOP);
  #endif
  
    return loops;
--- 65,80 ----
    /* Mark irreducible loops.  */
    mark_irreducible_loops (loops);
  
+   /* Do we have histograms?  */
+   if (profile_info.have_loop_histograms)
+     move_histograms_to_loops (loops);
+ 
    /* Dump loops.  */
    flow_loops_dump (loops, dumpfile, NULL, 1);
  
  #ifdef ENABLE_CHECKING
    verify_dominators (loops->cfg.dom);
!   verify_loop_structure (loops);
  #endif
  
    return loops;
Index: loop-unroll.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/loop-unroll.c,v
retrieving revision 1.1.2.9
diff -c -3 -p -r1.1.2.9 loop-unroll.c
*** loop-unroll.c	3 Nov 2002 01:02:36 -0000	1.1.2.9
--- loop-unroll.c	8 Nov 2002 12:14:14 -0000
*************** unroll_and_peel_loops (loops, flags)
*** 67,73 ****
        unroll_or_peel_loop (loops, loop, flags);
  #ifdef ENABLE_CHECKING
        verify_dominators (loops->cfg.dom);
!       verify_loop_structure (loops, VLS_FOR_LOOP);
  #endif
        loop = next;
      }
--- 67,73 ----
        unroll_or_peel_loop (loops, loop, flags);
  #ifdef ENABLE_CHECKING
        verify_dominators (loops->cfg.dom);
!       verify_loop_structure (loops);
  #endif
        loop = next;
      }
*************** peel_loop_completely (loops, loop, desc)
*** 101,107 ****
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  	loops, npeel + 1,
  	wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 	DLTHE_FLAG_ALL))
      abort ();
  
    free (wont_exit);
--- 101,107 ----
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  	loops, npeel + 1,
  	wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 	DLTHE_FLAG_UPDATE_FREQ | (loop->histogram ? DLTHE_USE_HISTOGRAM_PROB : DLTHE_USE_WONT_EXIT)))
      abort ();
  
    free (wont_exit);
*************** peel_loop_completely (loops, loop, desc)
*** 122,128 ****
    remove_path (loops, e);
  
    if (rtl_dump_file)
!     fprintf (rtl_dump_file, ";; Peeled loop completely, %d times\n",npeel);
  
    return true;
  }
--- 122,128 ----
    remove_path (loops, e);
  
    if (rtl_dump_file)
!     fprintf (rtl_dump_file, ";; Peeled loop completely, %d times\n", (int) npeel);
  
    return true;
  }
*************** unroll_loop_constant_iterations (loops, 
*** 202,208 ****
  	  && !duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, exit_mod,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
  	abort ();
  
        SET_BIT (wont_exit, 1);
--- 202,208 ----
  	  && !duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, exit_mod,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | (loop->histogram ? DLTHE_USE_HISTOGRAM_PROB : DLTHE_USE_WONT_EXIT)))
  	abort ();
  
        SET_BIT (wont_exit, 1);
*************** unroll_loop_constant_iterations (loops, 
*** 227,233 ****
  	  if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, exit_mod + 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
  	    abort ();
  
  	  SET_BIT (wont_exit, 0);
--- 227,233 ----
  	  if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, exit_mod + 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | (loop->histogram ? DLTHE_USE_HISTOGRAM_PROB : DLTHE_USE_WONT_EXIT)))
  	    abort ();
  
  	  SET_BIT (wont_exit, 0);
*************** unroll_loop_constant_iterations (loops, 
*** 241,247 ****
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
  		loops, max_unroll,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
      abort ();
  
    free (wont_exit);
--- 241,247 ----
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
  		loops, max_unroll,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | DLTHE_USE_WONT_EXIT))
      abort ();
  
    free (wont_exit);
*************** unroll_loop_constant_iterations (loops, 
*** 251,256 ****
--- 251,264 ----
      remove_path (loops, remove_edges[i]);
    free (remove_edges);
  
+   if (loop->histogram)
+     {
+       /* We could try to update histogram, but there is no sense as the information
+ 	 is not used any more.  */
+       free_histogram (loop->histogram);
+       loop->histogram = NULL;
+     }
+ 
    if (rtl_dump_file)
      fprintf (rtl_dump_file, ";; Unrolled loop %d times, constant # of iterations %i insns\n",max_unroll, num_loop_insns (loop));
    
*************** unroll_loop_runtime_iterations (loops, l
*** 350,356 ****
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
      abort ();
  
    /* Record the place where switch will be built for preconditioning.  */
--- 358,364 ----
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | DLTHE_USE_WONT_EXIT))
      abort ();
  
    /* Record the place where switch will be built for preconditioning.  */
*************** unroll_loop_runtime_iterations (loops, l
*** 366,372 ****
        if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
      	abort ();
  
        if (i != n_peel)
--- 374,380 ----
        if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
  		loops, 1,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | DLTHE_USE_WONT_EXIT))
      	abort ();
  
        if (i != n_peel)
*************** unroll_loop_runtime_iterations (loops, l
*** 374,379 ****
--- 382,388 ----
  	  /* Create item for switch.  */
  	  j = n_peel - i - (extra_zero_check ? 0 : 1);
  	  p = REG_BR_PROB_BASE / (i + 2);
+ 
  	  preheader = loop_split_edge_with (loop_preheader_edge (loop),
  					    NULL_RTX, loops);
  	  label = block_label (preheader);
*************** unroll_loop_runtime_iterations (loops, l
*** 439,445 ****
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
  	 	loops, max_unroll,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_ALL))
      abort ();
  
    free (wont_exit);
--- 448,454 ----
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
  	 	loops, max_unroll,
  		wont_exit, desc->out_edge, remove_edges, &n_remove_edges,
! 		DLTHE_FLAG_UPDATE_FREQ | DLTHE_USE_WONT_EXIT))
      abort ();
  
    free (wont_exit);
*************** unroll_loop_runtime_iterations (loops, l
*** 449,454 ****
--- 458,471 ----
      remove_path (loops, remove_edges[i]);
    free (remove_edges);
  
+   if (loop->histogram)
+     {
+       /* We could try to update histogram, but there is no sense as the information
+ 	 is not used any more.  */
+       free_histogram (loop->histogram);
+       loop->histogram = NULL;
+     }
+ 
    if (rtl_dump_file)
      fprintf (rtl_dump_file,
  	     ";; Unrolled loop %d times, counting # of iterations in runtime, %i insns\n",
*************** peel_loop_simple (loops, loop, npeel)
*** 470,476 ****
    sbitmap_zero (wont_exit);
  
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
! 		loops, npeel, wont_exit, NULL, NULL, NULL, DLTHE_FLAG_ALL))
      {
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ";; Peeling unsuccessful\n");
--- 487,494 ----
    sbitmap_zero (wont_exit);
  
    if (!duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
! 		loops, npeel, wont_exit, NULL, NULL, NULL,
! 		DLTHE_FLAG_UPDATE_FREQ | (loop->histogram ? DLTHE_USE_HISTOGRAM_PROB : DLTHE_USE_WONT_EXIT)))
      {
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ";; Peeling unsuccessful\n");
*************** unroll_loop_stupid (loops, loop, nunroll
*** 498,504 ****
    sbitmap_zero (wont_exit);
  
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
! 		loops, nunroll, wont_exit, NULL, NULL, NULL, DLTHE_FLAG_ALL))
      {
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ";;  Not unrolling loop, can't duplicate\n");
--- 516,523 ----
    sbitmap_zero (wont_exit);
  
    if (!duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
! 		loops, nunroll, wont_exit, NULL, NULL, NULL,
! 		DLTHE_FLAG_UPDATE_FREQ | DLTHE_USE_WONT_EXIT))
      {
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ";;  Not unrolling loop, can't duplicate\n");
*************** unroll_loop_stupid (loops, loop, nunroll
*** 506,511 ****
--- 525,539 ----
      }
  
    free (wont_exit);
+   
+   if (loop->histogram)
+     {
+       /* We could try to update histogram, but there is no sense as the information
+ 	 is not used any more.  */
+       free_histogram (loop->histogram);
+       loop->histogram = NULL;
+     }
+ 
    if (rtl_dump_file)
      fprintf (rtl_dump_file, ";; Unrolled loop %d times, %i insns\n",
  	     nunroll, num_loop_insns (loop));
*************** unroll_or_peel_loop (loops, loop, flags)
*** 520,527 ****
       struct loop *loop;
       int flags;
  {
!   int ninsns;
!   unsigned HOST_WIDE_INT nunroll, npeel, npeel_completely, peel_once, niter = 0;
    struct loop_desc desc;
    bool simple, exact;
  
--- 548,556 ----
       struct loop *loop;
       int flags;
  {
!   int ninsns, i;
!   unsigned HOST_WIDE_INT niter = 0;
!   unsigned nunroll, npeel, npeel_completely, peel_once;
    struct loop_desc desc;
    bool simple, exact;
  
*************** unroll_or_peel_loop (loops, loop, flags)
*** 644,666 ****
    if (exact)
      {
        /* If estimate is good, use it to decide and bound number of peelings.  */
!       if (niter + 1 > npeel)
  	{
  	  if ((flags & UAP_PEEL) && rtl_dump_file)
! 	    fprintf (rtl_dump_file,
! 		     ";; Not peeling loop, rolls too much (%d iterations > %d [maximum peelings])\n",
! 		     niter + 1, npeel);
  	  flags &= ~UAP_PEEL;
  	}
        if (niter + 1 > npeel_completely)
  	{
  	  if ((flags & UAP_PEEL_COMPLETELY) && rtl_dump_file)
! 	    fprintf (rtl_dump_file,
! 		     ";; Not peeling loop completely, rolls too much (%d iterations > %d [maximum peelings])\n",
! 		     niter + 1, npeel_completely);
  	  flags &= ~UAP_PEEL_COMPLETELY;
  	}
-       npeel = niter + 1;
  
        /* And unrollings.  */
        if (niter < 2 * nunroll)
--- 673,738 ----
    if (exact)
      {
        /* If estimate is good, use it to decide and bound number of peelings.  */
! 
!       /* If we have histogram, use it.  */
!       if (loop->histogram)
! 	{
! 	  gcov_type all_counters, act_counters;
! 	  all_counters = loop->histogram->more;
! 	  for (i = 0; i < loop->histogram->steps; i++)
! 	    all_counters += loop->histogram->counts[i];
! 	  act_counters = 0;
! 	  if (!all_counters)
! 	    {
! 	      if ((flags & UAP_PEEL) && rtl_dump_file)
! 		fprintf (rtl_dump_file, ";; Not peeling loop, never exits\n");
! 	      flags &= ~UAP_PEEL;
! 	    }
! 	  else
! 	    {
! 	      for (i = 0; i < loop->histogram->steps; i++)
! 		{
! 		  if (act_counters * REG_BR_PROB_BASE >= all_counters * PARAM_VALUE (PARAM_HISTOGRAM_PEEL_RATIO))
! 		    break;
! 		  act_counters += loop->histogram->counts[i];
! 		}
! 	      if (act_counters * REG_BR_PROB_BASE < all_counters * PARAM_VALUE (PARAM_HISTOGRAM_PEEL_RATIO))
! 		i = npeel + 1;
! 	      if ((unsigned) i > npeel)
! 		{
! 		  if ((flags & UAP_PEEL) && rtl_dump_file)
! 		    fprintf (rtl_dump_file,
! 		  	     ";; Not peeling loop, rolls too much (%d iterations > %d [maximum peelings])\n",
! 		  	     i, (int) npeel);
! 		  flags &= ~UAP_PEEL;
! 		}
! 	      else
! 		npeel = i;
! 	    }
! 	}
!       else if (niter + 1 > npeel)
  	{
  	  if ((flags & UAP_PEEL) && rtl_dump_file)
! 	    {
! 	      fprintf (rtl_dump_file, ";; Not peeling loop, rolls too much (");
! 	      fprintf (rtl_dump_file, HOST_WIDEST_INT_PRINT_DEC, (HOST_WIDEST_INT) (niter + 1));
! 	      fprintf (rtl_dump_file, " iterations > %d [maximum peelings])\n", npeel);
! 	    }
  	  flags &= ~UAP_PEEL;
  	}
+       else
+        	npeel = niter + 1;
+ 
        if (niter + 1 > npeel_completely)
  	{
  	  if ((flags & UAP_PEEL_COMPLETELY) && rtl_dump_file)
! 	    {
! 	      fprintf (rtl_dump_file, ";; Not peeling loop completely, rolls too much (");
! 	      fprintf (rtl_dump_file, HOST_WIDEST_INT_PRINT_DEC,(HOST_WIDEST_INT) (niter + 1));
! 	      fprintf (rtl_dump_file, "iterations > %d [maximum peelings])\n", npeel_completely);
! 	    }
  	  flags &= ~UAP_PEEL_COMPLETELY;
  	}
  
        /* And unrollings.  */
        if (niter < 2 * nunroll)
Index: loop-unswitch.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/loop-unswitch.c,v
retrieving revision 1.1.2.1
diff -c -3 -p -r1.1.2.1 loop-unswitch.c
*** loop-unswitch.c	19 Oct 2002 15:20:23 -0000	1.1.2.1
--- loop-unswitch.c	8 Nov 2002 12:14:14 -0000
*************** unswitch_loops (loops)
*** 61,67 ****
        unswitch_single_loop (loops, loop, NULL_RTX, 0);
  #ifdef ENABLE_CHECKING
        verify_dominators (loops->cfg.dom);
!       verify_loop_structure (loops, VLS_FOR_LOOP);
  #endif
      }
  }
--- 61,67 ----
        unswitch_single_loop (loops, loop, NULL_RTX, 0);
  #ifdef ENABLE_CHECKING
        verify_dominators (loops->cfg.dom);
!       verify_loop_structure (loops);
  #endif
      }
  }
*************** unswitch_loop (loops, loop, unswitch_on)
*** 290,295 ****
--- 290,296 ----
    struct loop *nloop;
    sbitmap zero_bitmap;
    int irred_flag;
+   int prob;
  
    /* Some sanity checking.  */
    if (!flow_bb_inside_loop_p (loop, unswitch_on))
*************** unswitch_loop (loops, loop, unswitch_on)
*** 342,348 ****
         latch_edge = latch_edge->succ_next);
    nloop = loopify (loops, latch_edge,
  		   RBI (loop->header)->copy->pred, switch_bb);
!  
    /* Remove paths from loop copies.  We rely on the fact that
       cfg_layout_duplicate_bb reverses list of edges here.  */
    for (e = unswitch_on->succ->succ_next->dest->pred; e; e = e->pred_next)
--- 343,356 ----
         latch_edge = latch_edge->succ_next);
    nloop = loopify (loops, latch_edge,
  		   RBI (loop->header)->copy->pred, switch_bb);
! 
!   /* Fix histograms.  */
!   if (loop->histogram)
!     {
!       nloop->histogram = copy_histogram (loop->histogram, prob);
!       add_histogram (loop->histogram, nloop->histogram, -REG_BR_PROB_BASE);
!     }
! 
    /* Remove paths from loop copies.  We rely on the fact that
       cfg_layout_duplicate_bb reverses list of edges here.  */
    for (e = unswitch_on->succ->succ_next->dest->pred; e; e = e->pred_next)
Index: params.def
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/params.def,v
retrieving revision 1.15.6.5
diff -c -3 -p -r1.15.6.5 params.def
*** params.def	1 Nov 2002 19:20:57 -0000	1.15.6.5
--- params.def	8 Nov 2002 12:14:14 -0000
*************** DEFPARAM(PARAM_MAX_UNROLL_TIMES,
*** 155,160 ****
--- 155,165 ----
  	"max-unroll-times",
  	"The maximum number of unrollings of a single loop",
  	8)
+ /* The ratio of iterations that should end in peeled part of a loop.  */
+ DEFPARAM(PARAM_HISTOGRAM_PEEL_RATIO,
+ 	"histogram-peel-radtio",
+ 	"The ratio of iterations that should end in peeled part of a loop",
+ 	9000)
  /* The maximum number of insns of a peeled loop.  */
  DEFPARAM(PARAM_MAX_PEELED_INSNS,
  	"max-peeled-insns",
Index: profile.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/profile.c,v
retrieving revision 1.97.2.1
diff -c -3 -p -r1.97.2.1 profile.c
*** profile.c	16 Oct 2002 16:07:24 -0000	1.97.2.1
--- profile.c	8 Nov 2002 12:14:14 -0000
*************** Software Foundation, 59 Temple Place - S
*** 62,72 ****
--- 62,75 ----
  #include "ggc.h"
  #include "hard-reg-set.h"
  #include "basic-block.h"
+ #include "cfgloop.h"
+ #include "params.h"
  #include "gcov-io.h"
  #include "target.h"
  #include "profile.h"
  #include "libfuncs.h"
  #include "langhooks.h"
+ #include "hashtab.h"
  
  /* Additional information about the edges we need.  */
  struct edge_info {
*************** struct function_list
*** 94,99 ****
--- 97,103 ----
    const char *name; 		/* function name */
    unsigned cfg_checksum;	/* function checksum */
    unsigned count_edges;	        /* number of intrumented edges  */
+   unsigned histogram_counters;	/* number of histogram counters  */
  };
  
  static struct function_list *functions_head = 0;
*************** static char *da_file_name;
*** 125,130 ****
--- 129,137 ----
  /* The name of the count table. Used by the edge profiling code.  */
  static GTY(()) rtx profiler_label;
  
+ /* The name of the loop histograms table.  */
+ static GTY(()) rtx loop_histograms_label;
+ 
  /* Collect statistics on the performance of this pass for the entire source
     file.  */
  
*************** static int total_num_branches;
*** 142,150 ****
--- 149,166 ----
  /* Forward declarations.  */
  static void find_spanning_tree PARAMS ((struct edge_list *));
  static rtx gen_edge_profiler PARAMS ((int));
+ static rtx gen_loop_profiler PARAMS ((rtx, int, int));
  static void instrument_edges PARAMS ((struct edge_list *));
+ static void instrument_loops PARAMS ((struct loops *));
  static void compute_branch_probabilities PARAMS ((void));
+ static void compute_loop_histograms PARAMS ((struct loops *));
+ static hashval_t htab_counts_index_hash PARAMS ((const void *));
+ static int htab_counts_index_eq PARAMS ((const void *, const void *));
+ static void htab_counts_index_del PARAMS ((void *));
+ static void cleanup_counts_index PARAMS ((int));
+ static int index_counts_file PARAMS ((void));
  static gcov_type * get_exec_counts PARAMS ((void));
+ static gcov_type * get_histogram_counts PARAMS ((int, int));
  static unsigned compute_checksum PARAMS ((void));
  static basic_block find_group PARAMS ((basic_block));
  static void union_groups PARAMS ((basic_block, basic_block));
*************** instrument_edges (el)
*** 193,259 ****
    total_num_blocks_created += num_edges;
    if (rtl_dump_file)
      fprintf (rtl_dump_file, "%d edges instrumented\n", num_instr_edges);
  
!   commit_edge_insertions_watch_calls ();
  }
  
  
! /* Computes hybrid profile for all matching entries in da_file.
!    Sets max_counter_in_program as a side effect.
!    FIXME: This is O(nfuncs^2). It should be reorganised to read the da
!    file once. */
  
! static gcov_type *
! get_exec_counts ()
  {
!   unsigned num_edges = 0;
!   basic_block bb;
!   gcov_type *profile;
!   char *function_name_buffer = NULL;
!   gcov_type max_count = 0;
!   gcov_type prog_sum_max = 0;
!   gcov_type prog_runs = 0;
!   unsigned seen_fn = 0; /* 0 = not seen fn, 1 = function now,
! 			   2 = seen fn, 3 = seen prog summaries */
!   unsigned magic, version;
!   unsigned ix;
!   const char *name = IDENTIFIER_POINTER
! 		      (DECL_ASSEMBLER_NAME (current_function_decl));
  
!   profile_info.max_counter_in_program = 0;
!   profile_info.count_profiles_merged = 0;
  
!   /* No .da file, no execution counts.  */
!   if (!da_file)
!     return 0;
  
!   /* Count the edges to be (possibly) instrumented.  */
  
!   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
      {
!       edge e;
!       for (e = bb->succ; e; e = e->succ_next)
! 	if (!EDGE_INFO (e)->ignore && !EDGE_INFO (e)->on_tree)
! 	  num_edges++;
      }
  
!   /* now read and combine all matching profiles.  */
  
-   profile = xmalloc (sizeof (gcov_type) * num_edges);
    rewind (da_file);
  
!   for (ix = 0; ix < num_edges; ix++)
!     profile[ix] = 0;
  
    if (gcov_read_unsigned (da_file, &magic) || magic != GCOV_DATA_MAGIC)
      {
        warning ("`%s' is not a gcov data file", da_file_name);
!     cleanup:;
!       fclose (da_file);
!       da_file = NULL;
!       free (profile);
!       free (function_name_buffer);
!       return 0;
      }
    if (gcov_read_unsigned (da_file, &version) || version != GCOV_VERSION)
      {
--- 209,392 ----
    total_num_blocks_created += num_edges;
    if (rtl_dump_file)
      fprintf (rtl_dump_file, "%d edges instrumented\n", num_instr_edges);
+ }
+ 
+ /* Add code that counts histograms of first iterations of LOOPS.  */
+ static void
+ instrument_loops (loops)
+      struct loops *loops;
+ {
+   enum machine_mode mode = mode_for_size (GCOV_TYPE_SIZE, MODE_INT, 0);
+   rtx *loop_counters;
+   rtx sequence;
+   int i, histogram_steps;
+   basic_block bb;
+   edge e;
+   int n_histogram_counters;
+   
+   histogram_steps = PARAM_VALUE (PARAM_MAX_PEEL_TIMES);
+   if (histogram_steps < PARAM_VALUE (PARAM_MAX_UNROLL_TIMES))
+     histogram_steps = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
+ 
+   loop_counters = xmalloc (sizeof (rtx) * loops->num);
+   for (i = 1; i < loops->num; i++)
+     loop_counters[i] = gen_reg_rtx (mode);
+ 
+   /* First the easy part -- code to initalize counter on preheader edge &
+      to increase it on latch one.  */
+   for (i = 1; i < loops->num; i++)
+     {
+       rtx tmp;
+ 
+       start_sequence ();
+       emit_move_insn (loop_counters[i], const0_rtx);
+       sequence = get_insns ();
+       end_sequence ();
+       insert_insn_on_edge (sequence, loop_preheader_edge (loops->parray[i]));
+ 
+       start_sequence ();
+       tmp = expand_simple_binop (mode, PLUS, loop_counters[i], const1_rtx,
+ 				 loop_counters[i], 0, OPTAB_WIDEN);
+       if (tmp != loop_counters[i])
+ 	emit_move_insn (loop_counters[i], tmp);
+       sequence = get_insns ();
+       end_sequence ();
+       insert_insn_on_edge (sequence, loop_latch_edge (loops->parray[i]));
+     }
+  
+   /* And now emit code to generate the histogram on exit edges. The trouble
+      is that there may be more than one edge leaving the loop and the single
+      edge may exit multiple loops.  The other problem is that the exit edge
+      may be abnormal & critical; in this case we just ignore it.  */
  
!   FOR_EACH_BB (bb)
!     {
!       for (e = bb->succ; e; e = e->succ_next)
! 	{
! 	  struct loop *src_loop, *dest_loop, *loop;
! 
! 	  if ((e->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (e))
! 	    continue;
! 
! 	  src_loop = e->src->loop_father;
! 	  dest_loop = find_common_loop (src_loop, e->dest->loop_father);
! 
! 	  for (loop = src_loop; loop != dest_loop; loop = loop->outer)
! 	    insert_insn_on_edge (gen_loop_profiler (loop_counters[loop->num],
! 						    loop->num,
! 						    histogram_steps),
! 				 e);
! 	}
!     }
!   free (loop_counters);
! 
!   n_histogram_counters = (loops->num - 1) * (histogram_steps + 1);
!   profile_info.count_histogram_counters_now = n_histogram_counters;
!   profile_info.count_histogram_counters += n_histogram_counters;
  }
  
+ struct section_reference
+ {
+   long offset;
+   int owns_summary;
+   long *summary;
+ };
  
! struct da_index_entry
! {
!   /* We hash by  */
!   char *function_name;
!   unsigned section;
!   /* and store  */
!   unsigned checksum;
!   unsigned n_offsets;
!   struct section_reference *offsets;
! };
  
! static hashval_t
! htab_counts_index_hash (of)
!      const void *of;
  {
!   const struct da_index_entry *entry = of;
  
!   return htab_hash_string (entry->function_name) ^ entry->section;
! }
  
! static int
! htab_counts_index_eq (of1, of2)
!      const void *of1;
!      const void *of2;
! {
!   const struct da_index_entry *entry1 = of1;
!   const struct da_index_entry *entry2 = of2;
  
!   return !strcmp (entry1->function_name, entry2->function_name)
! 	  && entry1->section == entry2->section;
! }
  
! static void
! htab_counts_index_del (what)
!      void *what;
! {
!   struct da_index_entry *entry = what;
!   unsigned i;
! 
!   for (i = 0; i < entry->n_offsets; i++)
      {
!       struct section_reference *act = entry->offsets + i;
!       if (act->owns_summary)
! 	free (act->summary);
!     }
!   free (entry->function_name);
!   free (entry->offsets);
!   free (entry);
! }
! 
! static char *counts_file_name;
! static htab_t counts_file_index = NULL;
! 
! static void
! cleanup_counts_index (close_file)
!      int close_file;
! {
!   if (da_file && close_file)
!     {
!       fclose (da_file);
!       da_file = NULL;
      }
+   if (counts_file_name)
+     free (counts_file_name);
+   counts_file_name = NULL;
+   if (counts_file_index)
+     htab_delete (counts_file_index);
+   counts_file_index = NULL;
+ }
  
! static int
! index_counts_file ()
! {
!   char *function_name_buffer = NULL;
!   unsigned magic, version, ix, checksum;
!   long *summary;
! 
!   if (!da_file)
!     return 0;
!   counts_file_index = htab_create (10, htab_counts_index_hash, htab_counts_index_eq, htab_counts_index_del);
! 
!   /* No .da file, no data.  */
!   if (!da_file)
!     return 0;
! 
!   /* Now index all profile sections.  */
  
    rewind (da_file);
  
!   summary = NULL;
  
    if (gcov_read_unsigned (da_file, &magic) || magic != GCOV_DATA_MAGIC)
      {
        warning ("`%s' is not a gcov data file", da_file_name);
!       goto cleanup;
      }
    if (gcov_read_unsigned (da_file, &version) || version != GCOV_VERSION)
      {
*************** get_exec_counts ()
*** 273,280 ****
    while (1)
      {
        unsigned tag, length;
!       long base;
        
        if (gcov_read_unsigned (da_file, &tag)
  	  || gcov_read_unsigned (da_file, &length))
  	{
--- 406,414 ----
    while (1)
      {
        unsigned tag, length;
!       long offset;
        
+       offset = gcov_save_position (da_file);
        if (gcov_read_unsigned (da_file, &tag)
  	  || gcov_read_unsigned (da_file, &length))
  	{
*************** get_exec_counts ()
*** 284,374 ****
  	  warning ("`%s' is corrupted", da_file_name);
  	  goto cleanup;
  	}
-       base = gcov_save_position (da_file);
        if (tag == GCOV_TAG_FUNCTION)
  	{
- 	  unsigned checksum;
- 
- 	  if (seen_fn == 3)
- 	    {
- 	      profile_info.count_profiles_merged += prog_runs;
- 	      profile_info.max_counter_in_program += prog_sum_max;
- 	      if (!prog_runs)
- 		{
- 		  profile_info.count_profiles_merged++;
- 		  profile_info.max_counter_in_program += max_count;
- 		}
- 	      seen_fn = 0;
- 	    }
- 	  else if (seen_fn == 1)
- 	    seen_fn = 2;
- 	  
  	  if (gcov_read_string (da_file, &function_name_buffer, NULL)
  	      || gcov_read_unsigned (da_file, &checksum))
  	    goto corrupt;
! 	  
! 	  if (strcmp (name, function_name_buffer))
! 	    ;
! 	  else if (checksum != profile_info.current_function_cfg_checksum)
! 	    {
! 	    mismatch:;
! 	      warning ("profile mismatch for `%s'", current_function_name);
! 	      goto cleanup;
! 	    }
! 	  else
! 	    seen_fn = 1;
  	}
!       else if (tag == GCOV_TAG_PROGRAM_SUMMARY)
  	{
! 	  struct gcov_summary summary;
  
! 	  if (gcov_read_summary (da_file, &summary))
  	    goto corrupt;
  
! 	  if (seen_fn == 1)
! 	    seen_fn = 2;
! 	  if (seen_fn == 2)
! 	    seen_fn = 3;
! 	  if (seen_fn == 3)
! 	    {
! 	      prog_runs += summary.runs;
! 	      prog_sum_max += summary.arc_sum_max;
! 	    }
  	}
!       else if (tag == GCOV_TAG_ARC_COUNTS)
  	{
! 	  unsigned num = length / 8;
! 
! 	  if (seen_fn == 1 && num != num_edges)
! 	    goto mismatch;
! 	  
! 	  for (ix = 0; ix != num; ix++)
  	    {
! 	      gcov_type count;
! 
! 	      if (gcov_read_counter (da_file, &count))
! 		goto corrupt;
! 	      if (count > max_count)
! 		max_count = count;
! 	      if (seen_fn == 1)
! 		profile[ix] = count;
  	    }
  	}
!       gcov_resync (da_file, base, length);
      }
  
!   if (seen_fn == 3)
      {
!       profile_info.count_profiles_merged += prog_runs;
!       profile_info.max_counter_in_program += prog_sum_max;
!       if (!prog_runs)
  	{
  	  profile_info.count_profiles_merged++;
  	  profile_info.max_counter_in_program += max_count;
  	}
      }
-   
-   free (function_name_buffer);
  
    if (rtl_dump_file)
      {
--- 418,610 ----
  	  warning ("`%s' is corrupted", da_file_name);
  	  goto cleanup;
  	}
        if (tag == GCOV_TAG_FUNCTION)
  	{
  	  if (gcov_read_string (da_file, &function_name_buffer, NULL)
  	      || gcov_read_unsigned (da_file, &checksum))
  	    goto corrupt;
! 	  continue;
  	}
!       if (tag == GCOV_TAG_PROGRAM_SUMMARY)
  	{
! 	  if (length != GCOV_SUMMARY_LENGTH)
! 	    goto corrupt;
  
! 	  if (length != GCOV_SUMMARY_LENGTH)
  	    goto corrupt;
  
! 	  if (summary)
! 	    *summary = offset;
! 	  summary = NULL;
  	}
!       else
  	{
! 	  if (function_name_buffer)
  	    {
! 	      struct da_index_entry **slot, elt;
! 	      elt.function_name = function_name_buffer;
! 	      elt.section = tag;
! 
! 	      slot = (struct da_index_entry **)
! 		htab_find_slot (counts_file_index, &elt, INSERT);
! 	      if (*slot)
! 		{
! 		  if ((*slot)->checksum != checksum)
! 		    {
! 		      warning ("profile mismatch for `%s'", function_name_buffer);
! 		      goto cleanup;
! 		    }
! 		  (*slot)->n_offsets++;
! 		  (*slot)->offsets = xrealloc ((*slot)->offsets,
! 					       sizeof (long) * (*slot)->n_offsets);
! 		}
! 	      else
! 		{
! 		  *slot = xmalloc (sizeof (struct da_index_entry));
! 		  (*slot)->function_name = xstrdup (function_name_buffer);
! 		  (*slot)->section = tag;
! 		  (*slot)->checksum = checksum;
! 		  (*slot)->n_offsets = 1;
! 		  (*slot)->offsets = xmalloc (sizeof (long));
! 		}
! 	      (*slot)->offsets[(*slot)->n_offsets - 1].offset = offset;
! 	      if (summary)
! 		(*slot)->offsets[(*slot)->n_offsets - 1].owns_summary = 0;
! 	      else
! 		{
! 		  summary = xmalloc (sizeof (long));
! 		  *summary = -1;
! 		  (*slot)->offsets[(*slot)->n_offsets - 1].owns_summary = 1;
! 		}
! 	      (*slot)->offsets[(*slot)->n_offsets - 1].summary = summary;
  	    }
  	}
!       if (gcov_skip (da_file, length))
! 	goto corrupt;
!     }
! 
!   free (function_name_buffer);
! 
!   return 1;
! 
! cleanup:
!   cleanup_counts_index (1);
!   if (function_name_buffer)
!     free (function_name_buffer);
!   return 0;
! }
! 
! /* Computes hybrid profile for all matching entries in da_file.
!    Sets max_counter_in_program as a side effect.  */
! 
! static gcov_type *
! get_exec_counts ()
! {
!   unsigned num_edges = 0;
!   basic_block bb;
!   gcov_type *profile;
!   gcov_type max_count;
!   unsigned ix, i, tag, length, num;
!   const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (current_function_decl));
!   struct da_index_entry *entry, what;
!   struct section_reference *act;
!   gcov_type count;
!   struct gcov_summary summ;
! 
!   profile_info.max_counter_in_program = 0;
!   profile_info.count_profiles_merged = 0;
! 
!   /* No .da file, no execution counts.  */
!   if (!da_file)
!     return NULL;
!   if (!counts_file_index)
!     abort ();
! 
!   /* Count the edges to be (possibly) instrumented.  */
! 
!   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
!     {
!       edge e;
!       for (e = bb->succ; e; e = e->succ_next)
! 	if (!EDGE_INFO (e)->ignore && !EDGE_INFO (e)->on_tree)
! 	  num_edges++;
!     }
! 
!   /* now read and combine all matching profiles.  */
! 
!   profile = xmalloc (sizeof (gcov_type) * num_edges);
! 
!   for (ix = 0; ix < num_edges; ix++)
!     profile[ix] = 0;
! 
!   what.function_name = (char *) name;
!   what.section = GCOV_TAG_ARC_COUNTS;
!   entry = htab_find (counts_file_index, &what);
!   if (!entry)
!     {
!       warning ("No profile for function '%s' found.", name);
!       goto cleanup;
!     }
!   
!   if (entry->checksum != profile_info.current_function_cfg_checksum)
!     {
!       warning ("profile mismatch for `%s'", current_function_name);
!       goto cleanup;
      }
  
!   for (i = 0; i < entry->n_offsets; i++)
      {
!       act = entry->offsets + i;
! 
!       /* Read arc counters.  */
!       max_count = 0;
!       gcov_resync (da_file, act->offset, 0);
! 
!       if (gcov_read_unsigned (da_file, &tag)
! 	  || gcov_read_unsigned (da_file, &length)
! 	  || tag != GCOV_TAG_ARC_COUNTS)
! 	{
! 	  /* We have already passed through file, so any error means
! 	     something is rotten.  */
! 	  abort ();
! 	}
!       num = length / 8;
! 
!       if (num != num_edges)
! 	{
! 	  warning ("profile mismatch for `%s'", current_function_name);
! 	  goto cleanup;
! 	}
! 	  
!       for (ix = 0; ix != num; ix++)
! 	{
! 	  if (gcov_read_counter (da_file, &count))
! 	    abort ();
! 	  if (count > max_count)
! 	    max_count = count;
! 	  profile[ix] += count;
! 	}
! 
!       /* Read program summary.  */
!       if (*act->summary != -1)
! 	{
! 	  gcov_resync (da_file, *act->summary, 0);
! 	  if (gcov_read_unsigned (da_file, &tag)
! 	      || gcov_read_unsigned (da_file, &length)
! 	      || tag != GCOV_TAG_PROGRAM_SUMMARY
! 	      || gcov_read_summary (da_file, &summ))
! 	    abort ();
! 	  profile_info.count_profiles_merged += summ.runs;
! 	  profile_info.max_counter_in_program += summ.arc_sum_max;
! 	}
!       else
! 	summ.runs = 0;
!       if (!summ.runs)
  	{
  	  profile_info.count_profiles_merged++;
  	  profile_info.max_counter_in_program += max_count;
  	}
      }
  
    if (rtl_dump_file)
      {
*************** get_exec_counts ()
*** 378,385 ****
--- 614,750 ----
      }
  
    return profile;
+ 
+ cleanup:;
+   free (profile);
+   cleanup_counts_index (1);
+   return 0;
+ }
+ 
+ /* Get loop histogram counters.  */
+ static gcov_type *
+ get_histogram_counts (loops, steps)
+      int loops;
+      int steps;
+ {
+   gcov_type *profile;
+   gcov_type max_count;
+   unsigned ix, i, tag, length, num;
+   const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (current_function_decl));
+   struct da_index_entry *entry, what;
+   struct section_reference *act;
+   gcov_type count;
+   unsigned n_counters;
+ 
+   profile_info.max_counter_in_program = 0;
+   profile_info.count_profiles_merged = 0;
+ 
+   /* No .da file, no execution counts.  */
+   if (!da_file)
+     return NULL;
+   if (!counts_file_index)
+     abort ();
+ 
+   /* now read and combine all matching profiles.  */
+ 
+   n_counters = loops * (steps + 1);
+   profile = xmalloc (sizeof (gcov_type) * n_counters);
+ 
+   for (ix = 0; ix < n_counters; ix++)
+     profile[ix] = 0;
+ 
+   what.function_name = (char *) name;
+   what.section = GCOV_TAG_LOOP_HISTOGRAMS;
+   entry = htab_find (counts_file_index, &what);
+   if (!entry)
+     {
+       warning ("No profile for function '%s' found.", name);
+       goto cleanup;
+     }
+   
+   if (entry->checksum != profile_info.current_function_cfg_checksum)
+     {
+       warning ("profile mismatch for `%s'", current_function_name);
+       goto cleanup;
+     }
+ 
+   for (i = 0; i < entry->n_offsets; i++)
+     {
+       act = entry->offsets + i;
+ 
+       /* Read arc counters.  */
+       max_count = 0;
+       gcov_resync (da_file, act->offset, 0);
+ 
+       if (gcov_read_unsigned (da_file, &tag)
+ 	  || gcov_read_unsigned (da_file, &length)
+ 	  || tag != GCOV_TAG_LOOP_HISTOGRAMS)
+ 	{
+ 	  /* We have already passed through file, so any error means
+ 	     something is rotten.  */
+ 	  abort ();
+ 	}
+       num = length / 8;
+ 
+       if (num != n_counters)
+ 	{
+ 	  warning ("profile mismatch for `%s'", current_function_name);
+ 	  goto cleanup;
+ 	}
+ 	  
+       for (ix = 0; ix != num; ix++)
+ 	{
+ 	  if (gcov_read_counter (da_file, &count))
+ 	    abort ();
+ 	  if (count > max_count)
+ 	    max_count = count;
+ 	  profile[ix] += count;
+ 	}
+     }
+ 
+   return profile;
+ 
+ cleanup:;
+   free (profile);
+   cleanup_counts_index (1);
+   return 0;
  }
  
+ /* Load loop histograms from .da file.  */
+ static void
+ compute_loop_histograms (loops)
+      struct loops *loops;
+ {
+   int histogram_steps;
+   int i;
+   gcov_type *histogram_counts, *act_count;
+   
+   histogram_steps = PARAM_VALUE (PARAM_MAX_PEEL_TIMES);
+   if (histogram_steps < PARAM_VALUE (PARAM_MAX_UNROLL_TIMES))
+     histogram_steps = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
+ 
+   histogram_counts = get_histogram_counts (loops->num - 1, histogram_steps);
+   if (!histogram_counts)
+     return;
+ 
+   act_count = histogram_counts;
+   for (i = 1; i < loops->num; i++)
+     {
+       edge latch = loop_latch_edge (loops->parray[i]);
+ 
+       latch->loop_histogram = xmalloc (sizeof (struct loop_histogram));
+       latch->loop_histogram->steps = histogram_steps;
+       latch->loop_histogram->counts = xmalloc (histogram_steps * sizeof (gcov_type));
+       memcpy (latch->loop_histogram->counts, act_count,
+ 	      histogram_steps * sizeof (gcov_type));
+       latch->loop_histogram->more = act_count[histogram_steps];
+       act_count += histogram_steps + 1;
+     }
+ 
+   free (histogram_counts);
+   loops->state |= LOOPS_HAVE_HISTOGRAMS_ON_EDGES;
+   profile_info.have_loop_histograms = 1;
+ }
  
  /* Compute the branch probabilities for the various branches.
     Annotate them accordingly.  */
*************** branch_prob ()
*** 757,766 ****
--- 1122,1133 ----
    int i;
    int num_edges, ignored_edges;
    struct edge_list *el;
+   struct loops loops;
    const char *name = IDENTIFIER_POINTER
  		      (DECL_ASSEMBLER_NAME (current_function_decl));
  
    profile_info.current_function_cfg_checksum = compute_checksum ();
+   profile_info.have_loop_histograms = 0;
  
    if (rtl_dump_file)
      fprintf (rtl_dump_file, "CFG checksum is %u\n",
*************** branch_prob ()
*** 848,853 ****
--- 1215,1233 ----
  	}
      }
  
+   if (flag_loop_histograms)
+     {
+       /* Find loops and bring them into canonical shape.  */
+       flow_loops_find (&loops, LOOP_TREE);
+       create_preheaders (&loops, 0);
+       /* Release dominators -- we aren't going to need them nor update them.  */
+       if (loops.cfg.dom)
+ 	{
+ 	  free_dominance_info (loops.cfg.dom);
+ 	  loops.cfg.dom = NULL;
+ 	}
+     }
+ 
    el = create_edge_list ();
    num_edges = NUM_EDGES (el);
    alloc_aux_for_edges (sizeof (struct edge_info));
*************** branch_prob ()
*** 964,969 ****
--- 1344,1350 ----
  		    goto bbg_error;
  	        }
  	    }
+ 
  	  if (gcov_write_length (bbg_file, offset))
  	    goto bbg_error;
  	}
*************** branch_prob ()
*** 1030,1035 ****
--- 1411,1417 ----
  		  }
  		insn = NEXT_INSN (insn);
  	      }
+ 
  	    if (offset)
  	      {
  		if (gcov_write_unsigned (bbg_file, 0)
*************** branch_prob ()
*** 1047,1053 ****
      }
  
    if (flag_branch_probabilities)
!     compute_branch_probabilities ();
  
    /* For each edge not on the spanning tree, add counting code as rtl.  */
  
--- 1429,1439 ----
      }
  
    if (flag_branch_probabilities)
!     {
!       compute_branch_probabilities ();
!       if (flag_loop_histograms)
! 	compute_loop_histograms (&loops);
!     }
  
    /* For each edge not on the spanning tree, add counting code as rtl.  */
  
*************** branch_prob ()
*** 1056,1061 ****
--- 1442,1452 ----
        struct function_list *item;
        
        instrument_edges (el);
+       if (flag_loop_histograms)
+ 	instrument_loops (&loops);
+ 
+       /* Commit changes done by instrument_edges and instrument_loops.  */
+       commit_edge_insertions_watch_calls ();
        allocate_reg_info (max_reg_num (), FALSE, FALSE);
  
        /* ??? Probably should re-use the existing struct function.  */
*************** branch_prob ()
*** 1068,1083 ****
        item->name = xstrdup (name);
        item->cfg_checksum = profile_info.current_function_cfg_checksum;
        item->count_edges = profile_info.count_edges_instrumented_now;
      }
  
    remove_fake_edges ();
    /* Re-merge split basic blocks and the mess introduced by
       insert_insn_on_edge.  */
    cleanup_cfg (profile_arc_flag ? CLEANUP_EXPENSIVE : 0);
    if (rtl_dump_file)
      dump_flow_info (rtl_dump_file);
  
-   free_aux_for_edges ();
    free_edge_list (el);
  }
  
--- 1459,1481 ----
        item->name = xstrdup (name);
        item->cfg_checksum = profile_info.current_function_cfg_checksum;
        item->count_edges = profile_info.count_edges_instrumented_now;
+       item->histogram_counters = profile_info.count_histogram_counters_now;
+     }
+ 
+   if (flag_loop_histograms)
+     {
+       /* Free the loop datastructure.  */
+       flow_loops_free (&loops);
      }
  
    remove_fake_edges ();
+   free_aux_for_edges ();
    /* Re-merge split basic blocks and the mess introduced by
       insert_insn_on_edge.  */
    cleanup_cfg (profile_arc_flag ? CLEANUP_EXPENSIVE : 0);
    if (rtl_dump_file)
      dump_flow_info (rtl_dump_file);
  
    free_edge_list (el);
  }
  
*************** init_branch_prob (filename)
*** 1232,1237 ****
--- 1630,1639 ----
        if (!da_file)
  	warning ("file %s not found, execution counts assumed to be zero",
  		 da_file_name);
+       if (counts_file_index && strcmp (da_file_name, counts_file_name))
+        	cleanup_counts_index (0);
+       if (index_counts_file ())
+ 	counts_file_name = xstrdup (da_file_name);
      }
  
    if (profile_arc_flag)
*************** init_branch_prob (filename)
*** 1241,1246 ****
--- 1643,1654 ----
        
        ASM_GENERATE_INTERNAL_LABEL (buf, "LPBX", 2);
        profiler_label = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+ 
+       if (flag_loop_histograms)
+ 	{
+ 	  ASM_GENERATE_INTERNAL_LABEL (buf, "LPBX", 3);
+ 	  loop_histograms_label = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
+ 	}
      }
    
    total_num_blocks = 0;
*************** create_profiler ()
*** 1343,1349 ****
    rtx structure_address;
    int save_flag_inline_functions = flag_inline_functions;
  
!   if (!profile_info.count_instrumented_edges)
      return;
    
    string_type = build_pointer_type
--- 1751,1758 ----
    rtx structure_address;
    int save_flag_inline_functions = flag_inline_functions;
  
!   if (!profile_info.count_instrumented_edges
!       && !profile_info.count_histogram_counters)
      return;
    
    string_type = build_pointer_type
*************** create_profiler ()
*** 1408,1414 ****
      int num_nodes = 0;
      tree array_value = NULL_TREE;
      tree finfo_type, finfo_ptr_type;
!     tree name, checksum, arcs;
      
      finfo_type = (*lang_hooks.types.make_type) (RECORD_TYPE);
      name = build_decl (FIELD_DECL, NULL_TREE, string_type);
--- 1817,1823 ----
      int num_nodes = 0;
      tree array_value = NULL_TREE;
      tree finfo_type, finfo_ptr_type;
!     tree name, checksum, arcs, histogram_counters;
      
      finfo_type = (*lang_hooks.types.make_type) (RECORD_TYPE);
      name = build_decl (FIELD_DECL, NULL_TREE, string_type);
*************** create_profiler ()
*** 1416,1423 ****
      TREE_CHAIN (checksum) = name;
      arcs = build_decl (FIELD_DECL, NULL_TREE, unsigned_type_node);
      TREE_CHAIN (arcs) = checksum;
      finish_builtin_struct (finfo_type, "__function_info",
! 			   arcs, NULL_TREE);
      finfo_ptr_type = build_pointer_type
        (build_qualified_type (finfo_type, TYPE_QUAL_CONST));
      
--- 1825,1834 ----
      TREE_CHAIN (checksum) = name;
      arcs = build_decl (FIELD_DECL, NULL_TREE, unsigned_type_node);
      TREE_CHAIN (arcs) = checksum;
+     histogram_counters = build_decl (FIELD_DECL, NULL_TREE, unsigned_type_node);
+     TREE_CHAIN (histogram_counters) = arcs;
      finish_builtin_struct (finfo_type, "__function_info",
! 			   histogram_counters, NULL_TREE);
      finfo_ptr_type = build_pointer_type
        (build_qualified_type (finfo_type, TYPE_QUAL_CONST));
      
*************** create_profiler ()
*** 1440,1445 ****
--- 1851,1860 ----
  				 (unsigned_type_node,
  				  build_int_2 (item->count_edges, 0)),
  				 finfo_value);
+ 	finfo_value = tree_cons (histogram_counters, convert
+ 				 (unsigned_type_node,
+ 				  build_int_2 (item->histogram_counters, 0)),
+ 				 finfo_value);
  	array_value = tree_cons (NULL_TREE, build
  				 (CONSTRUCTOR, finfo_type, NULL_TREE,
  				  nreverse (finfo_value)), array_value);
*************** create_profiler ()
*** 1508,1513 ****
--- 1923,1963 ----
  				   .count_instrumented_edges, 0)),
  		     value);
    
+   /* histogram counters table */
+   {
+     tree counts_table = null_pointer_node;
+     
+     if (profile_info.count_histogram_counters)
+       {
+ 	tree gcov_type_array_type
+ 	  = build_array_type (gcov_type, build_index_type
+ 			      (build_int_2 (profile_info.
+ 					    count_histogram_counters - 1, 0)));
+ 	/* No values.  */
+ 	counts_table
+ 	  = build (VAR_DECL, gcov_type_array_type, NULL_TREE, NULL_TREE);
+ 	TREE_STATIC (counts_table) = 1;
+ 	DECL_NAME (counts_table) = get_identifier (XSTR (loop_histograms_label, 0));
+ 	assemble_variable (counts_table, 0, 0, 0);
+ 	counts_table = build1 (ADDR_EXPR, gcov_ptr_type, counts_table);
+       }
+     
+     field = build_decl (FIELD_DECL, NULL_TREE, gcov_ptr_type);
+     TREE_CHAIN (field) = fields;
+     fields = field;
+     value = tree_cons (fields, counts_table, value);
+   }
+   
+   /* number of histogram counters */
+   field = build_decl (FIELD_DECL, NULL_TREE, unsigned_type_node);
+   TREE_CHAIN (field) = fields;
+   fields = field;
+   value = tree_cons (fields, convert
+ 		     (unsigned_type_node,
+ 		      build_int_2 (profile_info
+ 				   .count_histogram_counters, 0)),
+ 		     value);
+   
    finish_builtin_struct (ginfo_type, "__gcov_info", fields, NULL_TREE);
    structure = build (VAR_DECL, ginfo_type, NULL_TREE, NULL_TREE);
    DECL_INITIAL (structure)
*************** gen_edge_profiler (edgeno)
*** 1591,1596 ****
--- 2041,2108 ----
    mem_ref = validize_mem (gen_rtx_MEM (mode, tmp));
  
    set_mem_alias_set (mem_ref, new_alias_set ());
+ 
+   tmp = expand_simple_binop (mode, PLUS, mem_ref, const1_rtx,
+ 			     mem_ref, 0, OPTAB_WIDEN);
+ 
+   if (tmp != mem_ref)
+     emit_move_insn (copy_rtx (mem_ref), tmp);
+ 
+   sequence = get_insns ();
+   end_sequence ();
+   return sequence;
+ }
+ 
+ /* Output instructions as RTL to increment the loop histogram counter.  */
+ 
+ static rtx
+ gen_loop_profiler (iterations, loop, steps)
+      rtx iterations;
+      int loop;
+      int steps;
+ {
+   enum machine_mode mode = mode_for_size (GCOV_TYPE_SIZE, MODE_INT, 0);
+   rtx mem_ref, tmp, tmp1, mr, lab, jump;
+   rtx sequence;
+   int base = (loop - 1) * (steps + 1);
+   rtx in_range_label = gen_label_rtx ();
+   rtx end_of_code_label = gen_label_rtx ();
+   int per_counter = GCOV_TYPE_SIZE / BITS_PER_UNIT;
+ 
+   start_sequence ();
+ 
+   mr = gen_reg_rtx (Pmode);
+ 
+   tmp = force_reg (Pmode, loop_histograms_label);
+   tmp = plus_constant (tmp,
+ 		       per_counter * (base + profile_info.count_histogram_counters));
+ 
+   do_compare_rtx_and_jump (iterations, GEN_INT (steps), GE, 0, mode, NULL_RTX,
+ 			   in_range_label, NULL_RTX);
+   jump = get_last_insn ();
+   JUMP_LABEL (jump) = in_range_label;
+ 
+   tmp1 = expand_simple_binop (Pmode, PLUS, tmp, GEN_INT (per_counter * steps), mr, 0, OPTAB_WIDEN);
+   if (tmp1 != mr)
+     emit_move_insn (mr, tmp1);
+ 
+   jump = emit_jump_insn (gen_jump (end_of_code_label));
+   JUMP_LABEL (jump) = end_of_code_label;
+   emit_barrier ();
+ 
+   lab = emit_label (in_range_label);
+   LABEL_NUSES (lab) = 1;
+ 
+   tmp1 = expand_simple_binop (mode, MULT, iterations, GEN_INT (per_counter),
+ 			      NULL_RTX, 0, OPTAB_WIDEN);
+   tmp1 = expand_simple_binop (Pmode, PLUS, tmp, tmp1, mr, 0, OPTAB_WIDEN);
+   if (tmp1 != mr)
+     emit_move_insn (mr, tmp1);
+ 
+   lab = emit_label (end_of_code_label);
+   LABEL_NUSES (lab) = 1;
+ 
+   mem_ref = validize_mem (gen_rtx_MEM (mode, mr));
  
    tmp = expand_simple_binop (mode, PLUS, mem_ref, const1_rtx,
  			     mem_ref, 0, OPTAB_WIDEN);
Index: profile.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/profile.h,v
retrieving revision 1.3.14.1
diff -c -3 -p -r1.3.14.1 profile.h
*** profile.h	16 Oct 2002 16:07:24 -0000	1.3.14.1
--- profile.h	8 Nov 2002 12:14:14 -0000
*************** struct profile_info
*** 28,38 ****
--- 28,48 ----
  
      int count_instrumented_edges;
  
+     /* Used by final, for allocating the proper amount of storage for the
+        loop histogram counters.  */
+ 
+     int count_histogram_counters;
+ 
      /* Used by final, for writing correct # of instrumented edges
         in this function.  */
  
      int count_edges_instrumented_now;
  
+     /* Used by final, for writing correct # of histogram counters
+        in this function.  */
+ 
+     int count_histogram_counters_now;
+ 
      /* Checksum of the cfg. Used for 'identification' of code.
         Used by final.  */
  
*************** struct profile_info
*** 47,52 ****
--- 57,64 ----
         function.  */
      int count_profiles_merged;
  
+     /* Do we also have loop histograms?  */
+     int have_loop_histograms;
    };
  
  extern struct profile_info profile_info;
Index: toplev.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.668.2.8
diff -c -3 -p -r1.668.2.8 toplev.c
*** toplev.c	1 Nov 2002 19:20:56 -0000	1.668.2.8
--- toplev.c	8 Nov 2002 12:14:14 -0000
*************** int flag_test_coverage = 0;
*** 389,394 ****
--- 389,397 ----
  
  int flag_branch_probabilities = 0;
  
+ /* Nonzero if generating or using histograms for loop iterations.  */
+ int flag_loop_histograms = 0;
+ 
  /* Nonzero if basic blocks should be reordered.  */
  
  int flag_reorder_blocks = 0;
*************** static const lang_independent_options f_
*** 1120,1125 ****
--- 1123,1130 ----
     N_("Create data files needed by gcov") },
    {"branch-probabilities", &flag_branch_probabilities, 1,
     N_("Use profiling information for branch probabilities") },
+   {"loop-histograms", &flag_loop_histograms, 1,
+    N_("Insert code to measure loop histograms and/or use them") },
    {"profile", &profile_flag, 1,
     N_("Enable basic program profiling code") },
    {"reorder-blocks", &flag_reorder_blocks, 1,
Index: tracer.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tracer.c,v
retrieving revision 1.3.2.2
diff -c -3 -p -r1.3.2.2 tracer.c
*** tracer.c	19 Oct 2002 15:20:25 -0000	1.3.2.2
--- tracer.c	8 Nov 2002 12:14:14 -0000
*************** find_trace (bb, trace)
*** 158,164 ****
       basic_block *trace;
  {
    int i = 0;
!   edge e;
  
    if (rtl_dump_file)
      fprintf (rtl_dump_file, "Trace seed %i [%i]", bb->index, bb->frequency);
--- 158,164 ----
       basic_block *trace;
  {
    int i = 0;
!   edge e, te;
  
    if (rtl_dump_file)
      fprintf (rtl_dump_file, "Trace seed %i [%i]", bb->index, bb->frequency);
*************** find_trace (bb, trace)
*** 169,174 ****
--- 169,178 ----
        if (seen (bb2) || (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
  	  || find_best_successor (bb2) != e)
  	break;
+       /* Stop trace from starting outside the loop.  */
+       for (te = bb2->pred; te; te = te->pred_next)
+ 	if (te->flags & EDGE_DFS_BACK)
+ 	  break;
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ",%i [%i]", bb->index, bb->frequency);
        bb = bb2;
*************** find_trace (bb, trace)
*** 184,189 ****
--- 188,200 ----
        if (seen (bb) || (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
  	  || find_best_predecessor (bb) != e)
  	break;
+       /* Prevent loop peeling -- stop trace at loop header.  */
+       for (te = bb->pred; te; te = te->pred_next)
+ 	if (te->flags & EDGE_DFS_BACK)
+ 	  break;
+       if (te)
+ 	break;
+ 
        if (rtl_dump_file)
  	fprintf (rtl_dump_file, ",%i [%i]", bb->index, bb->frequency);
        trace[i++] = bb;
Index: invoke.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.174.2.2
diff -c -3 -p -r1.174.2.2 invoke.texi
*** invoke.texi	23 Oct 2002 15:48:55 -0000	1.174.2.2
--- invoke.texi	7 Nov 2002 21:35:31 -0000
*************** in the following sections.
*** 265,271 ****
  -falign-functions=@var{n}  -falign-jumps=@var{n} @gol
  -falign-labels=@var{n}  -falign-loops=@var{n}  @gol
  -fbounds-check @gol
! -fbranch-probabilities  -fcaller-saves -fcprop-registers @gol
  -fcse-follow-jumps  -fcse-skip-blocks  -fdata-sections @gol
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
--- 265,271 ----
  -falign-functions=@var{n}  -falign-jumps=@var{n} @gol
  -falign-labels=@var{n}  -falign-loops=@var{n}  @gol
  -fbounds-check @gol
! -fbranch-probabilities -floop-histograms -fcaller-saves -fcprop-registers @gol
  -fcse-follow-jumps  -fcse-skip-blocks  -fdata-sections @gol
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
*************** generate the arc profile information by 
*** 2989,2995 ****
  selected workload, and then compile the program again with the same
  optimization and code generation options plus
  @option{-fbranch-probabilities} (@pxref{Optimize Options,,Options that
! Control Optimization}).
  
  The other use of @option{-fprofile-arcs} is for use with @code{gcov},
  when it is used with the @option{-ftest-coverage} option.
--- 2989,2997 ----
  selected workload, and then compile the program again with the same
  optimization and code generation options plus
  @option{-fbranch-probabilities} (@pxref{Optimize Options,,Options that
! Control Optimization}). You may also add @option{-floop-histograms}
! to both of the compilations to measure histograms of iterations of
! loops. This information is useful for loop optimizer.
  
  The other use of @option{-fprofile-arcs} is for use with @code{gcov},
  when it is used with the @option{-ftest-coverage} option.
*************** These can be used to improve optimizatio
*** 3970,3975 ****
--- 3972,3984 ----
  used in one place: in @file{reorg.c}, instead of guessing which path a
  branch is mostly to take, the @samp{REG_BR_PROB} values are used to
  exactly determine which path is taken more often.
+ 
+ @item -floop-histograms
+ @opindex floop-histograms
+ Add this option to @option{-fprofile-arcs} to also add code to measure
+ histograms of loop iterations.  By adding it to @option{-fbranch-probabilities}
+ you instruct compiler to read back these histograms and use them in loop
+ optimizer.
  
  @item -fno-guess-branch-probability
  @opindex fno-guess-branch-probability


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]