Update passes to use optimize_*_for_size_p tests
- From: Jan Hubicka <jh at suse dot cz>
- To: gcc-patches at gcc dot gnu dot org
- Date: Fri, 29 Aug 2008 01:30:53 +0200
- Subject: Update passes to use optimize_*_for_size_p tests
Hi,
this patch converts the existing optimize_size and maybe_hot_bb_p tests to the
new optimize_*_for_size_p predicates.  I've also added optimize_loop_for_size_p
(and optimize_loop_for_speed_p), since there seems to be some confusion in the
loop optimizer about how to identify loops that should be optimized
aggressively, and I removed the opts.c code that disabled passes on
optimize_size; those passes can now work that out from the profile.
Bootstrapped/regtested on i686-linux, will commit it tomorrow if there are no
complaints.
Honza
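For reference, the new loop predicates are thin wrappers that test the loop
header (see the predict.c hunk below).  A minimal sketch of the conversion
idiom used throughout the patch -- illustrative only, not an extra hunk:

    /* predict.c: the new predicate defers to the BB test on the header.  */
    bool
    optimize_loop_for_size_p (struct loop *loop)
    {
      return optimize_bb_for_size_p (loop->header);
    }

    /* In a loop pass, instead of testing the global optimize_size flag or
       !maybe_hot_bb_p (loop->header), ask about this particular loop.  */
    if (optimize_loop_for_size_p (loop))
      return false;  /* Skip the size-growing transformation here.  */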
* loop-unswitch.c (unswitch_single_loop): Use optimize_loop_for_size_p.
* tree-ssa-threadupdate.c (mark_threaded_blocks): Use optimize_function_for_size_p.
* tracer.c (ignore_bb_p): Use optimize_bb_for_size_p.
* postreload-gcse.c (eliminate_partially_redundant_load): Use optimize_bb_for_size_p.
* value-prof.c (gimple_divmod_fixed_value_transform,
gimple_mod_pow2_value_transform, gimple_mod_subtract_transform,
gimple_stringops_transform): Use optimize_bb_for_size_p.
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Skip cold loops.
* ipa-cp.c (ipcp_insert_stage): Use optimize_function_for_size_p.
* final.c (compute_alignments): Use optimize_function_for_size_p.
* builtins.c (fold_builtin_cabs): Use optimize_function_for_speed_p.
(fold_builtin_strcpy, fold_builtin_fputs): Use
optimize_function_for_size_p.
* fold-const.c (tree_swap_operands_p): Use optimize_function_for_size_p.
(fold_binary): Use optimize_function_for_speed_p.
* reorg.c (relax_delay_slots): Use optimize_function_for_size_p.
* tree-ssa-math-opts.c (replace_reciprocal): Use optimize_bb_for_speed_p.
(execute_cse_reciprocals): Use optimize_bb_for_size_p.
* ipa-inline.c (cgraph_decide_recursive_inlining): Use
optimize_function_for_size_p.
(cgraph_decide_inlining_of_small_functions): Use
optimize_function_for_size_p.
* global.c (find_reg): Use optimize_function_for_size_p.
* opts.c (decode_options): Do not clear flag_tree_ch, flag_inline_functions,
flag_unswitch_loops, flag_unroll_loops, flag_unroll_all_loops and
flag_prefetch_loop_arrays. Those can work it out from profile.
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely): Use
optimize_loop_for_speed_p.
* predict.c (optimize_bb_for_size_p, optimize_bb_for_speed_p): Constify
argument.
(optimize_loop_for_size_p, optimize_loop_for_speed_p): New.
* tree-parloops.c (parallelize_loops): Use optimize_loop_for_size_p.
* tree-eh.c (decide_copy_try_finally): Use optimize_function_for_size_p.
* local-alloc.c (local_alloc): Pass BB pointer to block_alloc.
(block_alloc): Take BB pointer, use optimize_bb_for_speed_p, pass the BB to
find_free_reg.
(find_free_reg): Add BB pointer argument, use optimize_bb_for_size_p.
* gcse.c (gcse_main): Use optimize_function_for_size_p.
* loop-unroll.c (decide_unrolling_and_peeling): Use optimize_loop_for_size_p.
(decide_peel_completely): Likewise.
* tree-vect-analyze.c (vect_mark_for_runtime_alias_test): Use
optimize_loop_for_size_p.
(vect_enhance_data_refs_alignment): Likewise.
* tree-ssa-coalesce.c (coalesce_cost): Add optimize_for_size argument.
(coalesce_cost_bb, coalesce_cost_edge, create_outofssa_var_map): Update call.
* cfgcleanup.c (outgoing_edges_match): Use optimize_bb_for_speed_p.
(try_crossjump_bb): Use optimize_bb_for_speed_p.
* tree-ssa-loop-prefetch.c (loop_prefetch_arrays): Use
optimize_loop_for_size_p.
* bb-reorder.c (find_traces_1_round, connect_traces): Use
optimize_edge_for_speed_p.
(copy_bb_p): Use optimize_bb_for_speed_p.
(gate_duplicate_computed_gotos): Do not check optimize_size.
(duplicate_computed_gotos): Use optimize_bb_for_size_p.
* basic-block.h (optimize_bb_for_size_p, optimize_bb_for_speed_p): Constify.
(optimize_loop_for_size_p, optimize_loop_for_speed_p): New.
* stmt.c (expand_case): Use optimize_insn_for_size_p.
Index: loop-unswitch.c
===================================================================
*** loop-unswitch.c (revision 139737)
--- loop-unswitch.c (working copy)
*************** unswitch_single_loop (struct loop *loop,
*** 290,296 ****
}
/* Do not unswitch in cold areas. */
! if (!maybe_hot_bb_p (loop->header))
{
if (dump_file)
fprintf (dump_file, ";; Not unswitching, not hot area\n");
--- 290,296 ----
}
/* Do not unswitch in cold areas. */
! if (optimize_loop_for_size_p (loop))
{
if (dump_file)
fprintf (dump_file, ";; Not unswitching, not hot area\n");
Index: tree-ssa-threadupdate.c
===================================================================
*** tree-ssa-threadupdate.c (revision 139737)
--- tree-ssa-threadupdate.c (working copy)
*************** mark_threaded_blocks (bitmap threaded_bl
*** 994,1000 ****
/* If optimizing for size, only thread through block if we don't have
to duplicate it or it's an otherwise empty redirection block. */
! if (optimize_size)
{
EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
{
--- 994,1000 ----
/* If optimizing for size, only thread through block if we don't have
to duplicate it or it's an otherwise empty redirection block. */
! if (optimize_function_for_size_p (cfun))
{
EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
{
Index: tracer.c
===================================================================
*** tracer.c (revision 139737)
--- tracer.c (working copy)
*************** ignore_bb_p (const_basic_block bb)
*** 92,98 ****
{
if (bb->index < NUM_FIXED_BLOCKS)
return true;
! if (!maybe_hot_bb_p (bb))
return true;
return false;
}
--- 92,98 ----
{
if (bb->index < NUM_FIXED_BLOCKS)
return true;
! if (optimize_bb_for_size_p (bb))
return true;
return false;
}
Index: postreload-gcse.c
===================================================================
*** postreload-gcse.c (revision 139737)
--- postreload-gcse.c (working copy)
*************** eliminate_partially_redundant_load (basi
*** 1066,1072 ****
if (/* No load can be replaced by copy. */
npred_ok == 0
/* Prevent exploding the code. */
! || (optimize_size && npred_ok > 1)
/* If we don't have profile information we cannot tell if splitting
a critical edge is profitable or not so don't do it. */
|| ((! profile_info || ! flag_branch_probabilities
--- 1066,1072 ----
if (/* No load can be replaced by copy. */
npred_ok == 0
/* Prevent exploding the code. */
! || (optimize_bb_for_size_p (bb) && npred_ok > 1)
/* If we don't have profile information we cannot tell if splitting
a critical edge is profitable or not so don't do it. */
|| ((! profile_info || ! flag_branch_probabilities
Index: value-prof.c
===================================================================
*** value-prof.c (revision 139737)
--- value-prof.c (working copy)
*************** gimple_divmod_fixed_value_transform (gim
*** 669,675 ****
at least 50% of time (and 75% gives the guarantee of usage). */
if (simple_cst_equal (gimple_assign_rhs2 (stmt), value) != 1
|| 2 * count < all
! || !maybe_hot_bb_p (gimple_bb (stmt)))
return false;
if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
--- 669,675 ----
at least 50% of time (and 75% gives the guarantee of usage). */
if (simple_cst_equal (gimple_assign_rhs2 (stmt), value) != 1
|| 2 * count < all
! || optimize_bb_for_size_p (gimple_bb (stmt)))
return false;
if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
*************** gimple_mod_pow2_value_transform (gimple_
*** 820,826 ****
/* We require that we hit a power of 2 at least half of all evaluations. */
if (simple_cst_equal (gimple_assign_rhs2 (stmt), value) != 1
|| count < wrong_values
! || !maybe_hot_bb_p (gimple_bb (stmt)))
return false;
if (dump_file)
--- 820,826 ----
/* We require that we hit a power of 2 at least half of all evaluations. */
if (simple_cst_equal (gimple_assign_rhs2 (stmt), value) != 1
|| count < wrong_values
! || optimize_bb_for_size_p (gimple_bb (stmt)))
return false;
if (dump_file)
*************** gimple_mod_subtract_transform (gimple_st
*** 1017,1023 ****
break;
}
if (i == steps
! || !maybe_hot_bb_p (gimple_bb (stmt)))
return false;
gimple_remove_histogram_value (cfun, stmt, histogram);
--- 1017,1023 ----
break;
}
if (i == steps
! || optimize_bb_for_size_p (gimple_bb (stmt)))
return false;
gimple_remove_histogram_value (cfun, stmt, histogram);
*************** gimple_stringops_transform (gimple_stmt_
*** 1397,1403 ****
/* We require that count is at least half of all; this means
that for the transformation to fire the value must be constant
at least 80% of time. */
! if ((6 * count / 5) < all || !maybe_hot_bb_p (gimple_bb (stmt)))
return false;
if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
return false;
--- 1397,1403 ----
/* We require that count is at least half of all; this means
that for the transformation to fire the value must be constant
at least 80% of time. */
! if ((6 * count / 5) < all || optimize_bb_for_size_p (gimple_bb (stmt)))
return false;
if (check_counter (stmt, "value", &count, &all, gimple_bb (stmt)->count))
return false;
Index: tree-ssa-loop-ch.c
===================================================================
*** tree-ssa-loop-ch.c (revision 139737)
--- tree-ssa-loop-ch.c (working copy)
*************** should_duplicate_loop_header_p (basic_bl
*** 58,63 ****
--- 58,70 ----
if (header->aux)
return false;
+ /* Loop header copying usually increases size of the code. This used not to
+ be true, since quite often it is possible to verify that the condition is
+ satisfied in the first iteration and therefore to eliminate it. Jump
+ threading handles these cases now. */
+ if (optimize_loop_for_size_p (loop))
+ return false;
+
gcc_assert (EDGE_COUNT (header->succs) > 0);
if (single_succ_p (header))
return false;
Index: ipa-cp.c
===================================================================
*** ipa-cp.c (revision 139737)
--- ipa-cp.c (working copy)
*************** ipcp_insert_stage (void)
*** 1019,1027 ****
if (new_insns + growth > max_new_insns)
break;
if (growth
! && (optimize_size
! || (DECL_STRUCT_FUNCTION (node->decl)
! ->function_frequency == FUNCTION_FREQUENCY_UNLIKELY_EXECUTED)))
{
if (dump_file)
fprintf (dump_file, "Not versioning, cold code would grow");
--- 1019,1025 ----
if (new_insns + growth > max_new_insns)
break;
if (growth
! && optimize_function_for_size_p (DECL_STRUCT_FUNCTION (node->decl)))
{
if (dump_file)
fprintf (dump_file, "Not versioning, cold code would grow");
Index: final.c
===================================================================
*** final.c (revision 139737)
--- final.c (working copy)
*************** compute_alignments (void)
*** 683,689 ****
label_align = XCNEWVEC (struct label_alignment, max_labelno - min_labelno + 1);
/* If not optimizing or optimizing for size, don't assign any alignments. */
! if (! optimize || optimize_size)
return 0;
if (dump_file)
--- 683,689 ----
label_align = XCNEWVEC (struct label_alignment, max_labelno - min_labelno + 1);
/* If not optimizing or optimizing for size, don't assign any alignments. */
! if (! optimize || optimize_function_for_size_p (cfun))
return 0;
if (dump_file)
*************** compute_alignments (void)
*** 765,771 ****
/* In case block is frequent and reached mostly by non-fallthru edge,
align it. It is most likely a first block of loop. */
if (has_fallthru
! && maybe_hot_bb_p (bb)
&& branch_frequency + fallthru_frequency > freq_threshold
&& (branch_frequency
> fallthru_frequency * PARAM_VALUE (PARAM_ALIGN_LOOP_ITERATIONS)))
--- 765,771 ----
/* In case block is frequent and reached mostly by non-fallthru edge,
align it. It is most likely a first block of loop. */
if (has_fallthru
! && optimize_bb_for_speed_p (bb)
&& branch_frequency + fallthru_frequency > freq_threshold
&& (branch_frequency
> fallthru_frequency * PARAM_VALUE (PARAM_ALIGN_LOOP_ITERATIONS)))
Index: builtins.c
===================================================================
*** builtins.c (revision 139737)
--- builtins.c (working copy)
*************** fold_builtin_cabs (tree arg, tree type,
*** 7530,7536 ****
/* Don't do this when optimizing for size. */
if (flag_unsafe_math_optimizations
! && optimize && !optimize_size)
{
tree sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
--- 7530,7536 ----
/* Don't do this when optimizing for size. */
if (flag_unsafe_math_optimizations
! && optimize && optimize_function_for_speed_p (cfun))
{
tree sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
*************** fold_builtin_strcpy (tree fndecl, tree d
*** 8882,8888 ****
if (operand_equal_p (src, dest, 0))
return fold_convert (TREE_TYPE (TREE_TYPE (fndecl)), dest);
! if (optimize_size)
return NULL_TREE;
fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
--- 8882,8888 ----
if (operand_equal_p (src, dest, 0))
return fold_convert (TREE_TYPE (TREE_TYPE (fndecl)), dest);
! if (optimize_function_for_size_p (cfun))
return NULL_TREE;
fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
*************** fold_builtin_fputs (tree arg0, tree arg1
*** 11501,11507 ****
case 1: /* length is greater than 1, call fwrite. */
{
/* If optimizing for size keep fputs. */
! if (optimize_size)
return NULL_TREE;
/* New argument list transforming fputs(string, stream) to
fwrite(string, 1, len, stream). */
--- 11501,11507 ----
case 1: /* length is greater than 1, call fwrite. */
{
/* If optimizing for size keep fputs. */
! if (optimize_function_for_size_p (cfun))
return NULL_TREE;
/* New argument list transforming fputs(string, stream) to
fwrite(string, 1, len, stream). */
Index: fold-const.c
===================================================================
*** fold-const.c (revision 139737)
--- fold-const.c (working copy)
*************** tree_swap_operands_p (const_tree arg0, c
*** 6679,6685 ****
if (TREE_CONSTANT (arg0))
return 1;
! if (optimize_size)
return 0;
if (reorder && flag_evaluation_order
--- 6679,6685 ----
if (TREE_CONSTANT (arg0))
return 1;
! if (cfun && optimize_function_for_size_p (cfun))
return 0;
if (reorder && flag_evaluation_order
*************** fold_binary (enum tree_code code, tree t
*** 10407,10413 ****
}
/* Optimize x*x as pow(x,2.0), which is expanded as x*x. */
! if (! optimize_size
&& operand_equal_p (arg0, arg1, 0))
{
tree powfn = mathfn_built_in (type, BUILT_IN_POW);
--- 10407,10413 ----
}
/* Optimize x*x as pow(x,2.0), which is expanded as x*x. */
! if (optimize_function_for_speed_p (cfun)
&& operand_equal_p (arg0, arg1, 0))
{
tree powfn = mathfn_built_in (type, BUILT_IN_POW);
Index: reorg.c
===================================================================
*** reorg.c (revision 139737)
--- reorg.c (working copy)
*************** relax_delay_slots (rtx first)
*** 3439,3445 ****
Only do so if optimizing for size since this results in slower, but
smaller code. */
! if (optimize_size
&& GET_CODE (PATTERN (delay_insn)) == RETURN
&& next
&& JUMP_P (next)
--- 3439,3445 ----
Only do so if optimizing for size since this results in slower, but
smaller code. */
! if (optimize_function_for_size_p (cfun)
&& GET_CODE (PATTERN (delay_insn)) == RETURN
&& next
&& JUMP_P (next)
Index: tree-ssa-math-opts.c
===================================================================
*** tree-ssa-math-opts.c (revision 139737)
--- tree-ssa-math-opts.c (working copy)
*************** replace_reciprocal (use_operand_p use_p)
*** 353,359 ****
basic_block bb = gimple_bb (use_stmt);
struct occurrence *occ = (struct occurrence *) bb->aux;
! if (occ->recip_def && use_stmt != occ->recip_def_stmt)
{
gimple_assign_set_rhs_code (use_stmt, MULT_EXPR);
SET_USE (use_p, occ->recip_def);
--- 353,360 ----
basic_block bb = gimple_bb (use_stmt);
struct occurrence *occ = (struct occurrence *) bb->aux;
! if (optimize_bb_for_speed_p (bb)
! && occ->recip_def && use_stmt != occ->recip_def_stmt)
{
gimple_assign_set_rhs_code (use_stmt, MULT_EXPR);
SET_USE (use_p, occ->recip_def);
*************** execute_cse_reciprocals_1 (gimple_stmt_i
*** 445,451 ****
static bool
gate_cse_reciprocals (void)
{
! return optimize && !optimize_size && flag_reciprocal_math;
}
/* Go through all the floating-point SSA_NAMEs, and call
--- 446,452 ----
static bool
gate_cse_reciprocals (void)
{
! return optimize && flag_reciprocal_math;
}
/* Go through all the floating-point SSA_NAMEs, and call
*************** execute_cse_reciprocals (void)
*** 500,505 ****
--- 501,509 ----
execute_cse_reciprocals_1 (&gsi, def);
}
+ if (optimize_bb_for_size_p (bb))
+ continue;
+
/* Scan for a/func(b) and convert it to reciprocal a*rfunc(b). */
for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
Index: ipa-inline.c
===================================================================
*** ipa-inline.c (revision 139737)
--- ipa-inline.c (working copy)
*************** cgraph_decide_recursive_inlining (struct
*** 674,680 ****
int depth = 0;
int n = 0;
! if (optimize_size
|| (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
return false;
--- 674,680 ----
int depth = 0;
int n = 0;
! if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (node->decl))
|| (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
return false;
*************** cgraph_decide_inlining_of_small_function
*** 951,957 ****
if (!flag_inline_functions
&& !DECL_DECLARED_INLINE_P (edge->callee->decl))
not_good = N_("function not declared inline and code size would grow");
! if (optimize_size)
not_good = N_("optimizing for size and code size would grow");
if (not_good && growth > 0 && cgraph_estimate_growth (edge->callee) > 0)
{
--- 951,957 ----
if (!flag_inline_functions
&& !DECL_DECLARED_INLINE_P (edge->callee->decl))
not_good = N_("function not declared inline and code size would grow");
! if (optimize_function_for_size_p (DECL_STRUCT_FUNCTION (edge->caller->decl)))
not_good = N_("optimizing for size and code size would grow");
if (not_good && growth > 0 && cgraph_estimate_growth (edge->callee) > 0)
{
Index: global.c
===================================================================
*** global.c (revision 139737)
--- global.c (working copy)
*************** find_reg (int num, HARD_REG_SET losers,
*** 1168,1175 ****
if (! accept_call_clobbered
&& allocno[num].calls_crossed != 0
&& allocno[num].throwing_calls_crossed == 0
! && CALLER_SAVE_PROFITABLE (optimize_size ? allocno[num].n_refs : allocno[num].freq,
! optimize_size ? allocno[num].calls_crossed
: allocno[num].freq_calls_crossed))
{
HARD_REG_SET new_losers;
--- 1168,1175 ----
if (! accept_call_clobbered
&& allocno[num].calls_crossed != 0
&& allocno[num].throwing_calls_crossed == 0
! && CALLER_SAVE_PROFITABLE (optimize_function_for_size_p (cfun) ? allocno[num].n_refs : allocno[num].freq,
! optimize_function_for_size_p (cfun) ? allocno[num].calls_crossed
: allocno[num].freq_calls_crossed))
{
HARD_REG_SET new_losers;
Index: opts.c
===================================================================
*** opts.c (revision 139737)
--- opts.c (working copy)
*************** decode_options (unsigned int argc, const
*** 990,1001 ****
if (optimize_size)
{
- /* Loop header copying usually increases size of the code. This used not to
- be true, since quite often it is possible to verify that the condition is
- satisfied in the first iteration and therefore to eliminate it. Jump
- threading handles these cases now. */
- flag_tree_ch = 0;
-
/* Conditional DCE generates bigger code. */
flag_tree_builtin_call_dce = 0;
--- 990,995 ----
*************** decode_options (unsigned int argc, const
*** 1004,1011 ****
/* These options are set with -O3, so reset for -Os */
flag_predictive_commoning = 0;
- flag_inline_functions = 0;
- flag_unswitch_loops = 0;
flag_gcse_after_reload = 0;
flag_tree_vectorize = 0;
--- 998,1003 ----
*************** decode_options (unsigned int argc, const
*** 1029,1040 ****
align_labels = 1;
align_functions = 1;
- /* Unroll/prefetch switches that may be set on the command line, and tend to
- generate bigger code. */
- flag_unroll_loops = 0;
- flag_unroll_all_loops = 0;
- flag_prefetch_loop_arrays = 0;
-
/* Basic optimization options. */
optimize_size = 1;
if (optimize > 2)
--- 1021,1026 ----
Index: tree-ssa-loop-ivcanon.c
===================================================================
*** tree-ssa-loop-ivcanon.c (revision 139737)
--- tree-ssa-loop-ivcanon.c (working copy)
*************** tree_unroll_loops_completely (bool may_i
*** 359,365 ****
FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
{
! if (may_increase_size && maybe_hot_bb_p (loop->header)
/* Unroll outermost loops only if asked to do so or they do
not cause code growth. */
&& (unroll_outer
--- 359,365 ----
FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
{
! if (may_increase_size && optimize_loop_for_speed_p (loop)
/* Unroll outermost loops only if asked to do so or they do
not cause code growth. */
&& (unroll_outer
Index: predict.c
===================================================================
*** predict.c (revision 139737)
--- predict.c (working copy)
*************** optimize_function_for_speed_p (struct fu
*** 200,206 ****
/* Return TRUE when BB should be optimized for size. */
bool
! optimize_bb_for_size_p (basic_block bb)
{
return optimize_function_for_size_p (cfun) || !maybe_hot_bb_p (bb);
}
--- 200,206 ----
/* Return TRUE when BB should be optimized for size. */
bool
! optimize_bb_for_size_p (const_basic_block bb)
{
return optimize_function_for_size_p (cfun) || !maybe_hot_bb_p (bb);
}
*************** optimize_bb_for_size_p (basic_block bb)
*** 208,214 ****
/* Return TRUE when BB should be optimized for speed. */
bool
! optimize_bb_for_speed_p (basic_block bb)
{
return !optimize_bb_for_size_p (bb);
}
--- 208,214 ----
/* Return TRUE when BB should be optimized for speed. */
bool
! optimize_bb_for_speed_p (const_basic_block bb)
{
return !optimize_bb_for_size_p (bb);
}
*************** optimize_insn_for_speed_p (void)
*** 245,250 ****
--- 245,266 ----
return !optimize_insn_for_size_p ();
}
+ /* Return TRUE when LOOP should be optimized for size. */
+
+ bool
+ optimize_loop_for_size_p (struct loop *loop)
+ {
+ return optimize_bb_for_size_p (loop->header);
+ }
+
+ /* Return TRUE when LOOP should be optimized for speed. */
+
+ bool
+ optimize_loop_for_speed_p (struct loop *loop)
+ {
+ return optimize_bb_for_speed_p (loop->header);
+ }
+
/* Set RTL expansion for BB profile. */
void
Index: tree-parloops.c
===================================================================
*** tree-parloops.c (revision 139737)
--- tree-parloops.c (working copy)
*************** parallelize_loops (void)
*** 1843,1849 ****
{
htab_empty (reduction_list);
if (/* Do not bother with loops in cold areas. */
! !maybe_hot_bb_p (loop->header)
/* Or loops that roll too little. */
|| expected_loop_iterations (loop) <= n_threads
/* And of course, the loop must be parallelizable. */
--- 1843,1849 ----
{
htab_empty (reduction_list);
if (/* Do not bother with loops in cold areas. */
! optimize_loop_for_size_p (loop)
/* Or loops that roll too little. */
|| expected_loop_iterations (loop) <= n_threads
/* And of course, the loop must be parallelizable. */
Index: tree-eh.c
===================================================================
*** tree-eh.c (revision 139737)
--- tree-eh.c (working copy)
*************** decide_copy_try_finally (int ndests, gim
*** 1535,1541 ****
sw_estimate = 10 + 2 * ndests;
/* Optimize for size clearly wants our best guess. */
! if (optimize_size)
return f_estimate < sw_estimate;
/* ??? These numbers are completely made up so far. */
--- 1535,1541 ----
sw_estimate = 10 + 2 * ndests;
/* Optimize for size clearly wants our best guess. */
! if (optimize_function_for_size_p (cfun))
return f_estimate < sw_estimate;
/* ??? These numbers are completely made up so far. */
Index: local-alloc.c
===================================================================
*** local-alloc.c (revision 139737)
--- local-alloc.c (working copy)
*************** static int contains_replace_regs (rtx);
*** 299,305 ****
static int memref_referenced_p (rtx, rtx);
static int memref_used_between_p (rtx, rtx, rtx);
static void no_equiv (rtx, const_rtx, void *);
! static void block_alloc (int);
static int qty_sugg_compare (int, int);
static int qty_sugg_compare_1 (const void *, const void *);
static int qty_compare (int, int);
--- 299,305 ----
static int memref_referenced_p (rtx, rtx);
static int memref_used_between_p (rtx, rtx, rtx);
static void no_equiv (rtx, const_rtx, void *);
! static void block_alloc (basic_block);
static int qty_sugg_compare (int, int);
static int qty_sugg_compare_1 (const void *, const void *);
static int qty_compare (int, int);
*************** static void reg_is_set (rtx, const_rtx,
*** 311,317 ****
static void reg_is_born (rtx, int);
static void wipe_dead_reg (rtx, int);
static int find_free_reg (enum reg_class, enum machine_mode, int, int, int,
! int, int);
static void mark_life (int, enum machine_mode, int);
static void post_mark_life (int, enum machine_mode, int, int, int);
static int requires_inout (const char *);
--- 311,317 ----
static void reg_is_born (rtx, int);
static void wipe_dead_reg (rtx, int);
static int find_free_reg (enum reg_class, enum machine_mode, int, int, int,
! int, int, basic_block);
static void mark_life (int, enum machine_mode, int);
static void post_mark_life (int, enum machine_mode, int, int, int);
static int requires_inout (const char *);
*************** local_alloc (void)
*** 436,442 ****
next_qty = 0;
! block_alloc (b->index);
}
free (qty);
--- 436,442 ----
next_qty = 0;
! block_alloc (b);
}
free (qty);
*************** no_equiv (rtx reg, const_rtx store ATTRI
*** 1270,1276 ****
Only the pseudos that die but once can be handled. */
static void
! block_alloc (int b)
{
int i, q;
rtx insn;
--- 1270,1276 ----
Only the pseudos that die but once can be handled. */
static void
! block_alloc (basic_block b)
{
int i, q;
rtx insn;
*************** block_alloc (int b)
*** 1283,1289 ****
/* Count the instructions in the basic block. */
! insn = BB_END (BASIC_BLOCK (b));
while (1)
{
if (!NOTE_P (insn))
--- 1283,1289 ----
/* Count the instructions in the basic block. */
! insn = BB_END (b);
while (1)
{
if (!NOTE_P (insn))
*************** block_alloc (int b)
*** 1291,1297 ****
++insn_count;
gcc_assert (insn_count <= max_uid);
}
! if (insn == BB_HEAD (BASIC_BLOCK (b)))
break;
insn = PREV_INSN (insn);
}
--- 1291,1297 ----
++insn_count;
gcc_assert (insn_count <= max_uid);
}
! if (insn == BB_HEAD (b))
break;
insn = PREV_INSN (insn);
}
*************** block_alloc (int b)
*** 1302,1315 ****
/* Initialize table of hardware registers currently live. */
! REG_SET_TO_HARD_REG_SET (regs_live, DF_LR_IN (BASIC_BLOCK (b)));
/* This is conservative, as this would include registers that are
artificial-def'ed-but-not-used. However, artificial-defs are
rare, and such uninitialized use is rarer still, and the chance
of this having any performance impact is even less, while the
benefit is not having to compute and keep the TOP set around. */
! for (def_rec = df_get_artificial_defs (b); *def_rec; def_rec++)
{
int regno = DF_REF_REGNO (*def_rec);
if (regno < FIRST_PSEUDO_REGISTER)
--- 1302,1315 ----
/* Initialize table of hardware registers currently live. */
! REG_SET_TO_HARD_REG_SET (regs_live, DF_LR_IN (b));
/* This is conservative, as this would include registers that are
artificial-def'ed-but-not-used. However, artificial-defs are
rare, and such uninitialized use is rarer still, and the chance
of this having any performance impact is even less, while the
benefit is not having to compute and keep the TOP set around. */
! for (def_rec = df_get_artificial_defs (b->index); *def_rec; def_rec++)
{
int regno = DF_REF_REGNO (*def_rec);
if (regno < FIRST_PSEUDO_REGISTER)
*************** block_alloc (int b)
*** 1320,1326 ****
and assigns quantities to registers.
It computes which registers to tie. */
! insn = BB_HEAD (BASIC_BLOCK (b));
while (1)
{
if (!NOTE_P (insn))
--- 1320,1326 ----
and assigns quantities to registers.
It computes which registers to tie. */
! insn = BB_HEAD (b);
while (1)
{
if (!NOTE_P (insn))
*************** block_alloc (int b)
*** 1487,1493 ****
IOR_HARD_REG_SET (regs_live_at[2 * insn_number], regs_live);
IOR_HARD_REG_SET (regs_live_at[2 * insn_number + 1], regs_live);
! if (insn == BB_END (BASIC_BLOCK (b)))
break;
insn = NEXT_INSN (insn);
--- 1487,1493 ----
IOR_HARD_REG_SET (regs_live_at[2 * insn_number], regs_live);
IOR_HARD_REG_SET (regs_live_at[2 * insn_number + 1], regs_live);
! if (insn == BB_END (b))
break;
insn = NEXT_INSN (insn);
*************** block_alloc (int b)
*** 1542,1548 ****
q = qty_order[i];
if (qty_phys_num_sugg[q] != 0 || qty_phys_num_copy_sugg[q] != 0)
qty[q].phys_reg = find_free_reg (qty[q].min_class, qty[q].mode, q,
! 0, 1, qty[q].birth, qty[q].death);
else
qty[q].phys_reg = -1;
}
--- 1542,1548 ----
q = qty_order[i];
if (qty_phys_num_sugg[q] != 0 || qty_phys_num_copy_sugg[q] != 0)
qty[q].phys_reg = find_free_reg (qty[q].min_class, qty[q].mode, q,
! 0, 1, qty[q].birth, qty[q].death, b);
else
qty[q].phys_reg = -1;
}
*************** block_alloc (int b)
*** 1627,1645 ****
a scheduling pass after reload and we are not optimizing
for code size. */
if (flag_schedule_insns_after_reload && dbg_cnt (local_alloc_for_sched)
! && !optimize_size
&& !SMALL_REGISTER_CLASSES)
{
qty[q].phys_reg = find_free_reg (qty[q].min_class,
qty[q].mode, q, 0, 0,
! fake_birth, fake_death);
if (qty[q].phys_reg >= 0)
continue;
}
#endif
qty[q].phys_reg = find_free_reg (qty[q].min_class,
qty[q].mode, q, 0, 0,
! qty[q].birth, qty[q].death);
if (qty[q].phys_reg >= 0)
continue;
}
--- 1627,1645 ----
a scheduling pass after reload and we are not optimizing
for code size. */
if (flag_schedule_insns_after_reload && dbg_cnt (local_alloc_for_sched)
! && optimize_bb_for_speed_p (b)
&& !SMALL_REGISTER_CLASSES)
{
qty[q].phys_reg = find_free_reg (qty[q].min_class,
qty[q].mode, q, 0, 0,
! fake_birth, fake_death, b);
if (qty[q].phys_reg >= 0)
continue;
}
#endif
qty[q].phys_reg = find_free_reg (qty[q].min_class,
qty[q].mode, q, 0, 0,
! qty[q].birth, qty[q].death, b);
if (qty[q].phys_reg >= 0)
continue;
}
*************** block_alloc (int b)
*** 1647,1663 ****
#ifdef INSN_SCHEDULING
/* Similarly, avoid false dependencies. */
if (flag_schedule_insns_after_reload && dbg_cnt (local_alloc_for_sched)
! && !optimize_size
&& !SMALL_REGISTER_CLASSES
&& qty[q].alternate_class != NO_REGS)
qty[q].phys_reg = find_free_reg (qty[q].alternate_class,
qty[q].mode, q, 0, 0,
! fake_birth, fake_death);
#endif
if (qty[q].alternate_class != NO_REGS)
qty[q].phys_reg = find_free_reg (qty[q].alternate_class,
qty[q].mode, q, 0, 0,
! qty[q].birth, qty[q].death);
}
}
--- 1647,1663 ----
#ifdef INSN_SCHEDULING
/* Similarly, avoid false dependencies. */
if (flag_schedule_insns_after_reload && dbg_cnt (local_alloc_for_sched)
! && optimize_bb_for_speed_p (b)
&& !SMALL_REGISTER_CLASSES
&& qty[q].alternate_class != NO_REGS)
qty[q].phys_reg = find_free_reg (qty[q].alternate_class,
qty[q].mode, q, 0, 0,
! fake_birth, fake_death, b);
#endif
if (qty[q].alternate_class != NO_REGS)
qty[q].phys_reg = find_free_reg (qty[q].alternate_class,
qty[q].mode, q, 0, 0,
! qty[q].birth, qty[q].death, b);
}
}
*************** wipe_dead_reg (rtx reg, int output_p)
*** 2145,2151 ****
static int
find_free_reg (enum reg_class rclass, enum machine_mode mode, int qtyno,
int accept_call_clobbered, int just_try_suggested,
! int born_index, int dead_index)
{
int i, ins;
HARD_REG_SET first_used, used;
--- 2145,2151 ----
static int
find_free_reg (enum reg_class rclass, enum machine_mode mode, int qtyno,
int accept_call_clobbered, int just_try_suggested,
! int born_index, int dead_index, basic_block bb)
{
int i, ins;
HARD_REG_SET first_used, used;
*************** find_free_reg (enum reg_class rclass, en
*** 2261,2267 ****
/* Don't try the copy-suggested regs again. */
qty_phys_num_copy_sugg[qtyno] = 0;
return find_free_reg (rclass, mode, qtyno, accept_call_clobbered, 1,
! born_index, dead_index);
}
/* We need not check to see if the current function has nonlocal
--- 2261,2267 ----
/* Don't try the copy-suggested regs again. */
qty_phys_num_copy_sugg[qtyno] = 0;
return find_free_reg (rclass, mode, qtyno, accept_call_clobbered, 1,
! born_index, dead_index, bb);
}
/* We need not check to see if the current function has nonlocal
*************** find_free_reg (enum reg_class rclass, en
*** 2274,2284 ****
&& ! just_try_suggested
&& qty[qtyno].n_calls_crossed != 0
&& qty[qtyno].n_throwing_calls_crossed == 0
! && CALLER_SAVE_PROFITABLE (optimize_size ? qty[qtyno].n_refs : qty[qtyno].freq,
! optimize_size ? qty[qtyno].n_calls_crossed
: qty[qtyno].freq_calls_crossed))
{
! i = find_free_reg (rclass, mode, qtyno, 1, 0, born_index, dead_index);
if (i >= 0)
caller_save_needed = 1;
return i;
--- 2274,2285 ----
&& ! just_try_suggested
&& qty[qtyno].n_calls_crossed != 0
&& qty[qtyno].n_throwing_calls_crossed == 0
! && CALLER_SAVE_PROFITABLE (optimize_bb_for_size_p (bb) ? qty[qtyno].n_refs
! : qty[qtyno].freq,
! optimize_bb_for_size_p (bb) ? qty[qtyno].n_calls_crossed
: qty[qtyno].freq_calls_crossed))
{
! i = find_free_reg (rclass, mode, qtyno, 1, 0, born_index, dead_index, bb);
if (i >= 0)
caller_save_needed = 1;
return i;
Index: gcse.c
===================================================================
*** gcse.c (revision 139737)
--- gcse.c (working copy)
*************** gcse_main (rtx f ATTRIBUTE_UNUSED)
*** 738,746 ****
timevar_pop (TV_CPROP1);
}
! if (optimize_size)
! /* Do nothing. */ ;
! else
{
timevar_push (TV_PRE);
changed |= one_pre_gcse_pass (pass + 1);
--- 738,744 ----
timevar_pop (TV_CPROP1);
}
! if (optimize_function_for_speed_p (cfun))
{
timevar_push (TV_PRE);
changed |= one_pre_gcse_pass (pass + 1);
*************** gcse_main (rtx f ATTRIBUTE_UNUSED)
*** 773,779 ****
for code size -- it rarely makes programs faster, and can make
them bigger if we did partial redundancy elimination (when optimizing
for space, we don't run the partial redundancy algorithms). */
! if (optimize_size)
{
timevar_push (TV_HOIST);
max_gcse_regno = max_reg_num ();
--- 771,777 ----
for code size -- it rarely makes programs faster, and can make
them bigger if we did partial redundancy elimination (when optimizing
for space, we don't run the partial redundancy algorithms). */
! if (optimize_function_for_size_p (cfun))
{
timevar_push (TV_HOIST);
max_gcse_regno = max_reg_num ();
*************** gcse_main (rtx f ATTRIBUTE_UNUSED)
*** 825,831 ****
/* We are finished with alias. */
end_alias_analysis ();
! if (!optimize_size && flag_gcse_sm)
{
timevar_push (TV_LSM);
store_motion ();
--- 823,829 ----
/* We are finished with alias. */
end_alias_analysis ();
! if (optimize_function_for_speed_p (cfun) && flag_gcse_sm)
{
timevar_push (TV_LSM);
store_motion ();
Index: loop-unroll.c
===================================================================
*** loop-unroll.c (revision 139737)
--- loop-unroll.c (working copy)
*************** decide_unrolling_and_peeling (int flags)
*** 269,275 ****
fprintf (dump_file, "\n;; *** Considering loop %d ***\n", loop->num);
/* Do not peel cold areas. */
! if (!maybe_hot_bb_p (loop->header))
{
if (dump_file)
fprintf (dump_file, ";; Not considering loop, cold area\n");
--- 269,275 ----
fprintf (dump_file, "\n;; *** Considering loop %d ***\n", loop->num);
/* Do not peel cold areas. */
! if (optimize_loop_for_size_p (loop))
{
if (dump_file)
fprintf (dump_file, ";; Not considering loop, cold area\n");
*************** decide_peel_completely (struct loop *loo
*** 368,374 ****
}
/* Do not peel cold areas. */
! if (!maybe_hot_bb_p (loop->header))
{
if (dump_file)
fprintf (dump_file, ";; Not considering loop, cold area\n");
--- 368,374 ----
}
/* Do not peel cold areas. */
! if (optimize_loop_for_size_p (loop))
{
if (dump_file)
fprintf (dump_file, ";; Not considering loop, cold area\n");
Index: tree-vect-analyze.c
===================================================================
*** tree-vect-analyze.c (revision 139737)
--- tree-vect-analyze.c (working copy)
*************** vect_mark_for_runtime_alias_test (ddr_p
*** 1219,1225 ****
print_generic_expr (vect_dump, DR_REF (DDR_B (ddr)), TDF_SLIM);
}
! if (optimize_size)
{
if (vect_print_dump_info (REPORT_DR_DETAILS))
fprintf (vect_dump, "versioning not supported when optimizing for size.");
--- 1219,1225 ----
print_generic_expr (vect_dump, DR_REF (DDR_B (ddr)), TDF_SLIM);
}
! if (optimize_loop_for_size_p (loop))
{
if (vect_print_dump_info (REPORT_DR_DETAILS))
fprintf (vect_dump, "versioning not supported when optimizing for size.");
*************** vect_enhance_data_refs_alignment (loop_v
*** 1993,1999 ****
/* Try versioning if:
1) flag_tree_vect_loop_version is TRUE
! 2) optimize_size is FALSE
3) there is at least one unsupported misaligned data ref with an unknown
misalignment, and
4) all misaligned data refs with a known misalignment are supported, and
--- 1993,1999 ----
/* Try versioning if:
1) flag_tree_vect_loop_version is TRUE
! 2) optimize loop for speed
3) there is at least one unsupported misaligned data ref with an unknown
misalignment, and
4) all misaligned data refs with a known misalignment are supported, and
*************** vect_enhance_data_refs_alignment (loop_v
*** 2001,2007 ****
do_versioning =
flag_tree_vect_loop_version
! && (!optimize_size)
&& (!loop->inner); /* FORNOW */
if (do_versioning)
--- 2001,2007 ----
do_versioning =
flag_tree_vect_loop_version
! && optimize_loop_for_speed_p (loop)
&& (!loop->inner); /* FORNOW */
if (do_versioning)
Index: tree-ssa-coalesce.c
===================================================================
*** tree-ssa-coalesce.c (revision 139737)
--- tree-ssa-coalesce.c (working copy)
*************** typedef struct coalesce_list_d
*** 75,81 ****
possibly on CRITICAL edge and in HOT basic block. */
static inline int
! coalesce_cost (int frequency, bool hot, bool critical)
{
/* Base costs on BB frequencies bounded by 1. */
int cost = frequency;
--- 75,81 ----
possibly on CRITICAL edge and in HOT basic block. */
static inline int
! coalesce_cost (int frequency, bool optimize_for_size, bool critical)
{
/* Base costs on BB frequencies bounded by 1. */
int cost = frequency;
*************** coalesce_cost (int frequency, bool hot,
*** 83,94 ****
if (!cost)
cost = 1;
! if (optimize_size)
cost = 1;
- else
- /* It is more important to coalesce in HOT blocks. */
- if (hot)
- cost *= 2;
/* Inserting copy on critical edge costs more than inserting it elsewhere. */
if (critical)
--- 83,90 ----
if (!cost)
cost = 1;
! if (optimize_for_size)
cost = 1;
/* Inserting copy on critical edge costs more than inserting it elsewhere. */
if (critical)
*************** coalesce_cost (int frequency, bool hot,
*** 102,108 ****
static inline int
coalesce_cost_bb (basic_block bb)
{
! return coalesce_cost (bb->frequency, maybe_hot_bb_p (bb), false);
}
--- 98,104 ----
static inline int
coalesce_cost_bb (basic_block bb)
{
! return coalesce_cost (bb->frequency, optimize_bb_for_size_p (bb), false);
}
*************** coalesce_cost_edge (edge e)
*** 115,121 ****
return MUST_COALESCE_COST;
return coalesce_cost (EDGE_FREQUENCY (e),
! maybe_hot_edge_p (e),
EDGE_CRITICAL_P (e));
}
--- 111,117 ----
return MUST_COALESCE_COST;
return coalesce_cost (EDGE_FREQUENCY (e),
! optimize_edge_for_size_p (e),
EDGE_CRITICAL_P (e));
}
*************** create_outofssa_var_map (coalesce_list_p
*** 1099,1105 ****
if (SSA_NAME_VAR (outputs[match]) == SSA_NAME_VAR (input))
{
cost = coalesce_cost (REG_BR_PROB_BASE,
! maybe_hot_bb_p (bb),
false);
add_coalesce (cl, v1, v2, cost);
bitmap_set_bit (used_in_copy, v1);
--- 1095,1101 ----
if (SSA_NAME_VAR (outputs[match]) == SSA_NAME_VAR (input))
{
cost = coalesce_cost (REG_BR_PROB_BASE,
! optimize_bb_for_size_p (bb),
false);
add_coalesce (cl, v1, v2, cost);
bitmap_set_bit (used_in_copy, v1);
Index: cfgcleanup.c
===================================================================
*** cfgcleanup.c (revision 139737)
--- cfgcleanup.c (working copy)
*************** outgoing_edges_match (int mode, basic_bl
*** 1235,1243 ****
we require the existing branches to have probabilities that are
roughly similar. */
if (match
! && !optimize_size
! && maybe_hot_bb_p (bb1)
! && maybe_hot_bb_p (bb2))
{
int prob2;
--- 1235,1242 ----
we require the existing branches to have probabilities that are
roughly similar. */
if (match
! && optimize_bb_for_speed_p (bb1)
! && optimize_bb_for_speed_p (bb2))
{
int prob2;
*************** try_crossjump_bb (int mode, basic_block
*** 1684,1690 ****
/* Don't crossjump if this block ends in a computed jump,
unless we are optimizing for size. */
! if (!optimize_size
&& bb != EXIT_BLOCK_PTR
&& computed_jump_p (BB_END (bb)))
return false;
--- 1683,1689 ----
/* Don't crossjump if this block ends in a computed jump,
unless we are optimizing for size. */
! if (optimize_bb_for_speed_p (bb)
&& bb != EXIT_BLOCK_PTR
&& computed_jump_p (BB_END (bb)))
return false;
Index: tree-ssa-loop-prefetch.c
===================================================================
*** tree-ssa-loop-prefetch.c (revision 139737)
--- tree-ssa-loop-prefetch.c (working copy)
*************** loop_prefetch_arrays (struct loop *loop)
*** 1460,1466 ****
struct tree_niter_desc desc;
bool unrolled = false, no_other_refs;
! if (!maybe_hot_bb_p (loop->header))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " ignored (cold area)\n");
--- 1460,1466 ----
struct tree_niter_desc desc;
bool unrolled = false, no_other_refs;
! if (optimize_loop_for_size_p (loop))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " ignored (cold area)\n");
Index: bb-reorder.c
===================================================================
*** bb-reorder.c (revision 139737)
--- bb-reorder.c (working copy)
*************** find_traces_1_round (int branch_th, int
*** 648,654 ****
/* The loop has less than 4 iterations. */
if (single_succ_p (bb)
! && copy_bb_p (best_edge->dest, !optimize_size))
{
bb = copy_bb (best_edge->dest, best_edge, bb,
*n_traces);
--- 648,655 ----
/* The loop has less than 4 iterations. */
if (single_succ_p (bb)
! && copy_bb_p (best_edge->dest,
! optimize_edge_for_speed_p (best_edge)))
{
bb = copy_bb (best_edge->dest, best_edge, bb,
*n_traces);
*************** connect_traces (int n_traces, struct tra
*** 1102,1108 ****
edge is traversed frequently enough. */
if (try_copy
&& copy_bb_p (best->dest,
! !optimize_size
&& EDGE_FREQUENCY (best) >= freq_threshold
&& best->count >= count_threshold))
{
--- 1103,1109 ----
edge is traversed frequently enough. */
if (try_copy
&& copy_bb_p (best->dest,
! optimize_edge_for_speed_p (best)
&& EDGE_FREQUENCY (best) >= freq_threshold
&& best->count >= count_threshold))
{
*************** copy_bb_p (const_basic_block bb, int cod
*** 1173,1179 ****
if (EDGE_COUNT (bb->succs) > 8)
return false;
! if (code_may_grow && maybe_hot_bb_p (bb))
max_size *= PARAM_VALUE (PARAM_MAX_GROW_COPY_BB_INSNS);
FOR_BB_INSNS (bb, insn)
--- 1174,1180 ----
if (EDGE_COUNT (bb->succs) > 8)
return false;
! if (code_may_grow && optimize_bb_for_speed_p (bb))
max_size *= PARAM_VALUE (PARAM_MAX_GROW_COPY_BB_INSNS);
FOR_BB_INSNS (bb, insn)
*************** gate_duplicate_computed_gotos (void)
*** 1984,1990 ****
{
if (targetm.cannot_modify_jumps_p ())
return false;
! return (optimize > 0 && flag_expensive_optimizations && !optimize_size);
}
--- 1985,1991 ----
{
if (targetm.cannot_modify_jumps_p ())
return false;
! return (optimize > 0 && flag_expensive_optimizations);
}
*************** duplicate_computed_gotos (void)
*** 2075,2080 ****
--- 2076,2084 ----
|| single_pred_p (single_succ (bb)))
continue;
+ if (optimize_bb_for_size_p (bb))
+ continue;
+
/* The successor block has to be a duplication candidate. */
if (!bitmap_bit_p (candidates, single_succ (bb)->index))
continue;
Index: basic-block.h
===================================================================
*** basic-block.h (revision 139737)
--- basic-block.h (working copy)
*************** extern bool maybe_hot_bb_p (const_basic_
*** 831,844 ****
extern bool maybe_hot_edge_p (edge);
extern bool probably_cold_bb_p (const_basic_block);
extern bool probably_never_executed_bb_p (const_basic_block);
! extern bool optimize_bb_for_size_p (basic_block);
! extern bool optimize_bb_for_speed_p (basic_block);
extern bool optimize_edge_for_size_p (edge);
extern bool optimize_edge_for_speed_p (edge);
extern bool optimize_insn_for_size_p (void);
extern bool optimize_insn_for_speed_p (void);
extern bool optimize_function_for_size_p (struct function *);
extern bool optimize_function_for_speed_p (struct function *);
extern bool gimple_predicted_by_p (const_basic_block, enum br_predictor);
extern bool rtl_predicted_by_p (const_basic_block, enum br_predictor);
extern void gimple_predict_edge (edge, enum br_predictor, int);
--- 831,846 ----
extern bool maybe_hot_edge_p (edge);
extern bool probably_cold_bb_p (const_basic_block);
extern bool probably_never_executed_bb_p (const_basic_block);
! extern bool optimize_bb_for_size_p (const_basic_block);
! extern bool optimize_bb_for_speed_p (const_basic_block);
extern bool optimize_edge_for_size_p (edge);
extern bool optimize_edge_for_speed_p (edge);
extern bool optimize_insn_for_size_p (void);
extern bool optimize_insn_for_speed_p (void);
extern bool optimize_function_for_size_p (struct function *);
extern bool optimize_function_for_speed_p (struct function *);
+ extern bool optimize_loop_for_size_p (struct loop *);
+ extern bool optimize_loop_for_speed_p (struct loop *);
extern bool gimple_predicted_by_p (const_basic_block, enum br_predictor);
extern bool rtl_predicted_by_p (const_basic_block, enum br_predictor);
extern void gimple_predict_edge (edge, enum br_predictor, int);
Index: stmt.c
===================================================================
*** stmt.c (revision 139737)
--- stmt.c (working copy)
*************** expand_case (tree exp)
*** 2419,2425 ****
else if (count < case_values_threshold ()
|| compare_tree_int (range,
! (optimize_size ? 3 : 10) * count) > 0
/* RANGE may be signed, and really large ranges will show up
as negative numbers. */
|| compare_tree_int (range, 0) < 0
--- 2419,2425 ----
else if (count < case_values_threshold ()
|| compare_tree_int (range,
! (optimize_insn_for_size_p () ? 3 : 10) * count) > 0
/* RANGE may be signed, and really large ranges will show up
as negative numbers. */
|| compare_tree_int (range, 0) < 0
*************** expand_case (tree exp)
*** 2489,2495 ****
/* Index jumptables from zero for suitable values of
minval to avoid a subtraction. */
! if (! optimize_size
&& compare_tree_int (minval, 0) > 0
&& compare_tree_int (minval, 3) < 0)
{
--- 2489,2495 ----
/* Index jumptables from zero for suitable values of
minval to avoid a subtraction. */
! if (optimize_insn_for_speed_p ()
&& compare_tree_int (minval, 0) > 0
&& compare_tree_int (minval, 3) < 0)
{