[RFC/RFA] Interface for profile info and size optimization

Richard Guenther richard.guenther@gmail.com
Sun Jun 15 22:55:00 GMT 2008


On Fri, Jun 6, 2008 at 7:09 AM, Jan Hubicka <jh@suse.cz> wrote:
> Hi,
> this patch should make the profile interface easier to use.  As discussed a
> few months ago (http://gcc.gnu.org/ml/gcc-patches/2006-10/msg01371.html), the
> current predicates provided by the profile infrastructure (maybe_hot,
> probably_cold, probably_never_executed) are somewhat hard to map to the
> question we are usually interested in, namely whether to optimize for size
> or speed.
>
> This patch adds optimize_for_size_p and optimize_for_speed_p predicates for
> BBs, edges and RTL expansion that implement the following logic: everything
> that might be hot is optimized for speed unless -Os is specified or the
> function is explicitly marked as cold.
>
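
To make the rule above concrete, here is a minimal standalone sketch of the
decision logic.  It is only a model: the `model_*` structs, fields and
functions are made up for illustration and are not GCC types or GCC API.
The booleans stand in for -Os, for a function marked cold, and for the
result of maybe_hot_bb_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for compiler state; not real GCC types.  */
struct model_fn
{
  bool optimize_size;     /* Models -Os.  */
  bool unlikely_executed; /* Models a function explicitly marked cold.  */
};

struct model_bb
{
  bool maybe_hot; /* Models the answer maybe_hot_bb_p would give.  */
};

/* Size wins for the whole function under -Os or when it is marked cold.  */
bool
model_always_optimize_for_size_p (const struct model_fn *fn)
{
  return fn->optimize_size || fn->unlikely_executed;
}

/* Otherwise only blocks that cannot be hot are optimized for size.  */
bool
model_optimize_bb_for_size_p (const struct model_fn *fn,
                              const struct model_bb *bb)
{
  return model_always_optimize_for_size_p (fn) || !bb->maybe_hot;
}

/* Speed is simply the negation of size.  */
bool
model_optimize_bb_for_speed_p (const struct model_fn *fn,
                               const struct model_bb *bb)
{
  return !model_optimize_bb_for_size_p (fn, bb);
}
```

In this model a possibly hot block is optimized for speed only when neither
function-wide condition holds; every other combination falls back to size.
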
> There are contexts where we are really interested in the notion of
> coldness/hotness (i.e. when the tradeoff does not involve code size, just
> optimizing one path at the cost of another), so I am keeping both interfaces.
>
> For RTL I've also added the maybe_hot_insn_p global flag.  This was
> discussed in the past and Roger suggested that a global current_bb pointer
> would be a better interface.  I still think that is a mistake, since the
> current basic block is either wrong or inaccurate for several reasons:
>
>  1) During expansion current_bb points to the tree-level basic block, not
>    to the basic block the instruction will end up in after splitting.  Cold
>    sections of expanded tree constructs would then be considered hot.
>    With a separate flag, we can change it based on knowledge of what we are
>    expanding now (when expanding, for instance, a switch or string operation).
>  2) We emit instructions on edges, where the current basic block is not defined.
>  3) Low-level RTL bits are not aware of the CFG and will probably stay that way.
>
> So instead of current_bb, I've tried to hide the information behind the API
> rtl_profile_for_bb, rtl_profile_for_edge and default_rtl_profile, so if we
> decide we want to pass more information currently present in the CFG than
> just the hotness bits, we can easily do that in those functions.
>
> The basic idea is that profile-aware passes will use the proper set function
> and reset the profile when done.  I would like to update all existing
> passes to be profile aware if we settle on the API.  It should not
> be that difficult.
>
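
The set-then-reset discipline described above can be sketched as follows.
This is a self-contained model, not GCC code: the global
`model_maybe_hot_insn_p` stands in for crtl->maybe_hot_insn_p, the
`model_*` functions mirror the names in the patch, and
`model_pass_count_size_blocks` is a hypothetical pass invented for the
example.  The function-wide -Os/cold check is left out to keep the sketch
short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-function flag (crtl->maybe_hot_insn_p in
   the patch).  The default is "hot", i.e. profile not known.  */
bool model_maybe_hot_insn_p = true;

struct model_bb
{
  bool maybe_hot; /* Models the answer maybe_hot_bb_p would give.  */
};

/* Set the expansion profile from a block.  */
void
model_rtl_profile_for_bb (const struct model_bb *bb)
{
  model_maybe_hot_insn_p = bb->maybe_hot;
}

/* Reset to the conservative default.  */
void
model_default_rtl_profile (void)
{
  model_maybe_hot_insn_p = true;
}

/* Simplified query: ignores the function-wide -Os/cold check.  */
bool
model_optimize_insn_for_size_p (void)
{
  return !model_maybe_hot_insn_p;
}

/* A made-up profile-aware pass: set the profile per block, query it
   while "emitting", and restore the default before returning.  */
int
model_pass_count_size_blocks (const struct model_bb *bbs, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    {
      model_rtl_profile_for_bb (&bbs[i]);
      if (model_optimize_insn_for_size_p ())
        count++;
    }
  model_default_rtl_profile ();
  return count;
}
```

The point of the final reset is that code running after the pass, which may
not be profile aware, sees the conservative "hot" default again rather than
whatever the last block happened to set.
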
> Bootstrapped/regtested i686-linux, OK?

This looks reasonable, but I'd prefer the simple accessors in predict.c to
move to predict.h.
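
As a rough illustration of what I mean, the one-line negating accessors are
the kind of thing that could be `static inline` in the header.  The sketch
below is standalone and uses made-up `model_*` names and a stand-in struct
so it compiles on its own; it is not a proposal for the exact declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in so the sketch compiles on its own; not a GCC type.  */
struct model_bb
{
  bool maybe_hot;
};

/* Would stay out of line in the .c file; modelled trivially here.  */
bool
model_bb_for_size_p (const struct model_bb *bb)
{
  return !bb->maybe_hot;
}

/* The kind of trivial accessor that could live in the header, so callers
   get the negation inlined without an extra call.  */
static inline bool
model_bb_for_speed_p (const struct model_bb *bb)
{
  return !model_bb_for_size_p (bb);
}
```
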

Please wait for additional comments,

Thanks,
Richard.

>        * predict.c (always_optimize_for_size_p): New function.
>        (optimize_bb_for_size_p, optimize_bb_for_speed_p,
>        optimize_edge_for_size_p, optimize_edge_for_speed_p,
>        optimize_insn_for_size_p, optimize_insn_for_speed_p): New global
>        functions.
>        (rtl_profile_for_bb, rtl_profile_for_edge, default_rtl_profile): New.
>        * function.c (prepare_function_start): Set default profile.
>        * function.h (rtl_data): Add maybe_hot_insn_p.
>        * cfgexpand.c (expand_gimple_basic_block): Set RTL profile.
>        (construct_exit_block): Likewise.
>        (tree_expand_cfg): Likewise.
>        * basic-block.h
>        (optimize_bb_for_size_p, optimize_bb_for_speed_p,
>        optimize_edge_for_size_p, optimize_edge_for_speed_p,
>        optimize_insn_for_size_p, optimize_insn_for_speed_p): Declare.
>        (rtl_profile_for_bb, rtl_profile_for_edge, default_rtl_profile):
>        Declare.
> Index: predict.c
> ===================================================================
> *** predict.c   (revision 136426)
> --- predict.c   (working copy)
> *************** probably_never_executed_bb_p (const_basi
> *** 178,183 ****
> --- 178,263 ----
>    return false;
>  }
>
> + /* Return true when current function should always be optimized for size.  */
> +
> + static bool
> + always_optimize_for_size_p (void)
> + {
> +   return (optimize_size
> +         || cfun->function_frequency == FUNCTION_FREQUENCY_UNLIKELY_EXECUTED);
> + }
> +
> + /* Return TRUE when BB should be optimized for size.  */
> +
> + bool
> + optimize_bb_for_size_p (basic_block bb)
> + {
> +   return always_optimize_for_size_p () || !maybe_hot_bb_p (bb);
> + }
> +
> + /* Return TRUE when BB should be optimized for speed.  */
> +
> + bool
> + optimize_bb_for_speed_p (basic_block bb)
> + {
> +   return !optimize_bb_for_size_p (bb);
> + }
> +
> + /* Return TRUE when edge E should be optimized for size.  */
> +
> + bool
> + optimize_edge_for_size_p (edge e)
> + {
> +   return always_optimize_for_size_p () || !maybe_hot_edge_p (e);
> + }
> +
> + /* Return TRUE when edge E should be optimized for speed.  */
> +
> + bool
> + optimize_edge_for_speed_p (edge e)
> + {
> +   return !optimize_edge_for_size_p (e);
> + }
> +
> + /* Return TRUE when the current insn should be optimized for size.  */
> +
> + bool
> + optimize_insn_for_size_p (void)
> + {
> +   return always_optimize_for_size_p () || !crtl->maybe_hot_insn_p;
> + }
> +
> + /* Return TRUE when the current insn should be optimized for speed.  */
> +
> + bool
> + optimize_insn_for_speed_p (void)
> + {
> +   return !optimize_insn_for_size_p ();
> + }
> +
> + /* Set RTL expansion for BB profile.  */
> +
> + void
> + rtl_profile_for_bb (basic_block bb)
> + {
> +   crtl->maybe_hot_insn_p = maybe_hot_bb_p (bb);
> + }
> +
> + /* Set RTL expansion for edge profile.  */
> +
> + void
> + rtl_profile_for_edge (edge e)
> + {
> +   crtl->maybe_hot_insn_p = maybe_hot_edge_p (e);
> + }
> +
> + /* Set RTL expansion to default mode (i.e. when profile info is not known).  */
> + void
> + default_rtl_profile (void)
> + {
> +   crtl->maybe_hot_insn_p = true;
> + }
> +
>  /* Return true if the one of outgoing edges is already predicted by
>     PREDICTOR.  */
>
> Index: function.c
> ===================================================================
> *** function.c  (revision 136426)
> --- function.c  (working copy)
> *************** prepare_function_start (void)
> *** 3908,3913 ****
> --- 3908,3914 ----
>    init_emit ();
>    init_varasm_status ();
>    init_expr ();
> +   default_rtl_profile ();
>
>    cse_not_expected = ! optimize;
>
> Index: function.h
> ===================================================================
> *** function.h  (revision 136426)
> --- function.h  (working copy)
> *************** struct rtl_data GTY(())
> *** 397,402 ****
> --- 397,405 ----
>       Set in stmt.c if anything is allocated on the stack there.
>       Set in reload1.c if anything is allocated on the stack there.  */
>    bool frame_pointer_needed;
> +
> +   /* When set, expand should optimize for speed.  */
> +   bool maybe_hot_insn_p;
>  };
>
>  #define return_label (crtl->x_return_label)
> Index: cfgexpand.c
> ===================================================================
> *** cfgexpand.c (revision 136426)
> --- cfgexpand.c (working copy)
> *************** expand_gimple_basic_block (basic_block b
> *** 1478,1483 ****
> --- 1478,1484 ----
>      }
>
>    bb->il.tree = NULL;
> +   rtl_profile_for_bb (bb);
>    init_rtl_bb_info (bb);
>    bb->flags |= BB_RTL;
>
> *************** construct_exit_block (void)
> *** 1710,1715 ****
> --- 1711,1718 ----
>    edge_iterator ei;
>    rtx orig_end = BB_END (EXIT_BLOCK_PTR->prev_bb);
>
> +   rtl_profile_for_bb (EXIT_BLOCK_PTR);
> +
>    /* Make sure the locus is set to the end of the function, so that
>       epilogue line numbers and warnings are set properly.  */
>    if (cfun->function_end_locus != UNKNOWN_LOCATION)
> *************** tree_expand_cfg (void)
> *** 1843,1848 ****
> --- 1846,1853 ----
>    /* Some backends want to know that we are expanding to RTL.  */
>    currently_expanding_to_rtl = 1;
>
> +   rtl_profile_for_bb (ENTRY_BLOCK_PTR);
> +
>    insn_locators_alloc ();
>    if (!DECL_BUILT_IN (current_function_decl))
>      set_curr_insn_source_location (DECL_SOURCE_LOCATION (current_function_decl));
> *************** tree_expand_cfg (void)
> *** 1906,1911 ****
> --- 1911,1919 ----
>    lab_rtx_for_bb = pointer_map_create ();
>    FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
>      bb = expand_gimple_basic_block (bb);
> +
> +   /* Expansion is used by optimization passes too, set maybe_hot_insn_p
> +      conservatively to true until they are all profile aware.  */
>    pointer_map_destroy (lab_rtx_for_bb);
>    free_histograms ();
>
> *************** tree_expand_cfg (void)
> *** 1974,1979 ****
> --- 1982,1988 ----
>    /* Tag the blocks with a depth number so that change_scope can find
>       the common parent easily.  */
>    set_block_levels (DECL_INITIAL (cfun->decl), 0);
> +   default_rtl_profile ();
>    return 0;
>  }
>
> Index: basic-block.h
> ===================================================================
> *** basic-block.h       (revision 136426)
> --- basic-block.h       (working copy)
> *************** extern bool maybe_hot_bb_p (const_basic_
> *** 830,835 ****
> --- 830,841 ----
>  extern bool maybe_hot_edge_p (edge);
>  extern bool probably_cold_bb_p (const_basic_block);
>  extern bool probably_never_executed_bb_p (const_basic_block);
> + extern bool optimize_bb_for_size_p (basic_block);
> + extern bool optimize_bb_for_speed_p (basic_block);
> + extern bool optimize_edge_for_size_p (edge);
> + extern bool optimize_edge_for_speed_p (edge);
> + extern bool optimize_insn_for_size_p (void);
> + extern bool optimize_insn_for_speed_p (void);
>  extern bool tree_predicted_by_p (const_basic_block, enum br_predictor);
>  extern bool rtl_predicted_by_p (const_basic_block, enum br_predictor);
>  extern void tree_predict_edge (edge, enum br_predictor, int);
> *************** bb_has_abnormal_pred (basic_block bb)
> *** 987,992 ****
>
>  /* In cfgloopmanip.c.  */
>  extern edge mfb_kj_edge;
> ! bool mfb_keep_just (edge);
>
>  #endif /* GCC_BASIC_BLOCK_H */
> --- 993,1003 ----
>
>  /* In cfgloopmanip.c.  */
>  extern edge mfb_kj_edge;
> ! extern bool mfb_keep_just (edge);
> !
> ! /* In predict.c.  */
> ! extern void rtl_profile_for_bb (basic_block);
> ! extern void rtl_profile_for_edge (edge);
> ! extern void default_rtl_profile (void);
>
>  #endif /* GCC_BASIC_BLOCK_H */
> Index: config/i386/i386.c
> ===================================================================
> *** config/i386/i386.c  (revision 136426)
> --- config/i386/i386.c  (working copy)
> *************** standard_80387_constant_p (rtx x)
> *** 5746,5752 ****
>    /* For XFmode constants, try to find a special 80387 instruction when
>       optimizing for size or on those CPUs that benefit from them.  */
>    if (mode == XFmode
> !       && (optimize_size || TARGET_EXT_80387_CONSTANTS))
>      {
>        int i;
>
> --- 5746,5752 ----
>    /* For XFmode constants, try to find a special 80387 instruction when
>       optimizing for size or on those CPUs that benefit from them.  */
>    if (mode == XFmode
> !       && (optimize_insn_for_size_p () || TARGET_EXT_80387_CONSTANTS))
>      {
>        int i;
>
> *************** decide_alg (HOST_WIDE_INT count, HOST_WI
> *** 15447,15458 ****
>                           || (alg != rep_prefix_1_byte         \
>                               && alg != rep_prefix_4_byte      \
>                               && alg != rep_prefix_8_byte))
>
>    *dynamic_check = -1;
>    if (memset)
> !     algs = &ix86_cost->memset[TARGET_64BIT != 0];
>    else
> !     algs = &ix86_cost->memcpy[TARGET_64BIT != 0];
>    if (stringop_alg != no_stringop && ALG_USABLE_P (stringop_alg))
>      return stringop_alg;
>    /* rep; movq or rep; movl is the smallest variant.  */
> --- 15447,15461 ----
>                           || (alg != rep_prefix_1_byte         \
>                               && alg != rep_prefix_4_byte      \
>                               && alg != rep_prefix_8_byte))
> +   const struct processor_costs *cost;
> +
> +   cost = optimize_insn_for_size_p () ? &size_cost : ix86_cost;
>
>    *dynamic_check = -1;
>    if (memset)
> !     algs = &cost->memset[TARGET_64BIT != 0];
>    else
> !     algs = &cost->memcpy[TARGET_64BIT != 0];
>    if (stringop_alg != no_stringop && ALG_USABLE_P (stringop_alg))
>      return stringop_alg;
>    /* rep; movq or rep; movl is the smallest variant.  */
>


