This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Remove array_index inliner hint
JiangNing OS <jiangning@os.amperecomputing.com> writes:
> Hi Honza,
>
> It seems this commit caused gcc bootstrap failure on aarch64 as below,
>
> Comparing stages 2 and 3
> warning: gcc/cc1obj-checksum.o differs
> Bootstrap comparison failure!
> gcc/dwarf2out.o differs
> gcc/var-tracking.o differs
> gcc/aarch64.o differs
> make[2]: *** [compare] Error 1
>
> Do you know the reason? I'm not sure if you have an aarch64 machine to try. Can it be fixed?
I've applied Honza's fix from PR91176 after bootstrapping & regtesting
on aarch64-linux-gnu.
Thanks Honza for the quick fix!
Richard
>
> Thanks,
> -Jiangning
>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
>> On Behalf Of Jan Hubicka
>> Sent: Sunday, July 14, 2019 8:03 PM
>> To: gcc-patches@gcc.gnu.org
>> Subject: Remove array_index inliner hint
>>
>> Hi,
>> array_index hint marks functions contains an array reference that is indexed
>> by value that will become constant after inlining.
>> This hint is later used by ipa-inline in rather agressive way updating inline-
>> insns-auto to inline-insns-single for such functions. While tuning -finline-
>> functions for -O2 I have noticed that this leads to 30% code size growth of
>> SPEC versions of GCC. This is because all predicates like register_operand
>> contains such array references based on mode parameter and this makes
>> GCC to conclude about agressively inlining them all which bloats
>> insn-* a lot.
>>
>> This hint was quite early experiment with propagating additional stuff
>> through inliner. I think it is better to actualy account the instructions which
>> will be generated for array indexing rather then handling this specially.
>>
>> This patch makes us to accound 1 instruction for every non-constant array
>> access. This is still probably making inliner overly optimistic about inlining
>> benefits since these accesses will later be CSEed, but it kills bit of magic and
>> makes things more robust.
>>
>> This will make inliner notice that function will simplify and give it higher
>> priority for inlining possibly still bypassing the bounds if big speedup is
>> achieved. This is however a lot more rare than before.
>>
>> Bootstrapped/regtested x86_64-linux, comitted.
>>
>> Honza
>>
>> * ipa-fnsummary.c (ipa_dump_hints): Do not dump array_index.
>> (ipa_fn_summary::~ipa_fn_summary): Do not destroy array_index.
>> (ipa_fn_summary_t::duplicate): Do not duplicate array_index.
>> (array_index_predicate): Remove.
>> (analyze_function_body): Account cost for variable ofsetted array
>> indexing.
>> (estimate_node_size_and_time): Do not compute array index hint.
>> (ipa_merge_fn_summary_after_inlining): Do not merge array index
>> hint.
>> (inline_read_section): Do not read array index hint.
>> (ipa_fn_summary_write): Do not write array index hint.
>> * doc/invoke.texi (ipa-cp-array-index-hint-bonus): Remove.
>> * ipa-cp.c (hint_time_bonus): Remove.
>> * ipa-fnsummary.h (ipa_hints_vals): Remove array_index.
>> (ipa_fnsummary): Remove array_index.
>> * ipa-inline.c (want_inline_small_function_p): Do not use
>> array_index.
>> (edge_badness): Likewise.
>> * params.def (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS):
>> Remove.
>>
>> Index: doc/invoke.texi
>> ================================================================
>> ===
>> --- doc/invoke.texi (revision 273450)
>> +++ doc/invoke.texi (working copy)
>> @@ -11895,12 +11895,6 @@ of iterations of a loop known, it adds a
>> @option{ipa-cp-loop-hint-bonus} to the profitability score of the candidate.
>>
>> -@item ipa-cp-array-index-hint-bonus
>> -When IPA-CP determines that a cloning candidate would make the index of -
>> an array access known, it adds a bonus of -@option{ipa-cp-array-index-hint-
>> bonus} to the profitability -score of the candidate.
>> -
>> @item ipa-max-aa-steps
>> During its analysis of function bodies, IPA-CP employs alias analysis in order
>> to track values pointed to by function parameters. In order
>> Index: ipa-cp.c
>> ================================================================
>> ===
>> --- ipa-cp.c (revision 273450)
>> +++ ipa-cp.c (working copy)
>> @@ -2607,8 +2607,6 @@ hint_time_bonus (ipa_hints hints)
>> int result = 0;
>> if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride))
>> result += PARAM_VALUE (PARAM_IPA_CP_LOOP_HINT_BONUS);
>> - if (hints & INLINE_HINT_array_index)
>> - result += PARAM_VALUE (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS);
>> return result;
>> }
>>
>> Index: ipa-fnsummary.c
>> ================================================================
>> ===
>> --- ipa-fnsummary.c (revision 273450)
>> +++ ipa-fnsummary.c (working copy)
>> @@ -134,11 +134,6 @@ ipa_dump_hints (FILE *f, ipa_hints hints
>> hints &= ~INLINE_HINT_declared_inline;
>> fprintf (f, " declared_inline");
>> }
>> - if (hints & INLINE_HINT_array_index)
>> - {
>> - hints &= ~INLINE_HINT_array_index;
>> - fprintf (f, " array_index");
>> - }
>> if (hints & INLINE_HINT_known_hot)
>> {
>> hints &= ~INLINE_HINT_known_hot;
>> @@ -549,8 +544,6 @@ ipa_fn_summary::~ipa_fn_summary ()
>> edge_predicate_pool.remove (loop_iterations);
>> if (loop_stride)
>> edge_predicate_pool.remove (loop_stride);
>> - if (array_index)
>> - edge_predicate_pool.remove (array_index);
>> vec_free (conds);
>> vec_free (size_time_table);
>> }
>> @@ -703,8 +696,6 @@ ipa_fn_summary_t::duplicate (cgraph_node
>> possible_truths);
>> remap_hint_predicate_after_duplication (&info->loop_stride,
>> possible_truths);
>> - remap_hint_predicate_after_duplication (&info->array_index,
>> - possible_truths);
>>
>> /* If inliner or someone after inliner will ever start producing
>> non-trivial clones, we will get trouble with lack of information @@ -
>> 727,12 +718,6 @@ ipa_fn_summary_t::duplicate (cgraph_node
>> info->loop_stride = NULL;
>> set_hint_predicate (&info->loop_stride, p);
>> }
>> - if (info->array_index)
>> - {
>> - predicate p = *info->array_index;
>> - info->array_index = NULL;
>> - set_hint_predicate (&info->array_index, p);
>> - }
>> }
>> if (!dst->global.inlined_to)
>> ipa_update_overall_fn_summary (dst); @@ -894,11 +879,6 @@
>> ipa_dump_fn_summary (FILE *f, struct cgr
>> fprintf (f, " loop stride:");
>> s->loop_stride->dump (f, s->conds);
>> }
>> - if (s->array_index)
>> - {
>> - fprintf (f, " array index:");
>> - s->array_index->dump (f, s->conds);
>> - }
>> fprintf (f, " calls:\n");
>> dump_ipa_call_summary (f, 4, node, s);
>> fprintf (f, "\n");
>> @@ -1824,27 +1804,6 @@ predicate_for_phi_result (class ipa_fn_s
>> nonconstant_names[SSA_NAME_VERSION (gimple_phi_result (phi))] = *p; }
>>
>> -/* Return predicate specifying when array index in access OP becomes non-
>> constant. */
>> -
>> -static predicate
>> -array_index_predicate (ipa_fn_summary *info,
>> - vec< predicate> nonconstant_names, tree op)
>> -{
>> - predicate p = false;
>> - while (handled_component_p (op))
>> - {
>> - if (TREE_CODE (op) == ARRAY_REF || TREE_CODE (op) ==
>> ARRAY_RANGE_REF)
>> - {
>> - if (TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME)
>> - p = p.or_with (info->conds,
>> - nonconstant_names[SSA_NAME_VERSION
>> - (TREE_OPERAND (op, 1))]);
>> - }
>> - op = TREE_OPERAND (op, 0);
>> - }
>> - return p;
>> -}
>> -
>> /* For a typical usage of __builtin_expect (a<b, 1), we
>> may introduce an extra relation stmt:
>> With the builtin, we have
>> @@ -2001,7 +1960,6 @@ analyze_function_body (struct cgraph_nod
>> vec<predicate> nonconstant_names = vNULL;
>> int nblocks, n;
>> int *order;
>> - predicate array_index = true;
>> gimple *fix_builtin_expect_stmt;
>>
>> gcc_assert (my_function && my_function->cfg); @@ -2146,26 +2104,6 @@
>> analyze_function_body (struct cgraph_nod
>> this_time);
>> }
>>
>> - if (gimple_assign_load_p (stmt) && nonconstant_names.exists ())
>> - {
>> - predicate this_array_index;
>> - this_array_index =
>> - array_index_predicate (info, nonconstant_names,
>> - gimple_assign_rhs1 (stmt));
>> - if (this_array_index != false)
>> - array_index &= this_array_index;
>> - }
>> - if (gimple_store_p (stmt) && nonconstant_names.exists ())
>> - {
>> - predicate this_array_index;
>> - this_array_index =
>> - array_index_predicate (info, nonconstant_names,
>> - gimple_get_lhs (stmt));
>> - if (this_array_index != false)
>> - array_index &= this_array_index;
>> - }
>> -
>> -
>> if (is_gimple_call (stmt)
>> && !gimple_call_internal_p (stmt))
>> {
>> @@ -2273,14 +2211,40 @@ analyze_function_body (struct cgraph_nod
>> if (dump_file)
>> fprintf (dump_file, " fp_expression set\n");
>> }
>> + }
>>
>> - gcc_assert (time >= 0);
>> - gcc_assert (size >= 0);
>> + /* Account cost of address calculations in the statements. */
>> + for (unsigned int i = 0; i < gimple_num_ops (stmt); i++)
>> + {
>> + for (tree op = gimple_op (stmt, i);
>> + op && handled_component_p (op);
>> + op = TREE_OPERAND (op, 0))
>> + if ((TREE_CODE (op) == ARRAY_REF
>> + || TREE_CODE (op) == ARRAY_RANGE_REF)
>> + && TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME)
>> + {
>> + predicate p = bb_predicate;
>> + if (fbi.info)
>> + p = p & will_be_nonconstant_expr_predicate
>> + (&fbi, info, TREE_OPERAND (op, 1),
>> + nonconstant_names);
>> + if (p != false)
>> + {
>> + time += freq;
>> + size += 1;
>> + if (dump_file)
>> + fprintf (dump_file,
>> + "\t\tAccounting address calculation.\n");
>> + info->account_size_time
>> (ipa_fn_summary::size_scale,
>> + freq,
>> + bb_predicate,
>> + p);
>> + }
>> + }
>> }
>> +
>> }
>> }
>> - set_hint_predicate (&ipa_fn_summaries->get_create (node)->array_index,
>> - array_index);
>> free (order);
>>
>> if (nonconstant_names.exists () && !early) @@ -2783,9 +2747,6 @@
>> estimate_node_size_and_time (struct cgra
>> if (info->loop_stride
>> && !info->loop_stride->evaluate (possible_truths))
>> hints |= INLINE_HINT_loop_stride;
>> - if (info->array_index
>> - && !info->array_index->evaluate (possible_truths))
>> - hints |= INLINE_HINT_array_index;
>> if (info->scc_no)
>> hints |= INLINE_HINT_in_scc;
>> if (DECL_DECLARED_INLINE_P (node->decl)) @@ -3106,9 +3067,6 @@
>> ipa_merge_fn_summary_after_inlining (str
>> remap_hint_predicate (info, callee_info,
>> &callee_info->loop_stride,
>> operand_map, offset_map, clause,
>> &toplev_predicate);
>> - remap_hint_predicate (info, callee_info,
>> - &callee_info->array_index,
>> - operand_map, offset_map, clause,
>> &toplev_predicate);
>>
>> ipa_call_summary *s = ipa_call_summaries->get (edge);
>> inline_update_callee_summaries (edge->callee, s->loop_depth); @@ -
>> 3366,9 +3324,6 @@ inline_read_section (struct lto_file_dec
>> p.stream_in (&ib);
>> if (info)
>> set_hint_predicate (&info->loop_stride, p);
>> - p.stream_in (&ib);
>> - if (info)
>> - set_hint_predicate (&info->array_index, p);
>> for (e = node->callees; e; e = e->next_callee)
>> read_ipa_call_summary (&ib, e, info != NULL);
>> for (e = node->indirect_calls; e; e = e->next_callee) @@ -3517,10 +3472,6
>> @@ ipa_fn_summary_write (void)
>> info->loop_stride->stream_out (ob);
>> else
>> streamer_write_uhwi (ob, 0);
>> - if (info->array_index)
>> - info->array_index->stream_out (ob);
>> - else
>> - streamer_write_uhwi (ob, 0);
>> for (edge = cnode->callees; edge; edge = edge->next_callee)
>> write_ipa_call_summary (ob, edge);
>> for (edge = cnode->indirect_calls; edge; edge = edge->next_callee)
>> Index: ipa-fnsummary.h
>> ================================================================
>> ===
>> --- ipa-fnsummary.h (revision 273450)
>> +++ ipa-fnsummary.h (working copy)
>> @@ -48,11 +48,8 @@ enum ipa_hints_vals {
>> if functions are in different modules, inlining may not be so important.
>> Set by simple_edge_hints in ipa-inline-analysis.c. */
>> INLINE_HINT_cross_module = 64,
>> - /* If array indexes of loads/stores become known there may be room for
>> - further optimization. */
>> - INLINE_HINT_array_index = 128,
>> /* We know that the callee is hot by profile. */
>> - INLINE_HINT_known_hot = 256
>> + INLINE_HINT_known_hot = 128
>> };
>>
>> typedef int ipa_hints;
>> @@ -97,7 +94,7 @@ public:
>> fp_expressions (false), estimated_stack_size (false),
>> stack_frame_offset (false), time (0), size (0), conds (NULL),
>> size_time_table (NULL), loop_iterations (NULL), loop_stride (NULL),
>> - array_index (NULL), growth (0), scc_no (0)
>> + growth (0), scc_no (0)
>> {
>> }
>>
>> @@ -111,7 +108,7 @@ public:
>> stack_frame_offset (s.stack_frame_offset), time (s.time), size (s.size),
>> conds (s.conds), size_time_table (s.size_time_table),
>> loop_iterations (s.loop_iterations), loop_stride (s.loop_stride),
>> - array_index (s.array_index), growth (s.growth), scc_no (s.scc_no)
>> + growth (s.growth), scc_no (s.scc_no)
>> {}
>>
>> /* Default constructor. */
>> @@ -157,8 +154,6 @@ public:
>> /* Predicate on when some loop in the function becomes to have known
>> stride. */
>> predicate * GTY((skip)) loop_stride;
>> - /* Predicate on when some array indexes become constants. */
>> - predicate * GTY((skip)) array_index;
>> /* Estimated growth for inlining all copies of the function before start
>> of small functions inlining.
>> This value will get out of date as the callers are duplicated, but
>> Index: ipa-inline.c
>> ================================================================
>> ===
>> --- ipa-inline.c (revision 273450)
>> +++ ipa-inline.c (working copy)
>> @@ -807,7 +807,6 @@ want_inline_small_function_p (struct cgr
>> || (!(hints & (INLINE_HINT_indirect_call
>> | INLINE_HINT_known_hot
>> | INLINE_HINT_loop_iterations
>> - | INLINE_HINT_array_index
>> | INLINE_HINT_loop_stride))
>> && !(big_speedup = big_speedup_p (e)))))
>> {
>> @@ -833,7 +832,6 @@ want_inline_small_function_p (struct cgr
>> && !(hints & INLINE_HINT_known_hot)
>> && growth >= ((hints & (INLINE_HINT_indirect_call
>> | INLINE_HINT_loop_iterations
>> - | INLINE_HINT_array_index
>> | INLINE_HINT_loop_stride))
>> ? MAX (MAX_INLINE_INSNS_AUTO,
>> MAX_INLINE_INSNS_SINGLE)
>> @@ -1227,7 +1225,6 @@ edge_badness (struct cgraph_edge *edge,
>> badness = badness.shift (badness > 0 ? 4 : -4);
>> if ((hints & (INLINE_HINT_indirect_call
>> | INLINE_HINT_loop_iterations
>> - | INLINE_HINT_array_index
>> | INLINE_HINT_loop_stride))
>> || callee_info->growth <= 0)
>> badness = badness.shift (badness > 0 ? -2 : 2);
>> Index: params.def
>> ================================================================
>> ===
>> --- params.def (revision 273450)
>> +++ params.def (working copy)
>> @@ -1109,12 +1109,6 @@ DEFPARAM (PARAM_IPA_CP_LOOP_HINT_BONUS,
>> "bounds or strides known.",
>> 64, 0, 0)
>>
>> -DEFPARAM (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS,
>> - "ipa-cp-array-index-hint-bonus",
>> - "Compile-time bonus IPA-CP assigns to candidates which make an
>> array "
>> - "index known.",
>> - 48, 0, 0)
>> -
>> DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
>> "ipa-max-aa-steps",
>> "Maximum number of statements that will be visited by IPA formal "