[PATCH 3/4] New IPA-SRA implementation
Martin Jambor
mjambor@suse.cz
Tue Sep 24 17:43:00 GMT 2019
Hi,
sorry for replying so late, I still haven't recovered from two weeks of
traveling and conferences.
On Sat, Sep 21 2019, Richard Sandiford wrote:
> Hi,
>
> Thanks for doing this.
>
> Martin Jambor <mjambor@suse.cz> writes:
>> +/* Analyze function body scan results stored in param_accesses and
>> + param_accesses, detect possible transformations and store information of
>> + those in function summary. NODE, FUN and IFS are all various structures
>> + describing the currently analyzed function. */
>> +
>> +static void
>> +process_scan_results (cgraph_node *node, struct function *fun,
>> + isra_func_summary *ifs,
>> + vec<gensum_param_desc> *param_descriptions)
>> +{
>> + bool check_pass_throughs = false;
>> + bool dereferences_propagated = false;
>> + tree parm = DECL_ARGUMENTS (node->decl);
>> + unsigned param_count = param_descriptions->length();
>> +
>> + for (unsigned desc_index = 0;
>> + desc_index < param_count;
>> + desc_index++, parm = DECL_CHAIN (parm))
>> + {
>> + gensum_param_desc *desc = &(*param_descriptions)[desc_index];
>> + if (!desc->locally_unused && !desc->split_candidate)
>> + continue;
>
> I'm jumping in the middle without working through the whole pass,
> so this is probably a daft question sorry, but: what is this loop
> required to do when:
>
> !desc->split_candidate && desc->locally_unused
You have figured out correctly that this is a thinko. I meant not to
continue for non-register-types which might not be used locally but
their locally_unused flag is only set a few lines below...
>
> ? AFAICT...
>
>> +
>> + if (flag_checking)
>> + isra_verify_access_tree (desc->accesses);
>> +
>> + if (!dereferences_propagated
>> + && desc->by_ref
>> + && desc->accesses)
>> + {
>> + propagate_dereference_distances (fun);
>> + dereferences_propagated = true;
>> + }
>> +
>> + HOST_WIDE_INT nonarg_acc_size = 0;
>> + bool only_calls = true;
>> + bool check_failed = false;
>> +
>> + int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index;
>> + for (gensum_param_access *acc = desc->accesses;
>> + acc;
>> + acc = acc->next_sibling)
>> + if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls,
>> + entry_bb_index))
>> + {
>> + check_failed = true;
>> + break;
>> + }
>> + if (check_failed)
>> + continue;
>> +
>> + if (only_calls)
>> + desc->locally_unused = true;
...specifically here.
>> +
>> + HOST_WIDE_INT cur_param_size
>> + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm)));
>> + HOST_WIDE_INT param_size_limit;
>> + if (!desc->by_ref || optimize_function_for_size_p (fun))
>> + param_size_limit = cur_param_size;
>> + else
>> + param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR)
>> + * cur_param_size);
>> + if (nonarg_acc_size > param_size_limit
>> + || (!desc->by_ref && nonarg_acc_size == param_size_limit))
>> + {
>> + disqualify_split_candidate (desc, "Would result into a too big set of"
>> + "replacements.");
>> + }
>> + else
>> + {
>> + /* create_parameter_descriptors makes sure unit sizes of all
>> + candidate parameters fit unsigned integers restricted to
>> + ISRA_ARG_SIZE_LIMIT. */
>> + desc->param_size_limit = param_size_limit / BITS_PER_UNIT;
>> + desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT;
>> + if (desc->split_candidate && desc->ptr_pt_count)
>> + {
>> + gcc_assert (desc->by_ref);
>> + check_pass_throughs = true;
>> + }
>> + }
>> + }
>
> ...disqualify_split_candidate should be a no-op in that case,
> because we've already disqualified the parameter for a different reason.
> So it looks like the main effect is instead to set up param_size_limit
> and nonarg_acc_size, the latter of which I assume is 0 when
> desc->locally_unused.
This is the only bit where you are wrong, param_size_limit is the type
size for aggregates and twice pointer size for pointers (well, actually
PARAM_IPA_SRA_PTR_GROWTH_FACTOR times the size of a pointer). Even for
locally unused parameters because we might "pull" some of them from
callees, when there are some. But it is not really relevant for the
problem you are facing.
>
> The reason for asking is that the final "else" says that we've already
> checked that param_size_limit is in range, but that's only true if
> desc->split_candidate. In particular:
>
> if (is_gimple_reg (parm)
> && !isra_track_scalar_param_local_uses (fun, node, parm, num,
> &scalar_call_uses))
> {
> if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, " is a scalar with only %i call uses\n",
> scalar_call_uses);
>
> desc->locally_unused = true;
> desc->call_uses = scalar_call_uses;
> }
>
> happens before:
>
> if (type_size == 0
> || type_size >= ISRA_ARG_SIZE_LIMIT)
> {
> if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, " not a candidate, has zero or huge size\n");
> continue;
> }
>
> An uninteresting case in which we truncate the value is:
>
> typedef unsigned int big_vec __attribute__((vector_size (0x20000)));
> void foo (big_vec x) {}
>
> and going further triggers an assert in the tree_to_uhwi (...):
>
> typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61)));
> void foo (omg_vec x) {}
>
> but as you can guess, SVE is the real reason I'm asking. ;-)
>
> FWIW, I tried changing it to:
>
> gensum_param_desc *desc = &(*param_descriptions)[desc_index];
> if (!desc->split_candidate)
> continue;
>
> and didn't get any test regressions (on aarch64-linux-gnu).
It is the correct thing to do, sorry for the breakage. I have to run
now but will prepare a patch tomorrow.
Thanks,
Martin
More information about the Gcc-patches
mailing list