[PATCH 3/4] New IPA-SRA implementation
Richard Sandiford
richard.sandiford@arm.com
Sat Sep 21 14:26:00 GMT 2019
Hi,
Thanks for doing this.
Martin Jambor <mjambor@suse.cz> writes:
> +/* Analyze function body scan results stored in param_accesses and
> + param_accesses, detect possible transformations and store information of
> + those in function summary. NODE, FUN and IFS are all various structures
> + describing the currently analyzed function. */
> +
> +static void
> +process_scan_results (cgraph_node *node, struct function *fun,
> + isra_func_summary *ifs,
> + vec<gensum_param_desc> *param_descriptions)
> +{
> + bool check_pass_throughs = false;
> + bool dereferences_propagated = false;
> + tree parm = DECL_ARGUMENTS (node->decl);
> + unsigned param_count = param_descriptions->length();
> +
> + for (unsigned desc_index = 0;
> + desc_index < param_count;
> + desc_index++, parm = DECL_CHAIN (parm))
> + {
> + gensum_param_desc *desc = &(*param_descriptions)[desc_index];
> + if (!desc->locally_unused && !desc->split_candidate)
> + continue;
I'm jumping in the middle without working through the whole pass,
so this is probably a daft question sorry, but: what is this loop
required to do when:
!desc->split_candidate && desc->locally_unused
? AFAICT...
> +
> + if (flag_checking)
> + isra_verify_access_tree (desc->accesses);
> +
> + if (!dereferences_propagated
> + && desc->by_ref
> + && desc->accesses)
> + {
> + propagate_dereference_distances (fun);
> + dereferences_propagated = true;
> + }
> +
> + HOST_WIDE_INT nonarg_acc_size = 0;
> + bool only_calls = true;
> + bool check_failed = false;
> +
> + int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index;
> + for (gensum_param_access *acc = desc->accesses;
> + acc;
> + acc = acc->next_sibling)
> + if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls,
> + entry_bb_index))
> + {
> + check_failed = true;
> + break;
> + }
> + if (check_failed)
> + continue;
> +
> + if (only_calls)
> + desc->locally_unused = true;
> +
> + HOST_WIDE_INT cur_param_size
> + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm)));
> + HOST_WIDE_INT param_size_limit;
> + if (!desc->by_ref || optimize_function_for_size_p (fun))
> + param_size_limit = cur_param_size;
> + else
> + param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR)
> + * cur_param_size);
> + if (nonarg_acc_size > param_size_limit
> + || (!desc->by_ref && nonarg_acc_size == param_size_limit))
> + {
> + disqualify_split_candidate (desc, "Would result into a too big set of"
> + "replacements.");
> + }
> + else
> + {
> + /* create_parameter_descriptors makes sure unit sizes of all
> + candidate parameters fit unsigned integers restricted to
> + ISRA_ARG_SIZE_LIMIT. */
> + desc->param_size_limit = param_size_limit / BITS_PER_UNIT;
> + desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT;
> + if (desc->split_candidate && desc->ptr_pt_count)
> + {
> + gcc_assert (desc->by_ref);
> + check_pass_throughs = true;
> + }
> + }
> + }
...disqualify_split_candidate should be a no-op in that case,
because we've already disqualified the parameter for a different reason.
So it looks like the main effect is instead to set up param_size_limit
and nonarg_acc_size, the latter of which I assume is 0 when
desc->locally_unused.
The reason for asking is that the final "else" says that we've already
checked that param_size_limit is in range, but that's only true if
desc->split_candidate. In particular:
if (is_gimple_reg (parm)
&& !isra_track_scalar_param_local_uses (fun, node, parm, num,
&scalar_call_uses))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " is a scalar with only %i call uses\n",
scalar_call_uses);
desc->locally_unused = true;
desc->call_uses = scalar_call_uses;
}
happens before:
if (type_size == 0
|| type_size >= ISRA_ARG_SIZE_LIMIT)
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " not a candidate, has zero or huge size\n");
continue;
}
An uninteresting case in which we truncate the value is:
typedef unsigned int big_vec __attribute__((vector_size (0x20000)));
void foo (big_vec x) {}
and going further triggers an assert in the tree_to_uhwi (...):
typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61)));
void foo (omg_vec x) {}
but as you can guess, SVE is the real reason I'm asking. ;-)
FWIW, I tried changing it to:
gensum_param_desc *desc = &(*param_descriptions)[desc_index];
if (!desc->split_candidate)
continue;
and didn't get any test regressions (on aarch64-linux-gnu).
Thanks,
Richard
More information about the Gcc-patches
mailing list