[PATCH 3/4] New IPA-SRA implementation

Richard Sandiford richard.sandiford@arm.com
Sat Sep 21 14:26:00 GMT 2019


Hi,

Thanks for doing this.

Martin Jambor <mjambor@suse.cz> writes:
> +/* Analyze function body scan results stored in param_accesses and
> +   param_accesses, detect possible transformations and store information of
> +   those in function summary.  NODE, FUN and IFS are all various structures
> +   describing the currently analyzed function.  */
> +
> +static void
> +process_scan_results (cgraph_node *node, struct function *fun,
> +		      isra_func_summary *ifs,
> +		      vec<gensum_param_desc> *param_descriptions)
> +{
> +  bool check_pass_throughs = false;
> +  bool dereferences_propagated = false;
> +  tree parm = DECL_ARGUMENTS (node->decl);
> +  unsigned param_count = param_descriptions->length();
> +
> +  for (unsigned desc_index = 0;
> +       desc_index < param_count;
> +       desc_index++, parm = DECL_CHAIN (parm))
> +    {
> +      gensum_param_desc *desc = &(*param_descriptions)[desc_index];
> +      if (!desc->locally_unused && !desc->split_candidate)
> +	continue;

I'm jumping in the middle without working through the whole pass,
so this is probably a daft question sorry, but: what is this loop
required to do when:

  !desc->split_candidate && desc->locally_unused

?  AFAICT...

> +
> +      if (flag_checking)
> +	isra_verify_access_tree (desc->accesses);
> +
> +      if (!dereferences_propagated
> +	  && desc->by_ref
> +	  && desc->accesses)
> +	{
> +	  propagate_dereference_distances (fun);
> +	  dereferences_propagated = true;
> +	}
> +
> +      HOST_WIDE_INT nonarg_acc_size = 0;
> +      bool only_calls = true;
> +      bool check_failed = false;
> +
> +      int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index;
> +      for (gensum_param_access *acc = desc->accesses;
> +	   acc;
> +	   acc = acc->next_sibling)
> +	if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls,
> +				 entry_bb_index))
> +	  {
> +	    check_failed = true;
> +	    break;
> +	  }
> +      if (check_failed)
> +	continue;
> +
> +      if (only_calls)
> +	desc->locally_unused = true;
> +
> +      HOST_WIDE_INT cur_param_size
> +	= tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm)));
> +      HOST_WIDE_INT param_size_limit;
> +      if (!desc->by_ref || optimize_function_for_size_p (fun))
> +	param_size_limit = cur_param_size;
> +      else
> +	param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR)
> +			   * cur_param_size);
> +      if (nonarg_acc_size > param_size_limit
> +	  || (!desc->by_ref && nonarg_acc_size == param_size_limit))
> +	{
> +	  disqualify_split_candidate (desc, "Would result into a too big set of"
> +				      "replacements.");
> +	}
> +      else
> +	{
> +	  /* create_parameter_descriptors makes sure unit sizes of all
> +	     candidate parameters fit unsigned integers restricted to
> +	     ISRA_ARG_SIZE_LIMIT.  */
> +	  desc->param_size_limit = param_size_limit / BITS_PER_UNIT;
> +	  desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT;
> +	  if (desc->split_candidate && desc->ptr_pt_count)
> +	    {
> +	      gcc_assert (desc->by_ref);
> +	      check_pass_throughs = true;
> +	    }
> +	}
> +    }

...disqualify_split_candidate should be a no-op in that case,
because we've already disqualified the parameter for a different reason.
So it looks like the main effect is instead to set up param_size_limit
and nonarg_acc_size, the latter of which I assume is 0 when
desc->locally_unused.

The reason for asking is that the final "else" says that we've already
checked that param_size_limit is in range, but that's only true if
desc->split_candidate.  In particular:

      if (is_gimple_reg (parm)
	  && !isra_track_scalar_param_local_uses (fun, node, parm, num,
						  &scalar_call_uses))
	{
	  if (dump_file && (dump_flags & TDF_DETAILS))
	    fprintf (dump_file, " is a scalar with only %i call uses\n",
		     scalar_call_uses);

	  desc->locally_unused = true;
	  desc->call_uses = scalar_call_uses;
	}

happens before:

      if (type_size == 0
	  || type_size >= ISRA_ARG_SIZE_LIMIT)
	{
	  if (dump_file && (dump_flags & TDF_DETAILS))
	    fprintf (dump_file, " not a candidate, has zero or huge size\n");
	  continue;
	}

An uninteresting case in which we truncate the value is:

    typedef unsigned int big_vec __attribute__((vector_size (0x20000)));
    void foo (big_vec x) {}

and going further triggers an assert in the tree_to_uhwi (...):

    typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61)));
    void foo (omg_vec x) {}

but as you can guess, SVE is the real reason I'm asking. ;-)

FWIW, I tried changing it to:

      gensum_param_desc *desc = &(*param_descriptions)[desc_index];
      if (!desc->split_candidate)
	continue;

and didn't get any test regressions (on aarch64-linux-gnu).

Thanks,
Richard



More information about the Gcc-patches mailing list