[PATCH 3/4] New IPA-SRA implementation

Martin Jambor mjambor@suse.cz
Tue Sep 24 17:43:00 GMT 2019


Hi,

sorry for replying so late, I still haven't recovered from two weeks of
traveling and conferences.

On Sat, Sep 21 2019, Richard Sandiford wrote:
> Hi,
>
> Thanks for doing this.
>
> Martin Jambor <mjambor@suse.cz> writes:
>> +/* Analyze function body scan results stored in param_accesses and
>> +   param_accesses, detect possible transformations and store information of
>> +   those in function summary.  NODE, FUN and IFS are all various structures
>> +   describing the currently analyzed function.  */
>> +
>> +static void
>> +process_scan_results (cgraph_node *node, struct function *fun,
>> +		      isra_func_summary *ifs,
>> +		      vec<gensum_param_desc> *param_descriptions)
>> +{
>> +  bool check_pass_throughs = false;
>> +  bool dereferences_propagated = false;
>> +  tree parm = DECL_ARGUMENTS (node->decl);
>> +  unsigned param_count = param_descriptions->length();
>> +
>> +  for (unsigned desc_index = 0;
>> +       desc_index < param_count;
>> +       desc_index++, parm = DECL_CHAIN (parm))
>> +    {
>> +      gensum_param_desc *desc = &(*param_descriptions)[desc_index];
>> +      if (!desc->locally_unused && !desc->split_candidate)
>> +	continue;
>
> I'm jumping in the middle without working through the whole pass,
> so this is probably a daft question sorry, but: what is this loop
> required to do when:
>
>   !desc->split_candidate && desc->locally_unused

You have figured out correctly that this is a thinko.  I meant not to
continue for non-register-types which might not be used locally but
their locally_unused flag is only set a few lines below...

>
> ?  AFAICT...
>
>> +
>> +      if (flag_checking)
>> +	isra_verify_access_tree (desc->accesses);
>> +
>> +      if (!dereferences_propagated
>> +	  && desc->by_ref
>> +	  && desc->accesses)
>> +	{
>> +	  propagate_dereference_distances (fun);
>> +	  dereferences_propagated = true;
>> +	}
>> +
>> +      HOST_WIDE_INT nonarg_acc_size = 0;
>> +      bool only_calls = true;
>> +      bool check_failed = false;
>> +
>> +      int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index;
>> +      for (gensum_param_access *acc = desc->accesses;
>> +	   acc;
>> +	   acc = acc->next_sibling)
>> +	if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls,
>> +				 entry_bb_index))
>> +	  {
>> +	    check_failed = true;
>> +	    break;
>> +	  }
>> +      if (check_failed)
>> +	continue;
>> +
>> +      if (only_calls)
>> +	desc->locally_unused = true;

...specifically here.

>> +
>> +      HOST_WIDE_INT cur_param_size
>> +	= tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm)));
>> +      HOST_WIDE_INT param_size_limit;
>> +      if (!desc->by_ref || optimize_function_for_size_p (fun))
>> +	param_size_limit = cur_param_size;
>> +      else
>> +	param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR)
>> +			   * cur_param_size);
>> +      if (nonarg_acc_size > param_size_limit
>> +	  || (!desc->by_ref && nonarg_acc_size == param_size_limit))
>> +	{
>> +	  disqualify_split_candidate (desc, "Would result into a too big set of"
>> +				      "replacements.");
>> +	}
>> +      else
>> +	{
>> +	  /* create_parameter_descriptors makes sure unit sizes of all
>> +	     candidate parameters fit unsigned integers restricted to
>> +	     ISRA_ARG_SIZE_LIMIT.  */
>> +	  desc->param_size_limit = param_size_limit / BITS_PER_UNIT;
>> +	  desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT;
>> +	  if (desc->split_candidate && desc->ptr_pt_count)
>> +	    {
>> +	      gcc_assert (desc->by_ref);
>> +	      check_pass_throughs = true;
>> +	    }
>> +	}
>> +    }
>
> ...disqualify_split_candidate should be a no-op in that case,
> because we've already disqualified the parameter for a different reason.
> So it looks like the main effect is instead to set up param_size_limit
> and nonarg_acc_size, the latter of which I assume is 0 when
> desc->locally_unused.

This is the only bit where you are wrong, param_size_limit is the type
size for aggregates and twice pointer size for pointers (well, actually
PARAM_IPA_SRA_PTR_GROWTH_FACTOR times the size of a pointer).  Even for
locally unused parameters because we might "pull" some of them from
callees, when there are some.  But it is not really relevant for the
problem you are facing.

>
> The reason for asking is that the final "else" says that we've already
> checked that param_size_limit is in range, but that's only true if
> desc->split_candidate.  In particular:
>
>       if (is_gimple_reg (parm)
> 	  && !isra_track_scalar_param_local_uses (fun, node, parm, num,
> 						  &scalar_call_uses))
> 	{
> 	  if (dump_file && (dump_flags & TDF_DETAILS))
> 	    fprintf (dump_file, " is a scalar with only %i call uses\n",
> 		     scalar_call_uses);
>
> 	  desc->locally_unused = true;
> 	  desc->call_uses = scalar_call_uses;
> 	}
>
> happens before:
>
>       if (type_size == 0
> 	  || type_size >= ISRA_ARG_SIZE_LIMIT)
> 	{
> 	  if (dump_file && (dump_flags & TDF_DETAILS))
> 	    fprintf (dump_file, " not a candidate, has zero or huge size\n");
> 	  continue;
> 	}
>
> An uninteresting case in which we truncate the value is:
>
>     typedef unsigned int big_vec __attribute__((vector_size (0x20000)));
>     void foo (big_vec x) {}
>
> and going further triggers an assert in the tree_to_uhwi (...):
>
>     typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61)));
>     void foo (omg_vec x) {}
>
> but as you can guess, SVE is the real reason I'm asking. ;-)
>
> FWIW, I tried changing it to:
>
>       gensum_param_desc *desc = &(*param_descriptions)[desc_index];
>       if (!desc->split_candidate)
> 	continue;
>
> and didn't get any test regressions (on aarch64-linux-gnu).


It is the correct thing to do, sorry for the breakage.  I have to run
now but will prepare a patch tomorrow.

Thanks,

Martin



More information about the Gcc-patches mailing list