Detect EAF flags in ipa-modref

Richard Biener rguenther@suse.de
Wed Nov 11 10:09:12 GMT 2020


On Tue, 10 Nov 2020, Jan Hubicka wrote:

> > > +  tree callee = gimple_call_fndecl (stmt);
> > > +  if (callee)
> > > +    {
> > > +      cgraph_node *node = cgraph_node::get (callee);
> > > +      modref_summary *summary = node ? get_modref_function_summary (node)
> > > +				: NULL;
> > > +
> > > +      if (summary && summary->arg_flags.length () > arg)
> > 
> > So could we make modref "transform" push this as fnspec attribute or
> > would that not really be an optimization?
> 
> It was my original plan to synthetize fnspecs, but I think it is not
> very good idea: we have the summary readily available and we can
> represent information that fnspecs can't
> (do not have artificial limits on number of parameters or counts)
> 
> I would preffer fnspecs to be used only for in-compiler declarations.

Fine, I was just curious...

> > > +
> > > +/* Analyze EAF flags for SSA name NAME.
> > > +   KNOWN_FLAGS is a cache for flags we already determined.
> > > +   DEPTH is a recursion depth used to make debug output prettier.  */
> > > +
> > > +static int
> > > +analyze_ssa_name_flags (tree name, vec<int> *known_flags, int depth)
> > 
> > C++ has references which makes the access to known_flags nicer ;)
> 
> Yay, will chang that :)
> > 
> > > +{
> > > +  imm_use_iterator ui;
> > > +  gimple *use_stmt;
> > > +  int flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED;
> > > +
> > > +  /* See if value is already computed.  */
> > > +  if ((*known_flags)[SSA_NAME_VERSION (name)])
> > > +    {
> > > +      /* Punt on cycles for now, so we do not need dataflow.  */
> > > +      if ((*known_flags)[SSA_NAME_VERSION (name)] == 1)
> > > +	{
> > > +	  if (dump_file)
> > > +	    fprintf (dump_file,
> > > +		     "%*sGiving up on a cycle in SSA graph\n", depth * 4, "");
> > > +	  return 0;
> > > +	}
> > > +      return (*known_flags)[SSA_NAME_VERSION (name)] - 2;
> > > +    }
> > > +  /* Recursion guard.  */
> > > +  (*known_flags)[SSA_NAME_VERSION (name)] = 1;
> > 
> > This also guards against multiple evaluations of the same stmts
> > but only in some cases?  Consider
> > 
> >   _1 = ..;
> >   _2 = _1 + _3;
> >   _4 = _1 + _5;
> >   _6 = _2 + _4;
> > 
> > where we visit _2 = and _4 = from _1 but from both are going
> > to visit _6.
> 
> Here we first push _6, then we go for _2 then for _1 evaluate _1,
> evalueate _2, go for _4 and evaluate _4, and evaluate _6.
> It is DFS and you need backward edge in DFS (comming from a PHI).

Hmm, but then we eventually evaluate _6 twice?

> 
> Cycles seems to somewhat matter for GCC: we do have a lot of functions
> that walk linked lists that we could track otherwise.
> > 
> > Maybe I'm blind but you're not limiting depth?  Guess that asks
> > for problems, esp. as you are recursing rather than using a
> > worklist or so?
> > 
> > I see you try to "optimize" the walk by only visiting def->use
> > links from parameters but then a RPO walk over all stmts would
> > be simpler iteration-wise ...
> We usually evaluate just small part of bigger functions (since we lose
> track quite easily after hitting first memory store).  My plan was to
> change this to actual dataflow once we have it well defined 
> (this means after discussing EAF flags with you and adding the logic to
> track callsites for true IPA pass that midly complicated things - for
> every ssa name I track callsite/arg pair where it is passed to
> either directly or indirectly.  Then this is translaed into call summary
> and used by IPA pass to compute final flags)
> 
> I guess I can add --param ipa-modref-walk-depth for now and handle
> dataflow incremntally?

Works for me.

> In particular I am not sure if I should just write iterated RPO myself
> or use tree-ssa-propagate.h (the second may be overkill).

tree-ssa-propagate.h is not to be used, it should DIE ;)

I guess you do want to iterate SSA cycles rather than BB cycles
so I suggest to re-surrect the SSA SCC discovery from the SCC
value-numbering (see tree-ssa-sccvn.c:DFS () on the gcc-8-branch)
which is non-recursive and micro-optimized.  Could put it
somewhere useful (tree-ssa.c?).

> > 
> > > +  if (dump_file)
> > > +    {
> > > +      fprintf (dump_file,
> > > +	       "%*sAnalyzing flags of ssa name: ", depth * 4, "");
> > > +      print_generic_expr (dump_file, name);
> > > +      fprintf (dump_file, "\n");
> > > +    }
> > > +
> > > +  FOR_EACH_IMM_USE_STMT (use_stmt, ui, name)
> > > +    {
> > > +      if (flags == 0)
> > > +	{
> > > +	  BREAK_FROM_IMM_USE_STMT (ui);
> > > +	}
> > > +      if (is_gimple_debug (use_stmt))
> > > +	continue;
> > > +      if (dump_file)
> > > +	{
> > > +	  fprintf (dump_file, "%*s  Analyzing stmt:", depth * 4, "");
> > > +	  print_gimple_stmt (dump_file, use_stmt, 0);
> > > +	}
> > > +
> > > +      /* Gimple return may load the return value.  */
> > > +      if (gimple_code (use_stmt) == GIMPLE_RETURN)
> > 
> >  if (greturn *ret = dyn_cast <greturn *> (use_stmt))
> > 
> > makes the as_a below not needed, similar for the other cases (it
> > also makes accesses cheaper, avoiding some checking code).
> 
> Looks cleaner indeed.
> > 
> > > +	{
> > > +	  if (memory_access_to (gimple_return_retval
> > > +				   (as_a <greturn *> (use_stmt)), name))
> > > +	    flags &= ~EAF_UNUSED;
> > > +	}
> > > +      /* Account for LHS store, arg loads and flags from callee function.  */
> > > +      else if (is_gimple_call (use_stmt))
> > > +	{
> > > +	  tree callee = gimple_call_fndecl (use_stmt);
> > > +
> > > +	  /* Recursion would require bit of propagation; give up for now.  */
> > > +	  if (callee && recursive_call_p (current_function_decl, callee))
> > > +	    flags = 0;
> > > +	  else
> > > +	    {
> > > +	      int ecf_flags = gimple_call_flags (use_stmt);
> > > +	      bool ignore_stores = ignore_stores_p (current_function_decl,
> > > +						    ecf_flags);
> > > +
> > > +	      /* Handle *name = func (...).  */
> > > +	      if (gimple_call_lhs (use_stmt)
> > > +		  && memory_access_to (gimple_call_lhs (use_stmt), name))
> > > +		flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > +
> > > +	      /* Handle all function parameters.  */
> > > +	      for (unsigned i = 0; i < gimple_call_num_args (use_stmt); i++)
> > > +		/* Name is directly passed to the callee.  */
> > > +		if (gimple_call_arg (use_stmt, i) == name)
> > > +		  {
> > > +		    if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
> > > +		      flags &= ignore_stores
> > > +			       ? 0
> > > +			       : call_lhs_flags (use_stmt, i,
> > > +						 known_flags, depth);
> > > +		    else
> > > +		      {
> > > +			int call_flags = gimple_call_arg_flags (as_a <gcall *>
> > > +								 (use_stmt), i);
> > > +			if (ignore_stores)
> > > +			  call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE;
> > > +			else
> > > +			  call_flags &= call_lhs_flags (use_stmt, i,
> > > +							known_flags, depth);
> > > +
> > > +			flags &= call_flags;
> > > +		      }
> > > +		  }
> > > +		/* Name is dereferenced and passed to a callee.  */
> > > +		else if (memory_access_to (gimple_call_arg (use_stmt, i), name))
> > > +		  {
> > > +		    if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
> > > +		      flags &= ~EAF_UNUSED;
> > > +		    else
> > > +		      flags &= deref_flags (gimple_call_arg_flags
> > > +						(as_a <gcall *> (use_stmt), i),
> > > +					    ignore_stores);
> > > +		    if (!ignore_stores)
> > > +		      flags &= call_lhs_flags (use_stmt, i, known_flags,
> > > +					       depth);
> > > +		  }
> > 
> > Hmm, you forget gimple_call_chain at least which is at least
> > argument passing?  Possibly the chain argument is never a function
> > parameter but how do you know... (partial inlining?)
> 
> Ah, right. I forgot about chain.
> I wil ladd it.
> > 
> > Then there's gimple_call_fn for indirect function calls to a
> > function parameter?
> 
> Well, it is never load or store, so I do not really care about it.
> for
> 
> void test (int (*foo)())
> {
>   foo();
> }
> 
> I should be able to derive EAF_UNUSED.
> 
> void test (int (**foo)())
> {
>   (*foo)();
> }
> 
> I will see sparate load.
> > 
> > I guess it would be nice to amend the gimple_walk_ stuff to have
> > a SSA name callback where the tree and the SSA use is passed.
> > Well, for another time.
> > 
> > > +	    }
> > > +	  /* Only NOCLOBBER or DIRECT flags alone are not useful (see comments
> > > +	     in tree-ssa-alias.c).  Give up earlier.  */
> > > +	  if ((flags & ~(EAF_DIRECT | EAF_NOCLOBBER)) == 0)
> > > +	    flags = 0;
> > > +	}
> > > +      else if (gimple_assign_load_p (use_stmt))
> > > +	{
> > > +	  /* Memory to memory copy.  */
> > > +	  if (gimple_store_p (use_stmt))
> > > +	    {
> > > +	      /* Handle *name = *exp.  */
> > > +	      if (memory_access_to (gimple_assign_lhs (use_stmt), name))
> > > +		flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > +
> > > +	      /* Handle *lhs = *name.
> > > +
> > > +		 We do not track memory locations, so assume that value
> > > +		 is used arbitrarily.  */
> > > +	      if (memory_access_to (gimple_assign_rhs1 (use_stmt), name))
> > > +		flags = 0;
> > > +	    }
> > > +	  /* Handle lhs = *name.  */
> > > +	  else if (memory_access_to (gimple_assign_rhs1 (use_stmt), name))
> > > +	    flags &= deref_flags (analyze_ssa_name_flags
> > > +				      (gimple_assign_lhs (use_stmt),
> > > +				       known_flags, depth + 1), false);
> > > +	}
> > > +      else if (gimple_store_p (use_stmt))
> > > +	{
> > > +	  /* Handle *lhs = name.  */
> > > +	  if (is_gimple_assign (use_stmt)
> > > +	      && gimple_assign_rhs1 (use_stmt) == name)
> > > +	    {
> > > +	      if (dump_file)
> > > +		fprintf (dump_file, "%*s  ssa name saved to memory\n",
> > > +			 depth * 4, "");
> > > +	      flags = 0;
> > > +	    }
> > > +	  /* Handle *name = exp.  */
> > > +	  else if (is_gimple_assign (use_stmt)
> > > +		   && memory_access_to (gimple_assign_lhs (use_stmt), name))
> > > +	    flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > +	  /* ASM statements etc.  */
> > > +	  else if (!is_gimple_assign (use_stmt))
> > > +	    {
> > > +	      if (dump_file)
> > > +		fprintf (dump_file, "%*s  Unhandled store\n",
> > > +			 depth * 4, "");
> > > +	      flags = 0;
> > > +	    }
> > > +	}
> > > +      else if (is_gimple_assign (use_stmt))
> > > +	{
> > > +	  enum tree_code code = gimple_assign_rhs_code (use_stmt);
> > > +
> > > +	  /* See if operation is a merge as considered by
> > > +	     tree-ssa-structalias.c:find_func_aliases.  */
> > > +	  if (!truth_value_p (code)
> > > +	      && code != POINTER_DIFF_EXPR
> > > +	      && (code != POINTER_PLUS_EXPR
> > > +		  || gimple_assign_rhs1 (use_stmt) == name))
> > > +	    flags &= analyze_ssa_name_flags
> > > +			       (gimple_assign_lhs (use_stmt), known_flags,
> > > +				depth + 1);
> > > +	}
> > > +      else if (gimple_code (use_stmt) == GIMPLE_PHI)
> > > +	{
> > > +	  flags &= analyze_ssa_name_flags
> > > +			     (gimple_phi_result (use_stmt), known_flags,
> > > +			      depth + 1);
> > > +	}
> > > +      /* Conditions are not considered escape points
> > > +	 by tree-ssa-structalias.  */
> > > +      else if (gimple_code (use_stmt) == GIMPLE_COND)
> > > +	;
> > > +      else
> > > +	{
> > > +	  if (dump_file)
> > > +	    fprintf (dump_file, "%*s  Unhandled stmt\n", depth * 4, "");
> > > +	  flags = 0;
> > > +	}
> > > +
> > > +      if (dump_file)
> > > +	{
> > > +	  fprintf (dump_file, "%*s  current flags of ", depth * 4, "");
> > > +	  print_generic_expr (dump_file, name);
> > > +	  dump_eaf_flags (dump_file, flags);
> > > +	}
> > > +    }
> > > +  if (dump_file)
> > > +    {
> > > +      fprintf (dump_file, "%*sflags of ssa name ", depth * 4, "");
> > > +      print_generic_expr (dump_file, name);
> > > +      dump_eaf_flags (dump_file, flags);
> > > +    }
> > > +  (*known_flags)[SSA_NAME_VERSION (name)] = flags + 2;
> > > +  return flags;
> > > +}
> > > +
> > > +/* Determine EAF flags for function parameters.  */
> > > +
> > > +static void
> > > +analyze_parms (modref_summary *summary)
> > > +{
> > > +  unsigned int parm_index = 0;
> > > +  unsigned int count = 0;
> > > +
> > > +  for (tree parm = DECL_ARGUMENTS (current_function_decl); parm;
> > > +       parm = TREE_CHAIN (parm))
> > > +    count++;
> > > +
> > > +  if (!count)
> > > +    return;
> > > +
> > > +  auto_vec<int> known_flags;
> > > +  known_flags.safe_grow_cleared (num_ssa_names);
> > 
> > , true for exact reserve.  Could be space optimized by not using
> > auto_vec<int> but auto_vec<usigned char>?
> 
> I think there is not that much memory wasted, but indeed chars will be
> more effective.
> > 
> > > +
> > > +  for (tree parm = DECL_ARGUMENTS (current_function_decl); parm; parm_index++,
> > > +       parm = TREE_CHAIN (parm))
> > > +    {
> > > +      tree name = ssa_default_def (cfun, parm);
> > > +      if (!name)
> > > +	continue;
> > 
> > looks like the vec might be quite sparse ...
> > 
> > > +      int flags = analyze_ssa_name_flags (name, &known_flags, 0);
> > > +
> > > +      if (flags)
> > > +	{
> > > +	  if (parm_index >= summary->arg_flags.length ())
> > > +	    summary->arg_flags.safe_grow_cleared (count);
> > 
> > you want , true for exact reserve.
> > 
> > > +	  summary->arg_flags[parm_index] = flags;
> > 
> > maybe this as well - does it make sense to skip !is_gimple_reg_type
> > params in the counting?  Guess it makes lookup too complicated.
> > But we do have testcases with >30000 parameters ...
> 
> Well, the index needs to be actual index all call argument.
> As said, i did not see any noticeable modref memory use increase at WPA
> (I watched) since it is at most 1 byte for parameter in case something
> got tracked.
> 
> Thanks for review!
> 
> Honza
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


More information about the Gcc mailing list