[PATCH]: Replace backend dataflow

Ian Lance Taylor ian@airs.com
Tue Jan 3 20:20:00 GMT 2006


OK, I finally read through this patch, though I only skimmed most of
df-problems.c.  I also read Ken's followup which incorporates Zdenek's
changes.

I recommend that you look at my comments, and adjust the code.  Then
repost the patch.  Then if nobody has any comments for, say, three
days, please commit it.

Thanks!

Ian

Daniel Berlin <dberlin@dberlin.org> writes:

> +/* Allocation for dataflow support routines.
> +   Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005

2006 now, of course.

> +DF_SET_BLOCKS is an optional call used to define a region of the
> +program on which the analysis will be performed.  The normal case is
> +to analyze the entire program and no call to df_set_blocks is made.

Should this say "the entire function" rather than "the entire
program?"  Or does it really scan the entire program somehow?

> +Different optimizations have different needs.  Ultimately, only
> +register allocation and schedulers should be using the bitmaps and the
> +rest of the backend should be upgraded to using and maintaining the
> +linked information.  

What bitmaps does this refer to?  Is this paragraph essentially a
FIXME comment, or is it trying to say something about the dataflow
routines are used?


> +While incremental bitmaps are not worthwhile to maintian, incremental
> +chains may be perfectly reasonable.  The fastest way to build chains
> +from scratch or after significant modifications is to build reaching
> +definitions (RD) and build the chains from this.

Typo: "maintian" for "maintain" in the first line.

> +These are linked into a variety of lists; namely reg-def, reg-use,
> +insn-def, insn-use, def-use, and use-def lists.  For example,
> +the reg-def lists contain all the refs that define a given register
> +while the insn-use lists contain all the refs used by an insn.

Do the reg-def lists contain "refs" or "defs?"

> +/****************************************************************************/
> +/* Functions to create, destroy and manipulate an instance of df.           */
> +/****************************************************************************/

The usual convention in gcc is not to use this kind of block comment,
but to instead simply group things in pages (as you have done).  Not a
big deal, though.

> +  struct df * df = xcalloc (1, sizeof (struct df));

There should not be a space after the '*'.  This needs to be fixed in
many places.  I won't call the others do, but you should probably do
something like M-%<space>*<space><return><space>*<return>.

> +/* Set the blocks that are to be considered for analysis.  If this is
> +   not called or is called with null, the entire program in
> +   analyzed.  */

Again, "entire program" or "entire function?"

> +void
> +df_finish (struct df *df)
> +{
> +  int i;
> +

> +  for (i=0; i<df->num_problems_defined; i++)

Needs spaces around the operators.

> +    (*df->problems_in_order[i]->problem->free_fun) (df->problems_in_order[i]); 
> +
> +  free (df);
> +}

Why don't you have to free the bitmaps?  And what about the def and
ref information?


> +/*
> +  This function will perform iterative bitvector dataflow described by
> +  DATAFLOW, producing the in and out sets.  Only the part of the cfg
> +  induced by blocks in DATAFLOW->order is taken into account.
> +
> +  SINGLE_PASS is true if you just want to make one pass over the blocks. 
> + */

The /* and */ should be on the lines with the rest of the comment, as
usual.

> +  (*dataflow->problem->init_fun)(dataflow, blocks_to_init);

Please add a space between ')' and '('.

> +    (*dflow->problem->alloc_fun)(dflow, blocks_to_scan);

Likewise.  I won't mention the other occurrences where a space is
needed.

> +  for (i=0; i<n_blocks; i++)

Space around operators.

> +  /* Do not do DF_SCAN */

Please put a period at the end of the sentence.  Actually I don't
understand this comment anyhow.

> +  for (i=1; i<df->num_problems_defined; i++)

Please add spaces around the operators.

> +  return (struct df_scan_bb_info *)dflow->block_info[index];

Please add a space after the ')'.

> +   This will not work properly for RD and RU since these problems
> +   depend on the bit vectors being ordered in a particular manner and
> +   that will be too expensive to keep up to data during simple
> +   modifications.

Is it worth checking for this case?

> +  for (i=0; i<n_blocks; i++) 

Please add spaces around the operators.  This happens more times--I
won't mention it further.

> +    {
> +      basic_block bb = BASIC_BLOCK (blocks_in_postorder[i]);
> +      edge e;
> +      edge_iterator ei;
> +      FOR_EACH_EDGE (e, ei, bb->preds)
> +	bitmap_set_bit (blocks_in_fringe, e->src->index);
> +      FOR_EACH_EDGE (e, ei, bb->succs)
> +	bitmap_set_bit (blocks_in_fringe, e->dest->index);
> +      bitmap_set_bit (blocks, bb->index);
> +      bitmap_set_bit (blocks_in_fringe, bb->index);
> +    }

Please add a blank line after the declaration of ei.

> +  /* Do not do DF_SCAN */

Please add a period after the comment.

> +/* Convienence call to df_analyze_simple_changes_some_blocks.  */

Typo: "Convenience."

> +/* Called from the rtl_compact_blocks to reorganize the problems basic
> +   block info.  */

What does this mean?  As far as I can see df_compact_blocks is not
called from anywhere?

> +void 
> +df_compact_blocks (struct df *df)
> +{
> +  int i, p;
> +  basic_block bb;
> +  void ** problem_temps;
> +  int size = last_basic_block * sizeof (void *);
> +  problem_temps = xmalloc (size);
> +
> +  for (p = 0; p < df->num_problems_defined; p++)
> +    {
> +      struct dataflow * dflow = df->problems_in_order[p];
> +      if (*dflow->problem->free_bb_fun)
> +	{
> +	  df_grow_bb_info (dflow);
> +	  memcpy (problem_temps, dflow->block_info, size);
> +
> +	  /* Copy the bb info from the problem tmps to the proper
> +	     place in the block_info vector.  Null out the copied
> +	     item.  */
> +	  i = NUM_FIXED_BLOCKS;
> +	  FOR_EACH_BB (bb) 
> +	    {
> +	      df_set_bb_info (dflow, i, problem_temps[bb->index]);
> +	      problem_temps[bb->index] = NULL;
> +	      i++;
> +	    }
> +	  memset (dflow->block_info + i, 0, 
> +		  (last_basic_block - i) * sizeof (void *));
> +
> +	  /* Free any block infos that were not copied (and NULLed).
> +	     These are from orphaned blocks.  */
> +	  for (i = NUM_FIXED_BLOCKS; i<last_basic_block; i++)
> +	    {
> +	      if (problem_temps[i])
> +		(*dflow->problem->free_bb_fun)(dflow, problem_temps[i]);
> +	    }
> +	}
> +    }
> +
> +  free (problem_temps);
> +
> +  i = NUM_FIXED_BLOCKS;
> +  FOR_EACH_BB (bb) 
> +    {
> +      BASIC_BLOCK (i) = bb;
> +      bb->index = i;
> +      i++;
> +    }
> +
> +  gcc_assert (i == n_basic_blocks);
> +
> +  for (; i<last_basic_block; i++)
> +    BASIC_BLOCK (i) = NULL;
> +}

Should we just call the existing compact_blocks routine here?


> +/* Return true if REG is referenced in INSN, zero otherwise.  */ 
> +
> +struct df_ref *
> +df_find_use (struct df *df, rtx insn, rtx reg)

This function does not return 'true', so the comment is wrong.  Please
fix it.

> +void
> +df_dump (struct df *df, FILE *file)
> +{
> +  int i;
> +  if (! df || ! file)
> +    return;

Please add a blank line after 'int i;'.

> +
> +  fprintf (file, "\n\n%s\n", current_function_name());

Please add a space before the second '('.

> +      ref=ref->next_ref;

Please add a space around the operator.

> +      ref=ref->next_reg;

Likewise.

> +void
> +df_insn_debug (struct df *df, rtx insn, bool follow_chain, FILE *file)
> +{
> +  unsigned int uid;
> +  int bbi;
> +
> +  uid = INSN_UID (insn);
> +
> +  if (DF_INSN_UID_DEFS(df, uid))
> +    bbi = DF_REF_BBNO (DF_INSN_UID_DEFS(df, uid));
> +  else if (DF_INSN_UID_USES(df, uid))
> +    bbi = DF_REF_BBNO (DF_INSN_UID_USES(df, uid));
> +  else
> +    bbi = -1;
> +
> +  fprintf (file, "insn %d bb %d luid %d defs ",
> +	   uid, bbi, DF_INSN_LUID (df, insn));
> +
> +  df_refs_chain_dump (df, DF_INSN_UID_DEFS(df, uid), follow_chain, file);
> +  fprintf (file, " defs ");
> +  df_refs_chain_dump (df, DF_INSN_UID_USES(df, uid), follow_chain, file);
> +  fprintf (file, "\n");
> +}

Please add spaces after uses of DF_INSN_UID_DEFS and DF_INSN_UID_USES.
This happens more later, and should be fixed in all cases.


> Index: df-scan.c
> ===================================================================
> --- df-scan.c	(revision 0)
> +++ df-scan.c	(revision 0)
> @@ -0,0 +1,1952 @@
> +/* ToDO: We need to go back and add the warning messages about code
> +   moved across setjmp.  */

Please use FIXME or ??? or XXX (I can't remember if we fixed on one
style) and don't put it at the top of the file--put it after the
initial large comment.

> +#include "timevar.h"
> +#include "df.h"
> +#ifndef HAVE_epilogue
> +#define HAVE_epilogue 0
> +#endif

Please put a blank line between the #include and #ifndef.

> +int df_state;            /* Indicates where we are in the compilation.  */

We usually prefer comments on separate lines.  I don't see any reason
to put it on the same line here.

> +/* The bitmap_obstack is used to hold some static variables that
> +should not be reset after each function is compiled.  */

Please add three spaces at the start of the second comment line.  This
appears to occur several more times.

> +static HARD_REG_SET elim_reg_set;

This variable should have a comment.

> +/*  The number of def allowed for any pseudo before sparce techniques
> +are used. */

This comment doesn't seem to have a variable.

> +/****************************************************************************/
> +/* SCANNING DATAFLOW PROBLEM                                                */
> +/*                                                                          */
> +/* There are several ways in which scanning looks just like the other       */
> +/* dataflow problems.  It shares the all the mechanisms for local info      */
> +/* as well as basic block info.  Where it differs is when and how often     */
> +/* it gets run.  It also has no need for the iterative solver.              */
> +/*                                                                          */
> +/****************************************************************************/

Please don't use this style of block comment as it is a real pain to
change it while keeping the formatting.  Please make it a single
comment as we usually do.

> +/*   if (last_basic_block > 200) */
> +/*     block_size = (last_basic_block + 20)/2; */
> +/*   else */
> +/*     block_size = 100; */

Seems like we should remove these lines, as well as the similar cases
below in this function.  Or we should have a single comment.

> +static void 
> +df_grow_reg_info (struct dataflow * dflow, struct df_ref_info * ref_info)
> +{
> +  unsigned int max_reg = max_reg_num ();
> +  unsigned int new_size = max_reg;
> +  struct df_scan_problem_data * problem_data =
> +    (struct df_scan_problem_data *)dflow->problem_data;

Please add a space after the ')'.

> +  unsigned int i;
> +
> +  if (ref_info->regs_size < new_size)
> +    {
> +      new_size += new_size/4;

Please add spaces around the '/'.

> +      ref_info->regs = xrealloc (ref_info->regs, 
> +				 new_size * sizeof (struct df_reg_info*));
> +      ref_info->regs_size = new_size;
> +    }
> +
> +  for (i=ref_info->regs_inited; i<max_reg; i++)

Please add spaces around the operators.

> +/* Grow the ref information.  If the current size is less than the
> +   number of instructions, grow to 25% more than the number of
> +   instructions.  */
> +static void 
> +df_grow_insn_info (struct df * df)
> +{
> +  unsigned int new_size = get_max_uid () + 1;
> +  if (df->insns_size < new_size)
> +    {
> +      new_size += new_size/4;

Please add spaces around the '/'.  There are similar cases below which
I won't mention.

> +  return use;
> +}
> +/* Return a DEF for register REGNO.  */
> +static rtx 
> +df_reg_def_gen (bool set, unsigned int regno)
> +{

Please add a blank line before the comment.

> +/* Create new references of type DF_REF_TYPE for each part of register REG
> +   at address LOC within INSN of BB.  */
> +static void
> +df_ref_record (struct dataflow *dflow, rtx reg, rtx *loc, 
> +	       basic_block bb, rtx insn, 
> +	       enum df_ref_type ref_type, 
> +	       enum df_ref_flags ref_flags, 
> +	       bool record_live)
> +{
> +  unsigned int regno;
> +  struct df * df = dflow->df;
> +
> +  gcc_assert (REG_P (reg) || GET_CODE (reg) == SUBREG);
> +
> +  /* For the reg allocator we are interested in some SUBREG rtx's, but not
> +     all.  Notably only those representing a word extraction from a multi-word
> +     reg.  As written in the docu those should have the form
> +     (subreg:SI (reg:M A) N), with size(SImode) > size(Mmode).
> +     XXX Is that true?  We could also use the global word_mode variable.  */
> +  if ((df->flags & DF_SUBREGS) == 0
> +      && GET_CODE (reg) == SUBREG
> +      && (GET_MODE_SIZE (GET_MODE (reg)) < GET_MODE_SIZE (word_mode)
> +	  || GET_MODE_SIZE (GET_MODE (reg))
> +	       >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (reg)))))
> +    {
> +      loc = &SUBREG_REG (reg);
> +      reg = *loc;
> +      ref_flags |= DF_REF_STRIPPED;
> +    }

I gather that what you are looking for here is cases where a SUBREG is
pulling out one (or more) registers from a multi-register value.  It's
a little more complicated than this, because not all registers have
the same size.  If you have a hard register here, you should be
looking at hard_regno_nregs.  If you have a pseudo-register, you
should use REGMODE_NATURAL_SIZE.

But I see that this is from the existing df.c code.  Hmmm.  Maybe we
can fix this later.

> +		   && ((ref_flags & DF_REF_ARTIFICIAL) == 0) )

Please remove the extra space before the final ')'.

> +/* A set to a non-paradoxical SUBREG for which the number of word_mode units
> +   covered by the outer mode is smaller than that covered by the inner mode,
> +   is a read-modify-write operation.
> +   This function returns true iff the SUBREG X is such a SUBREG.  */
> +bool
> +df_read_modify_subreg_p (rtx x)
> +{
> +  unsigned int isize, osize;
> +  if (GET_CODE (x) != SUBREG)
> +    return false;
> +  isize = GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)));
> +  osize = GET_MODE_SIZE (GET_MODE (x));
> +  return (isize > osize && isize > UNITS_PER_WORD);
> +}

Here again we should probably consider hard_regno_nregs and
REGMODE_NATURAL_SIZE.

> +/* Process all the registers defined in the pattern rtx, X.  */

Please add a note to this comment that autoincrement/decrement
definitions will be picked up by df_uses_record.

> +    case ASM_OPERANDS:
> +    case UNSPEC_VOLATILE:
> +    case TRAP_IF:
> +    case ASM_INPUT:
> +      {
> +	/* Traditional and volatile asm instructions must be considered to use
> +	   and clobber all hard registers, all pseudo-registers and all of
> +	   memory.  So must TRAP_IF and UNSPEC_VOLATILE operations.
> +
> +	   Consider for instance a volatile asm that changes the fpu rounding
> +	   mode.  An insn should not be moved across this even if it only uses
> +	   pseudo-regs because it might give an incorrectly rounded result.
> +
> +	   For now, just mark any regs we can find in ASM_OPERANDS as
> +	   used.  */
> +
> +	/* For all ASM_OPERANDS, we must traverse the vector of input operands.
> +	   We can not just fall through here since then we would be confused
> +	   by the ASM_INPUT rtx inside ASM_OPERANDS, which do not indicate
> +	   traditional asms unlike their normal usage.  */
> +	if (code == ASM_OPERANDS)
> +	  {
> +	    int j;
> +
> +	    for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
> +	      df_uses_record (dflow, &ASM_OPERANDS_INPUT (x, j),
> +			      DF_REF_REG_USE, bb, insn, 0);
> +	    return;
> +	  }
> +	break;

Please add a note to this comment explaining where and how ASMs are
handled.

> +/* Process INSN recursively, and return true if it contains an asm
> +   instruction.  */
> +static bool
> +df_insn_contains_asm (rtx *loc, rtx insn)

Please rewrite this function to use for_each_rtx.

> +	  /* The stack ptr is used (honorarily) by a CALL insn.  */
> +	  x = df_reg_use_gen (STACK_POINTER_REGNUM);
> +	  df_uses_record (dflow, &XEXP (x, 0), DF_REF_REG_USE, bb, insn, 
> +			  0);

The call to df_reg_use_gen is useless, and just wastes a little bit of
memory until the next gc.  Just use
regno_reg_rtx[STACK_POINTER_REGNUM].

> +		if (global_regs[i])
> +		  {
> +		    x = df_reg_use_gen (i);
> +		    df_uses_record (dflow, &XEXP (x, 0),
> +				    DF_REF_REG_USE, bb, insn, 
> +				    0);

This df_reg_use_gen is also useless.

> +	      EXECUTE_IF_SET_IN_BITMAP (df_invalidated_by_call, 0, ui, bi)
> +		{
> +		  x = df_reg_def_gen (false, ui);
> +		  df_def_record_1 (dflow, x, bb, insn, DF_REF_CLOBBER, false);
> +		}

This df_reg_def_gen call is not 100% useless, but it could be cleaned
up with a minor change to df_def_record_1.

> +      if (CALL_P (insn))
> +	{
> +	  rtx note;
> +
> +	  /* We do not record hard registers clobbered by the call,
> +	     since there are awfully many of them and "defs" created
> +	     through them are not interesting (since no use can be legally
> +	     reached by them).  So we must just make sure we include them when
> +	     computing kill bitmaps.  */
> +
> +	  /* There may be extra registers to be clobbered.  */
> +	  for (note = CALL_INSN_FUNCTION_USAGE (insn);
> +	       note;
> +	       note = XEXP (note, 1))
> +	    if (GET_CODE (XEXP (note, 0)) == CLOBBER)
> +	      df_defs_record (dflow, XEXP (note, 0), bb, insn);
> +	}
> +    }

I don't follow the large comment here.  The code above already called
df_defs_record_1 for everything in df_invalidated_by_call.  What is it
that you are not recording?

> +      if (e->flags & (EDGE_EH))

Please remove the extra parentheses.

> +#ifdef EH_USES
> +  if ((df->flags & DF_HARD_REGS)
> +      && has_eh_preds (bb))
> +    {
> +      unsigned int i;
> +      for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +	if (EH_USES (i))
> +	  {
> +	    rtx use = df_reg_use_gen (i);
> +	    df_ref_create_structure (dflow, regno_reg_rtx[i], 
> +				     &XEXP (use, 0), NULL, EXIT_BLOCK_PTR,
> +				     DF_REF_REG_USE, 
> +				     DF_REF_ARTIFICIAL | DF_REF_AT_TOP);
> +	  }
> +    }
> +#endif

I don't think this is right (though maybe I said otherwise on IRC).
The only target which actually defines EH_USES is ia64.  I think that
if EH_USES is set, it means that the register must be valid when an
exception is taken.  Therefore what matters for EH_USES is not
has_eh_preds, but has_eh_succs.

> +		  bitmap_clear_range(bb_info->gen, begin, n_uses);

Please add a space before the '('.

> +struct df_rd_problem_data
> +{
> +  bitmap *def_sites;            /* Bitmap of defs for each pseudo.  */
> +  unsigned int def_sites_size;  /* Size of def_sites.  */
> +  bitmap sparse_invalidated_by_call;    /* The set of defs to regs invalidated by call.  */
> +  bitmap dense_invalidated_by_call;   /* The set of defs to regs invalidate by call for ru.  */
> +};

Please make sure lines are not more than 80 characters long, probably
by moving some of the comments to separate lines.

> +#define DF_RD    2      /* Reachind Defs. */

Typo.




More information about the Gcc-patches mailing list