This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] move superblock formation to Tree-SSA


Hi Bob,

On Tuesday 26 December 2006 18:15, Robert Kidd wrote:
> This patch moves Superblock formation from RTL to the Tree-SSA
> representation.  This patch does not cause a significant change in
> performance in itself,

Can you quantify the performance changes with this patch, both of
the generated code and of compilation time?

What happened with the twolf regression you had before with this
patch?

It may be helpful to schedule the pass after one round of CCP and
DCE.  The amount of dead code just after rewriting into SSA form is
sometimes very high.


> but it is a first step to performing high 
> level structural optimizations.

Anything specific in mind, or in the works?


> -/* Return true if BB has been seen - it is connected to some trace
> -   already.  */
> +/* A bit BB->index is set if BB has already been seen, i.e. it is
> +   connected to some trace already.  */
> +bitmap bb_seen;

You really want an sbitmap here.  You'll be setting this for all basic
blocks IIUC, and bitmap is a potentially quadratic structure.  You can
use an initially oversized sbitmap instead, and when you have created
more new basic blocks than the number of extra bits you asked for, you
can sbitmap_resize the thing.

I'm assuming you always immediately mark newly created basic blocks as
visited.  If not, you have to watch that you don't query bits beyond
the size of the sbitmap.


> -      if (seen (bb2) || (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
> -	  || find_best_successor (bb2) != e)
> +      if (bb_seen_p (bb2) || (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
> + 	  || find_best_successor (bb2) != e)

I don't see a change between the two find_best_successor lines. 
Accidental whitespace change there???


> +#ifdef ENABLE_CHECKING
> +      gcc_assert (!bb_seen_p (bb));
> +#endif

No need for the ENABLE_CHECKING test.


>  /* Main entry point to this file.  FLAGS is the set of flags to pass
>     to cfg_layout_initialize().  */
>  
> -void
> -tracer (unsigned int flags)
> +static unsigned int
> +tracer (void)
>  {
>    if (n_basic_blocks <= NUM_FIXED_BLOCKS + 1)
> -    return;
> +    return 0;
>  
> -  cfg_layout_initialize (flags);
>    mark_dfs_back_edges ();

Need to update the comment before this function.  It has no FLAGS
argument so it does not pass it to cfg_layout_initialize(), in fact
it doesn't even have to call that function anymore...


> +  /* FIXME: We really only need to do this when we know tail duplication
> +            has altered the CFG. */
> +  free_dominance_info (CDI_DOMINATORS);

Alternatively, you could update dominance information on the fly.  I
have no idea whether this difficult with the CFG transformations that
you do.


> +/* The container for a group of structural transformation and
> +   optimization */

Nit: Comments in GCC should end with a period followed by two spaces.


> -  0,                                    /* todo_flags_start */
> -  TODO_dump_func,                       /* todo_flags_finish */
> -  'T'                                   /* letter */
> +  TODO_dump_func,                       /* todo_flags_start */
> +  TODO_dump_func
> +    | TODO_update_ssa
> +    | TODO_verify_ssa,                  /* todo_flags_finish */
> +  0                                     /* letter */
>  };

Not sure if we want to dump before and after the transformations.
Usually you can just look at the dump of the previous pass.  I don't
think we have many passes that dump before and after.  Following the
common practice is usually a good idea ;-)


Regarding Honza's comments about the placement in the pass queue: 
Although it's hard to tell without numbers, I don't think the
placement of this pass currently matters very much, because:

1. the heuristics to pick the code to duplicate is pretty arbitrary
2. by running tracer earlier we expose optimization opportunities
   earlier.

We should look at the numbers, but I wouldn't expect much of a
difference.

A much better approach to code duplication in the later stages of the
compiler, would be to let the scheduler decide.  ISTR this is part of the
work that the folks at RAS are doing.

If there is no significant performance impact of removing the RTL level
superblock formation, I don't see a good reason to retain it.

Gr.
Steven





Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]