This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: CFG merge part 7 - superblock/trace scheduling
- From: Jan Hubicka <jh at suse dot cz>
- To: Jan Hubicka <jh at suse dot cz>
- Cc: gcc-patches at gcc dot gnu dot org, rth at redhat dot com, vmarakov at redhat dot com, edelson at gnu dot org
- Date: Wed, 19 Feb 2003 13:53:32 +0100
- Subject: Re: CFG merge part 7 - superblock/trace scheduling
- References: <20030207115639.GK18788@kam.mff.cuni.cz>
> Hi,
> This patch adds an option to enable superblock scheduling (aka EBB scheduler) for
> non-IA-64 targets as well as "trace scheduling" (aka EBB scheduler preceded by
> trace formation).
> My experience with this feature is a bit mixed - by default it causes a 2%
> regression on Athlon. By significantly improving the MD description I was able to
This regression seems to go away after changing the superblock choice
algorithm I sent in merge patch 11.
> get about 0.3%/0.8% speedup on Specint/Specfp, which is about 1/3 of the
> overall effect of scheduling on this target. I hope I will have time to
> clean up and send this part soon as well. The improvements are noticeable
> in the local scheduler as well, but since the basic blocks are small they are
> much less visible.
> --- 3495,3516 ----
>
>   split_all_insns (1);
>
> ! if (flag_sched2_use_superblocks || flag_sched2_use_traces)
> !   {
> !     if (!flag_sched2_use_traces)
> !       reorder_basic_blocks ();
> !     else
> !       tracer ();
> !     cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_UPDATE_LIFE);
> !     schedule_ebbs (rtl_dump_file);
> !     /* No liveness updating code yet, but it should be easy to do */
> !     count_or_remove_death_notes (NULL, 1);
> !     life_analysis (get_insns (), rtl_dump_file, PROP_DEATH_NOTES);
> !     cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_UPDATE_LIFE
> !                  | (flag_crossjumping ? CLEANUP_CROSSJUMP : 0));
> !   }
> ! else
> !   schedule_insns (rtl_dump_file);
We've chatted briefly about this. Should I change it to move the
reorder_basic_blocks pass in front of the sched2 pass and redo the
reordering only in case we did either some cleanup_cfg or reg-stack
conversion?
Since trace scheduling can leave blocks empty, it is not that
uncommon for cleanup_cfg to match after it. I am not quite sure that this
is desirable at all. Perhaps we can avoid any cleanup_cfg after
scheduling and do it only for i386 reg-stack, where we do have a pass
ordering problem?
It results in bigger binaries and doesn't help on Athlon, but on in-order
architectures this may be different.
Honza