This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[sel-sched] Speed up the scheduler


Hello,

These are the four patches that speed up the selective scheduler. The first patch implements both code motion traversals, move_op for moving an insn up and find_used_regs for finding registers for renaming, through a single driver, code_motion_path_driver. The differences in behavior are implemented through hooks. This patch was written by Dmitry Melnik.
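To illustrate the idea of one traversal specialized by hooks, here is a minimal sketch. It is not the code from the patch: every type and function name below (toy_insn, toy_driver_hooks, toy_path_driver and so on) is hypothetical, and the "path" is reduced to a linear chain of insns, whereas the real driver walks the CFG.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified insn: a uid, a mask of the registers it
   uses, a "removed" flag, and a link to the next insn on the path.  */
struct toy_insn
{
  int uid;
  unsigned used_regs;
  int removed;
  struct toy_insn *next;
};

/* Hooks that specialize the shared traversal.  */
struct toy_driver_hooks
{
  void (*orig_found) (struct toy_insn *insn, void *data);
  void (*orig_not_found) (struct toy_insn *insn, void *data);
};

/* Shared driver: walk from START looking for TARGET_UID, calling the
   hooks on the way.  Return 1 if the target was found, 0 otherwise.  */
static int
toy_path_driver (struct toy_insn *start, int target_uid,
		 const struct toy_driver_hooks *hooks, void *data)
{
  struct toy_insn *insn;

  for (insn = start; insn != NULL; insn = insn->next)
    {
      if (insn->uid == target_uid)
	{
	  hooks->orig_found (insn, data);
	  return 1;
	}
      hooks->orig_not_found (insn, data);
    }
  return 0;
}

/* move_op-like behavior: mark the original insn as removed.  */
static void
toy_move_op_found (struct toy_insn *insn, void *data)
{
  (void) data;
  insn->removed = 1;
}

static void
toy_move_op_not_found (struct toy_insn *insn, void *data)
{
  (void) insn;
  (void) data;
}

/* find_used_regs-like behavior: collect registers used by the insns
   that the target would be moved through.  */
static void
toy_fur_found (struct toy_insn *insn, void *data)
{
  (void) insn;
  (void) data;
}

static void
toy_fur_not_found (struct toy_insn *insn, void *data)
{
  *(unsigned *) data |= insn->used_regs;
}

static const struct toy_driver_hooks toy_move_op_hooks
  = { toy_move_op_found, toy_move_op_not_found };
static const struct toy_driver_hooks toy_fur_hooks
  = { toy_fur_found, toy_fur_not_found };

/* Run both traversals over the chain 1 -> 2 -> 3, targeting insn 3.
   Returns the register mask collected by the fur-like traversal.  */
static unsigned
toy_driver_demo (void)
{
  struct toy_insn c = { 3, 0x4, 0, NULL };
  struct toy_insn b = { 2, 0x2, 0, &c };
  struct toy_insn a = { 1, 0x1, 0, &b };
  unsigned regs = 0;
  int found;

  found = toy_path_driver (&a, 3, &toy_fur_hooks, &regs);
  assert (found);
  found = toy_path_driver (&a, 3, &toy_move_op_hooks, NULL);
  assert (found && c.removed);
  return regs;
}
```

In this toy setup, toy_driver_demo returns 0x3, the registers used by insns 1 and 2. The point of the design is that the traversal logic exists once, and each client only supplies the actions taken at the interesting points of the walk.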

The second and third patches speed up the check of whether an insn's data is ready. This check is done by the tick_check_p function, which looks for dependencies between the fence's dependence context and the insn and checks their latency. First, we determine the maximal latency of the dependences that an insn produces, say ML, and remove the insn from the fence's dependence context once ML cycles have passed. For this purpose, we needed an extra function in genautomata that outputs this maximal latency. Second, we cache the results of tick_check_p calls on the fence, so that we do not reanalyze the insn on every cycle.
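The two optimizations can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the patch itself: the toy_* names, the fixed-size arrays, and the single-producer dependence model are all hypothetical, and the dependence analysis is replaced by a trivial scan that merely counts how often it runs.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_INSNS 8

/* An insn that has been issued and may still have results in flight.  */
struct toy_exec
{
  int uid;
  int issue_cycle;
  int max_latency;	/* ML: maximal latency of dependences it produces.  */
};

/* Hypothetical fence: current cycle, cached ready ticks per insn uid
   (0 means "nothing cached"), and the insns still executing.  */
struct toy_fence
{
  int cycle;
  int ready_ticks[TOY_MAX_INSNS];
  struct toy_exec executing[TOY_MAX_INSNS];
  int n_executing;
  int n_analyses;	/* How many times the full analysis ran.  */
};

/* Advance the fence by one cycle and drop every executing insn whose
   maximal latency has fully elapsed.  */
static void
toy_advance_one_cycle (struct toy_fence *f)
{
  int i, kept = 0;

  f->cycle++;
  for (i = 0; i < f->n_executing; i++)
    if (f->cycle - f->executing[i].issue_cycle < f->executing[i].max_latency)
      f->executing[kept++] = f->executing[i];
  f->n_executing = kept;
}

/* Return 1 if insn UID, which depends on PRODUCER_UID with LATENCY,
   can be issued on the current cycle.  A cached ready tick that lies
   in the future answers the question without any analysis.  */
static int
toy_tick_check_p (struct toy_fence *f, int uid, int producer_uid, int latency)
{
  int i;

  if (f->ready_ticks[uid] > f->cycle)
    return 0;

  /* Stand-in for the expensive dependence analysis.  */
  f->n_analyses++;
  for (i = 0; i < f->n_executing; i++)
    if (f->executing[i].uid == producer_uid)
      {
	int ready = f->executing[i].issue_cycle + latency;

	if (ready > f->cycle)
	  {
	    f->ready_ticks[uid] = ready;
	    return 0;
	  }
      }
  return 1;
}

/* Issue producer 1 at cycle 0 with ML = 3, then query consumer 2 on
   cycles 0..3.  Returns the number of full analyses performed.  */
static int
toy_tick_demo (void)
{
  struct toy_fence f;
  int ready;

  memset (&f, 0, sizeof f);
  f.executing[0].uid = 1;
  f.executing[0].issue_cycle = 0;
  f.executing[0].max_latency = 3;
  f.n_executing = 1;

  ready = toy_tick_check_p (&f, 2, 1, 3);	/* Cycle 0: analyzed.  */
  assert (!ready);
  toy_advance_one_cycle (&f);
  ready = toy_tick_check_p (&f, 2, 1, 3);	/* Cycle 1: cached.  */
  assert (!ready);
  toy_advance_one_cycle (&f);
  ready = toy_tick_check_p (&f, 2, 1, 3);	/* Cycle 2: cached.  */
  assert (!ready);
  toy_advance_one_cycle (&f);			/* Cycle 3: producer dropped.  */
  assert (f.n_executing == 0);
  ready = toy_tick_check_p (&f, 2, 1, 3);	/* Cycle 3: analyzed, ready.  */
  assert (ready);
  return f.n_analyses;
}
```

In this toy run the consumer is queried four times but the analysis runs only twice (toy_tick_demo returns 2), and by the time the second analysis runs the producer has already left the executing list, so there is nothing left to scan.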

The fourth patch brings the largest speedups and removes the assumption that the fence starts on a basic block header insn. As a result, we do fewer updates of availability sets. The patch required some bug fixing and cleanups. Also, now we ensure that during pipelining we operate on regions that have at most one back edge.
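As an illustration of the back edge restriction only, a check of this kind could look like the sketch below. It assumes that the blocks of a region are numbered in topological order (ignoring back edges), so that an edge is a back edge when its destination does not come after its source. The toy_* names and the edge representation are hypothetical and do not correspond to the actual sel-sched data structures.

```c
#include <assert.h>

/* Hypothetical CFG edge between two blocks of a region.  */
struct toy_edge
{
  int src;
  int dest;
};

/* Count the edges whose destination does not come after their source
   in the region's topological order.  TOPO_POS[B] is the position of
   block B in that order.  */
static int
toy_count_back_edges (const struct toy_edge *edges, int n_edges,
		      const int *topo_pos)
{
  int i, count = 0;

  for (i = 0; i < n_edges; i++)
    if (topo_pos[edges[i].dest] <= topo_pos[edges[i].src])
      count++;
  return count;
}

/* Return 1 if the region has at most one back edge.  */
static int
toy_region_ok_for_pipelining_p (const struct toy_edge *edges, int n_edges,
				const int *topo_pos)
{
  return toy_count_back_edges (edges, n_edges, topo_pos) <= 1;
}

/* A simple loop 0 -> 1 -> 2 -> 0 is accepted; adding a self-loop on
   block 1 creates a second back edge and the region is rejected.
   Returns 1 if both checks behave as described.  */
static int
toy_back_edge_demo (void)
{
  static const int topo_pos[] = { 0, 1, 2 };
  static const struct toy_edge one_loop[]
    = { { 0, 1 }, { 1, 2 }, { 2, 0 } };
  static const struct toy_edge two_loops[]
    = { { 0, 1 }, { 1, 2 }, { 2, 0 }, { 1, 1 } };

  return (toy_region_ok_for_pipelining_p (one_loop, 3, topo_pos)
	  && !toy_region_ok_for_pipelining_p (two_loops, 4, topo_pos));
}
```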

Tested on ia64, committed to sel-sched branch.

Andrey

Attachment: kill-floating-headers.diff.gz
Description: application/gzip

2008-04-14  Andrey Belevantsev  <abel@ispras.ru>

	* genattr.c (main): Output maximal_insn_latency prototype.
	* genautomata.c (output_default_latencies): New. Factor its code from ... 
	(output_internal_insn_latency_func): ... here.
	(output_internal_maximal_insn_latency_func): New.
	(output_maximal_insn_latency_func): New.
	* hard-reg-set.h (UHOST_BITS_PER_WIDE_INT): Fix define.
	* lists.c (remove_free_EXPR_LIST_node): New.
	* rtl.h: Export it.
	* sched-deps.c (remove_from_dependence_list, 
	remove_from_both_dependence_lists): New.
	(remove_from_deps): New. Use the above functions.
	* sched-deps.h (remove_from_deps): Export.
	* sel-sched-ir.h (struct _fence): New field `executing_insns'.
	(FENCE_EXECUTING_INSNS): New accessor.
	(struct _sel_insn_data): Rename _expr to expr.  Update all uses.
	Change asm_p to bool_bitfield. New field `ready_cycle'.
	* sel-sched-ir.c (fence_init, flist_add, fence_clear, 
	init_fences, merge_fences, new_fences_add, new_fences_add_clean, 
	new_fences_add_dirty): Update for FENCE_EXECUTING_INSNS.
	* sel-sched.c (advance_one_cycle): Remove excessive insns from 
	FENCE_EXECUTING_INSNS.
	(undo_transformations): Forbid combined speculation.
	(process_use_exprs): Use EXPR_TARGET_AVAILABLE.
	(fill_insns): Set INSN_READY_CYCLE.  Update FENCE_EXECUTING_INSNS.
	(sel_sched_region_2): Likewise.
	
	

Attachment: remove-insns-from-deps-context.diff.gz
Description: application/gzip

2008-04-14  Andrey Belevantsev  <abel@ispras.ru>

	* sched-deps.c (sched_deps_init): Tidy.
	* sel-sched-ir.c (init_fence_for_scheduling): New.
	(flist_add): Use it.
	(init_fences): Merge ready_ticks_size.
	(merge_fences): Likewise.
	(new_fences_add): Rename to add_to_fences.
	(move_fence_to_fences): New.
	(new_fences_add_clean): Rename to move_fence_to_fences.
	(new_fences_add_dirty): Rename to add_dirty_fence_to_fences.
	(insn_eligible_for_subst_p): Kill.

	* sel-sched-ir.h (ready_ticks, ready_ticks_size): New.
	* sel-sched.c (extract_new_fences_from,
	sel_sched_region_2): Use the new fence functions.
	(can_substitute_through_p): New.
	(moveup_expr): Use it.
	(can_overcome_dep_p): Rename to can_speculate_dep_p.
	(fill_vec_av_set): Use FENCE_READY_TICKS.

Attachment: speedup-fences.diff.gz
Description: application/gzip

2008-04-14  Dmitry Melnik  <dm@ispras.ru>

	* sel-sched-dump.c (get_print_blocks_num): New.
	* sel-sched-dump.h: Export it.
	* sel-sched-ir.c (vinsn_equal_p): Use sel_rtx_equal_p for UNIQUE 
	vinsns too.
	(speculate_expr): Pass false to create_vinsn_from_insn_rtx.
	(create_vinsn_from_insn_rtx): New parameter force_unique_p.
	Pass it to vinsn_create.
	* sel-sched-ir.h (struct _def): Add comment to crosses_call.
	* sel-sched.c (struct cmpd_local_params, 
	struct moveop_static_params, struct fur_static_params,
	fur_static_params_p, cmpd_local_params_p, moveop_static_params_p,
	struct code_motion_path_driver_info_def,
	code_motion_path_driver_info, move_op_hooks, fur_hooks): New.
	(substitute_reg_in_rhs): Pass false to create_vinsn_from_insn_rtx.
	(replace_dest_with_reg_in_rhs, generate_bookkeeping_insn): Likewise.
	(code_motion_path_driver): New.
	(find_used_regs, move_op): Rewrite to use it.  Update all uses.
	(find_used_regs_1): Kill.
	(av_set_could_be_blocked_by_bookkeeping_p): New.
	(move_op_merge_succs, fur_merge_succs, move_op_after_merge_succs,
	move_op_orig_rhs_found, fur_orig_rhs_found, move_op_at_bb_head,
	fur_at_bb_head, move_op_ascend, fur_on_enter, 
	move_op_orig_rhs_not_found, fur_orig_rhs_not_found,
	move_op_process_successors, code_motion_path_driver_cleanup): New.

Attachment: furore-merge.diff.gz
Description: application/gzip

2008-04-14  Andrey Belevantsev  <abel@ispras.ru>

	* cfgloopmanip.c (has_preds_from_loop): New.
	(create_preheader): Use it.
	* sched-deps.c (deps_analyze_insn): Tidy.
	* sel-sched-ir.c (vinsns_correlate_as_rhses_p): Rename to vinsn_equal_p,
	deleting the latter.  Update all uses.
	(vinsn_copy): New.
	(find_in_history_vect_1): New parameter compare_vinsns.
	Do not compare when we're undoing transformations on a bookkeeping copy.
	(find_in_history_vect, insert_in_history_vect): Use the new parameter.
	(merge_expr_data): Add usefulness only when merging on a split point.
	(has_dependence_p): Do not allow stores to move through checks.
	(init_insn): Properly init EXPR_TARGET_AVAILABLE and INSN_LIVE_VALID_P.
	(clear_outdated_rtx_info): Do not rely on INSN_TRANSFORMED_INSNS 
	when rescheduling.
	(sel_remove_loop_preheader): Tidy.

	* sel-sched.c (struct code_motion_path_driver_info_def): Add new 
	parameter to on_enter field. Rename at_bb_head to at_first_insn.
	(need_stall): Move to fill_insns.
	(vec_bk_blocked_exprs): Rename to vec_blocked_vinsns.
	(extract_new_fences_from): Fix for the case when a fence is not
	on a bb header.
	(substitute_reg_in_expr): New parameter undo.  Perform or undo 
	the change based on this parameter.
	(find_best_reg_for_expr): Use find_sequential_best_exprs.
	When renaming, set EXPR_TARGET_AVAILABLE to 1.
	(undo_transformations): When un-substituting through bookkeeping, 
	do not use the history data, but substitute_reg_in_expr instead.
	(moveup_expr_inside_insn_group): Do not care about any dependencies
	except substitutable ones. 
	(try_bitmap_cache, try_transformation_cache, update_bitmap_cache,
	update_transformation_cache): Split from ... 
	(moveup_set_expr): ... here.  
	(moveup_expr_cached): New function.
	(moveup_set_path*): Kill and rewrite into ... 
	(moveup_set_inside_insn_group): ... this.
	(equal_after_moveup_path_p): Rewrite without recursion.
	(compute_av_set_inside_bb): Check ineligibility for every insn.
	Leave a copy only on bb headers.
	(update_data_sets): Update only on bb headers.
	(expr_blocked_by_bookkeeping_p): Use vec_blocked_vinsns.
	(clear_blocked_exprs, add_to_blocked_exprs, free_blocked_exprs): New.
	(fill_vec_av_set): Remove expressions unavailable due to bookkeeping
	even if they are separable.  Tidy.  Compute the minimal stall
	needed to be able to try any of the expressions in the ready set.
	(fill_ready_list, find_best_expr): Propagate the stall needed from
	fill_vec_av_set to fill_insns.
	(move_cond_jump): Fix for the case when a fence is not on a bb header.
	(compute_av_set_on_boundaries, find_sequential_best_exprs): Likewise.
	Handle substitutions inside insn group.
	(prepare_place_to_insert): Likewise.
	(move_exprs_to_boundary): Likewise for the case when more than one 
	expression corresponds to expr_vliw.
	(fill_insns): Stall for more than one cycle if needed.
	Check that we never create an extra back edge in a region when 
	pipelining.
	(update_and_record_unavailable_insns): Also update liveness in 
	the middle of the bookkeeping block.
	(move_op_at_first_insn): Handle the case when the insn is not 
	a bb header.
	(fur_at_first_insn): Likewise.
	(fur_on_enter): Move the handling of visited blocks to 
	code_motion_path_driver.
	(move_op_on_enter): New.
	(code_motion_process_successors): Rescan when basic block
	numbers have changed due to bb splitting.
	(code_motion_path_driver): Update for fences not on bb headers.
	(sel_region_init, sel_region_finish): Update liveness on single-block 
	loops when pipelining.  Use *_blocked_exprs routines.
