This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: rtlopt merge part 2 - bb-reorder rewrite (updated)


On Fri, Feb 07, 2003 at 10:05:31AM +0100, Josef Zlomek wrote:
> this is a rewrite of bb-reorder. It performs also a loop rotation
> and duplication of some small hot basic blocks (especially duplicating 
> the return instruction should have significant effect).

This I wanted to see very much.  Thanks.
In general, the code looks pretty good.

> !   start_of_trace = xmalloc (array_size * sizeof (int));
> !   end_of_trace = xmalloc (array_size * sizeof (int));
> !   bb_heap = xmalloc (array_size * sizeof (fibheap_t));
> !   bb_node = xmalloc (array_size * sizeof (fibnode_t));

You'll get better cache locality if you rearrange these
into one array of structures.  Also, don't I see bb_heap
always initialized?  Perhaps

> !       if (bb_heap)
> ! 	{
> ! 	  bb_heap = xrealloc (bb_heap, new_size * sizeof (fibheap_t));

is leftover from somewhen?

> ! static int
> ! get_uncond_jump_length ()
> ! {
> !   rtx label, jump;
> !   int length;
> ! 
> !   label = emit_label_before (gen_label_rtx (), get_insns ());
> !   jump = emit_jump_insn (gen_jump (label));
> ! 
> !   length = get_attr_length (jump);
> ! 
> !   delete_insn (jump);
> !   delete_insn (label);
> !   return length;
>   }

It'd be nice to call this only once. 

Have you looked to see that this produces reasonable results
for targets for which branch shortening is used?  Mostly I 
mean "doesn't crash", since this is only a heuristic, but
"returns max size" would be nice.

Finally, some general issues:

(1) At present this is being run after sched2, which means that the block
    duplication we're doing isn't as useful as it could be.  If we duplicate
    block C and place it after block B, we'd like to be able to interleave
    the insns from C and B.

    The current placement of bb-reorder stems from wanting to do the reorder
    after reg-stack does it's nastiness.  Have you looked into what the
    effects of running bb-reorder twice?  If this is too painful, we could
    predicate the placement on whether reg-stack will be run...

(2) The Alpha port pulls a trick where, post-reload, it exposes some
    relocations to the scheduler.  The best example is a call,

	(call_insn 109 107 99 9 0x20000715550 (parallel [
		    (set (reg:DI 0 $0)
			(call (mem:DI (symbol_ref:DI ("printf")) [0 S8 A64])
			    (const_int 0 [0x0])))
		    (use (reg:DI 29 $29))
		    (clobber (reg:DI 26 $26))
		]) 292 {*call_value_osf_1_er}
	    (insn_list 107 (insn_list 108 (nil)))
	    (nil)
	    (expr_list (use (reg:DI 17 $17))
		(expr_list (use (reg:DI 16 $16))
		    (nil))))

    being expanded to 

	(insn 321 114 322 0x20000715550 (set (reg:DI 27 $27)
		(unspec:DI [
			(reg:DI 29 $29)
			(symbol_ref:DI ("printf"))
			(const_int 10 [0xa])
		    ] 11)) 230 {movdi_er_high_g} (nil)
	    (nil))

	(call_insn:TI 322 321 323 0x20000715550 (parallel [
		    (set (reg:DI 0 $0)
			(call (mem:DI (reg:DI 27 $27) [0 S8 A64])
			    (const_int 0 [0x0])))
		    (set (reg:DI 26 $26)
			(plus:DI (pc)
			    (const_int 4 [0x4])))
		    (unspec_volatile [
			    (reg:DI 29 $29)
			] 1)
		    (use (symbol_ref:DI ("printf")))
		    (use (const_int 10 [0xa]))
		]) 293 {*call_value_osf_2_er}
	    (insn_list:REG_DEP_ANTI 321 (insn_list:REG_DEP_ANTI 106
	      (insn_list:REG_DEP_ANTI 107 (insn_list:REG_DEP_ANTI 114 (nil)))))
	    (expr_list:REG_DEAD (reg:DI 16 $16)
		(expr_list:REG_DEAD (reg:DI 17 $17)
		    (expr_list:REG_DEAD (reg:DI 29 $29)
			(expr_list:REG_DEAD (reg:DI 27 $27)
			    (expr_list:REG_UNUSED (reg:DI 0 $0)
				(nil))))))
	    (expr_list (use (reg:DI 17 $17))
		(expr_list (use (reg:DI 16 $16))
		    (nil))))

	(insn 323 322 324 0x20000715550 (set (reg:DI 29 $29)
		(unspec_volatile:DI [
			(reg:DI 26 $26)
			(const_int 11 [0xb])
		    ] 10)) 260 {*ldgp_er_1}
	    (insn_list:REG_DEP_ANTI 322 (nil))
	    (expr_list:REG_DEAD (reg:DI 26 $26)
		(nil)))

	(insn:TI 324 323 99 0x20000715550 (set (reg:DI 29 $29)
		(unspec:DI [
			(reg:DI 29 $29)
			(const_int 11 [0xb])
		    ] 10)) 261 {*ldgp_er_2}
	    (insn_list 323 (nil))
	    (nil))

    which corresponds to

	ldq	$27, printf($29)	!literal!10
	jsr	$26, ($27), 0		!lituse_jsr!10
	ldah	$29, 0($26)		!gpdisp!11
	lda	$29, 0($29)		!gpdisp!11

    The thing about these relocations is, they have a sequence number
    attached to them for the assembler to process (the 10 and 11 in the
    example).  This sequence number indicates certain data flow properties
    about these instructions.

    With your patch, however, we've destroyed that sequence information,
    since we now for the first time we duplicate instructions after
    peep2.  Indeed, the assembler issues error messages for duplicate
    sequence numbers.

    The solution, I think is to have a target hook that the backend can
    use to frob these sequence numbers.  Everywhere but Alpha we'd do
    nothing (or just call copy_rtx, however it gets set up).

    I'll take care of this myself, I guess, since I know what's going
    on and can test the result.


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]