This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Optimisation pass for dual pack architectures


On Fri, Jan 22, 1999 at 11:10:59PM +1300, Michael Hayes wrote:
> The most important optimisation involves packing a pair of
> instructions with a data dependency that occur within a loop.

This looks like a generally useful special case of software
pipelining.  It has some special combine-like knowledge about
constructing parallels, but is useful beyond two-packs.

> is transformed into:
> 
>    (set (reg r0) (mult (mem (post_inc ar0)) (mem (post_inc ar1))))
>    (repeat (N - 1) [
>        (parallel [
>            (set (reg r0) (mult (mem (post_inc ar0)) (mem (post_inc ar1))))
>            (set (reg r1) (plus (reg r1) (reg r0)))])])
>    (set (reg r1) (plus (reg r1) (reg r0)))

	(set (reg r0) (mem (reg a0)))
	(set (reg a0) (plus (reg a0) (const_int 4)))
	(set (reg r1) (mem (reg a1)))
	(set (reg a1) (plus (reg a1) (const_int 4)))
	(set (reg r2) (mult (reg r0) (reg r1)))
	(repeat (N - 1) [
	  (set (reg r0) (mem (reg a0)))
	  (set (reg a0) (plus (reg a0) (const_int 4)))
	  (set (reg r1) (mem (reg a1)))
	  (set (reg a1) (plus (reg a1) (const_int 4)))
	  (set (reg r3) (plus (reg r3) (reg r2)))
	  (set (reg r2) (mult (reg r0) (reg r1)))
	  ])
	(set (reg r3) (plus (reg r3) (reg r2)))

... on a traditional RISC machine without a multiply-add (madd) insn.
It doesn't do much for memory latency, but it does hide most of the
4-cycle fp multiply and add latency that's typical.
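
For readers who find C easier to follow than RTL, here is a rough source-level sketch of the same reshaping. It is only an illustration, not code from the patch: the function names `dot_naive` and `dot_pipelined` are made up for this example, and `double` stands in for whatever fp mode the target uses. The pipelined version assumes at least one element, matching the prologue multiply and the (N - 1) repeat count above.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: the add in each iteration has to wait for the
   multiply issued in that same iteration.  */
double dot_naive(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-pipelined variant (hypothetical name).  The multiply for element
   i is issued one iteration before the add that consumes it, so in the
   loop body the add and the multiply do not depend on each other.  */
double dot_pipelined(const double *a, const double *b, size_t n)
{
    assert(n >= 1);             /* the prologue consumes one element */

    double prod = a[0] * b[0];  /* prologue: first multiply (r2) */
    double sum = 0.0;           /* accumulator (r3) */

    for (size_t i = 1; i < n; i++) {  /* kernel runs N - 1 times */
        double x = a[i];
        double y = b[i];
        sum += prod;            /* add uses the previous product */
        prod = x * y;           /* multiply for the next add */
    }

    sum += prod;                /* epilogue: final add */
    return sum;
}
```

Both functions add the products in the same order, so the change is purely one of scheduling: inside the loop the add reads the product from the previous iteration, which is what lets the two operations overlap or be packed into a parallel.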

It looks like good code -- I'd just hesitate to name it as you did,
since the name so strongly suggests it's only useful for multipack
targets.


r~

