This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: SPARC code inefficiency


   From: Dan Nicolaescu <dann@godzilla.ICS.UCI.EDU>
   Date: Tue, 21 May 2002 23:13:54 -0700

   
   Look at all the uses for the %o2 register:
   
   32 lines matching "o2" in buffer md5.s.
        11:	add	%fp, -80, %o2
        21:	mov	%o2, %i5
        40:	ld	[%o2], %i0
        53:	add	%o2, 4, %o2
 ...
   
   all the above "add" instructions can be eliminated by using a reg + offset
   addressing mode. 
   
   Adding some peephole2s could solve this... Is there a better way? 
   
   The source code and assembly are attached. 
   
There is nothing SPARC-specific about this missed optimization.
Peepholes won't help at all because they cannot transform things
globally, which is what needs to happen here.

I don't know if any of the generic optimization passes are already
supposed to handle this, but a global pass is the kind of thing
needed to make the transformation you are looking for.
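
To make the shape of that transformation concrete, here is a rough
sketch in C. It is not taken from the attached md5 source; the
function names and the four-word access pattern are assumptions made
only for illustration. The first function mirrors the code as
currently generated (the pointer is advanced after each load, i.e. one
"add %o2, 4, %o2" per word), the second mirrors the desired form (fixed
base, reg + offset loads). Folding the first into the second means
tracking the accumulated offset across every later use of the
register, which is why a small peephole window is not enough. A
compiler may of course already treat these two C functions
identically; the point is only the instruction pattern each one stands
for.

```c
#include <assert.h>
#include <stdint.h>

/* Current shape: the base pointer is bumped after every load,
   so each word costs a load plus a separate add on the pointer. */
static uint32_t sum_advancing(const uint32_t *p)
{
    uint32_t s = 0;
    s += *p; p++;   /* ld [%o2], ...   then   add %o2, 4, %o2 */
    s += *p; p++;
    s += *p; p++;
    s += *p;
    return s;
}

/* Desired shape: the base pointer never changes and every load
   carries its own constant offset (ld [%o2 + 4], ... and so on). */
static uint32_t sum_offsets(const uint32_t *p)
{
    return p[0] + p[1] + p[2] + p[3];
}

/* Hypothetical helper: both forms must read the same words. */
static void check_equivalence(void)
{
    const uint32_t w[4] = { 1u, 2u, 3u, 4u };
    assert(sum_advancing(w) == sum_offsets(w));
}
```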

And BTW, with -mtune=ultrasparc the schedule is much better.
It may not make any difference to the %o2 advancing problem,
but it makes a HUGE difference when the code is executed on
an UltraSPARC processor.

