This is the mail archive of the
mailing list for the GCC project.
Re: Question about PR 48814 and ivopts and post-increment
- From: Jeff Law <law at redhat dot com>
- To: Steve Ellcey <sellcey at imgtec dot com>, gcc at gcc dot gnu dot org
- Date: Tue, 1 Dec 2015 15:50:07 -0700
- Subject: Re: Question about PR 48814 and ivopts and post-increment
- References: <b579d986-4948-42fe-817a-939807204ad4 at BAMAIL02 dot ba dot imgtec dot org>
On 12/01/2015 02:11 PM, Steve Ellcey wrote:
I'd start by looking at the code prior to reorg/delay slot scheduling.
It may be the case that you're running into the well-known issue that
reorg knows nothing about latency/scheduling issues and happily picks
whatever insn can safely fill the delay slot. In doing so, reorg may
muck up the schedule badly.
> With the current top-of-tree we now generate:
> lbu $2,-1($5) # This is a branch delay slot
> addiu $4,$4,1 # This is a branch delay slot
> subu $2,$3,$2 # Done only once now after exiting loop.
> The main problem with the new loop is that the beq comparing $2 and $3
> is right before the load of $2 so there can be a delay due to the time
> that the load takes. The ideal code would probably be:
If that's the case, you might test disallowing operations with > 1 cycle
latency in delay slots and see how that affects a wider range of benchmarks.
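
To make the latency concern concrete, here is a rough sketch of a toy
in-order, single-issue pipeline model. It is not GCC's reorg pass and not
a real MIPS pipeline description: the function name count_cycles, the
2-cycle load latency, and the two short instruction sequences are all
assumptions chosen for illustration. It only shows that a load placed
immediately before a branch that consumes its result can cost a stall,
while one independent insn in between hides it.

```python
# Toy model: each insn is (name, dest_reg_or_None, source_regs, latency).
# "latency" is the number of cycles after issue before dest can be read.
# All latencies here are assumed values, not taken from any real core.

LOAD_LATENCY = 2  # assumption: a loaded value is usable 2 cycles after issue
ALU_LATENCY = 1   # assumption: an ALU result is usable on the next cycle


def count_cycles(insns):
    """Return (total_cycles, stall_cycles) for an in-order, single-issue run."""
    ready = {}   # register -> first cycle at which its value can be read
    cycle = 0    # cycle at which the next insn would issue
    stalls = 0
    for _name, dest, srcs, latency in insns:
        # The insn cannot issue until every source register is ready.
        earliest = max([ready.get(reg, 0) for reg in srcs], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        if dest is not None:
            ready[dest] = cycle + latency
        cycle += 1
    return cycle, stalls


# Load of $2 immediately followed by the branch that compares $2 and $3.
delay_slot_load = [
    ("lbu", "$2", ("$5",), LOAD_LATENCY),
    ("beq", None, ("$2", "$3"), ALU_LATENCY),
]

# Same load, but with an independent insn between it and the branch.
separated_load = [
    ("lbu", "$2", ("$5",), LOAD_LATENCY),
    ("addiu", "$4", ("$4",), ALU_LATENCY),
    ("beq", None, ("$2", "$3"), ALU_LATENCY),
]

print(count_cycles(delay_slot_load))
print(count_cycles(separated_load))
```

Under these assumed latencies the first sequence pays one stall cycle and
the second pays none, even though it contains one more insn. Whether the
same holds on real hardware depends on the actual pipeline, which is why
measuring across a wider range of benchmarks is the real test.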