This is the mail archive of the gcc-patches
mailing list for the GCC project.
Re: [GCC RFC]A new and simple pass merging paired load store instructions
- From: Jeff Law <law at redhat dot com>
- To: Mike Stump <mikestump at comcast dot net>
- Cc: "bin.cheng" <bin dot cheng at arm dot com>, gcc-patches at gcc dot gnu dot org
- Date: Thu, 15 May 2014 14:01:20 -0600
- Subject: Re: [GCC RFC]A new and simple pass merging paired load store instructions
- References: <004d01cf700e$ef1e30e0$cd5a92a0$ at arm dot com> <32B4330F-1D0F-4D4E-BF7A-2E5B2148B893 at comcast dot net> <5374F59D dot 3030101 at redhat dot com> <11986FC6-32E8-4C4F-AA8D-9B9E8C093FD5 at comcast dot net>
On 05/15/14 12:41, Mike Stump wrote:
> On May 15, 2014, at 10:13 AM, Jeff Law <law at redhat dot com> wrote:
>> I've poked at the scheduler several times to do similar stuff, but
>> was never really satisfied with the results and never tried to
>> polish those prototypes into something worth submitting.
>
> What was lacking? The cleanliness of the patch or the, it didn’t win
> me more than 0.01% code improvement so I never submitted it?
Cleanliness and applicability of my implementation.
For fmpyadd/fmpysub on the PA, my recollection was that the right way to
go was to detect when both were in the ready queue at the same time,
then re-sort the queue to have them issue consecutively. We didn't have
the necessary hooks to have target-dependent code examine and rearrange
the queues back when I looked at this. By the time we had those
capabilities, the PA8000 class processors had been released and those
instructions were considered bad, so I never came back to this.
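
[Illustration, not from the original message.] The ready-queue idea
described above can be sketched with a toy model. This is not GCC's
scheduler and not the prototype mentioned here; the types, the
FMPY/FADD kinds, and the name pair_in_ready_queue are all hypothetical,
and the toy queue issues from index 0, which need not match how a real
scheduler orders its ready list. It only shows the "detect both, then
re-sort so they issue back to back" step.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction kinds (hypothetical; not GCC's representation). */
enum kind { OTHER, FMPY, FADD };

struct insn {
    int uid;        /* arbitrary identifier for the example */
    enum kind kind;
};

/* If both an FMPY and an FADD are in the ready queue, slide the later
   one down so it sits directly after the earlier one.  The relative
   order of all other entries is preserved.  Returns 1 if a pair was
   made adjacent, 0 if either half is missing.  */
static int pair_in_ready_queue(struct insn *ready, size_t n)
{
    size_t mpy = n, add = n;   /* n means "not found" */

    assert(ready != NULL || n == 0);

    /* Record the first FMPY and the first FADD in the queue.  */
    for (size_t i = 0; i < n; i++) {
        if (ready[i].kind == FMPY && mpy == n)
            mpy = i;
        else if (ready[i].kind == FADD && add == n)
            add = i;
    }
    if (mpy == n || add == n)
        return 0;

    size_t first = mpy < add ? mpy : add;
    size_t second = mpy < add ? add : mpy;

    /* Shift the entries between the two halves up by one slot and
       drop the later half in right behind the earlier one.  */
    struct insn tmp = ready[second];
    for (size_t i = second; i > first + 1; i--)
        ready[i] = ready[i - 1];
    ready[first + 1] = tmp;
    return 1;
}
```

With a queue holding uids 1 (OTHER), 2 (FMPY), 3 (OTHER), 4 (OTHER),
5 (FADD), the call moves uid 5 to sit right after uid 2 and leaves the
rest in their original relative order. A real implementation would
also have to respect priorities and dependencies, which this sketch
ignores.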
For the memory optimizations, IIRC, the dependencies keep them from
getting into the ready queue at the same time. Thus it's significantly
harder to get them to issue consecutively when you've got an issue rate > 1.
>> But if you've got an issue rate > 1, then it's a lot less likely you'll
>> get the store/load pairing up the way you want.
>
> Trivial to do with my patch, just change the sort key to arrange that
> to happen, however, if you want multiple support and this, then, you
> need to recognize store_m / load from a part of that, which would be
Arguably one could throttle down the issue rate for these scheduler-like
passes which makes that problem go away.