This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch, i386] Avoid LCP stalls (issue5975045)


On Wed, Apr 4, 2012 at 5:39 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Apr 4, 2012 at 5:07 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> New patch to avoid LCP stalls based on feedback from earlier patch. I modified
>> H.J.'s old patch to perform the peephole2 to split immediate moves to HImode
>> memory. This is now enabled for Core2, Corei7 and Generic.
>>
>> I verified that this enables the splitting to occur in the case that originally
>> motivated the optimization. If we subsequently find situations where LCP stalls
>> are hurting performance but an extra register is required to perform the
>> splitting, then we can revisit whether this should be performed earlier.
>>
>> I also measured SPEC 2000/2006 performance using Generic64 on an AMD Opteron
>> and the results were neutral.
>>
>
> What are the performance impacts on Core i7? I didn't notice any significant
> changes when I worked on it for Core 2.

One of our street map applications speeds up by almost 5% on Core i7
and almost 2.5% on Core 2 from this optimization.  It contains a hot
inner loop with some conditional writes of zero into a short array.
The loop is unrolled, so it no longer fits into the LSD (Loop Stream
Detector), which would otherwise have avoided many of the LCP stalls.
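
For illustration, here is a minimal sketch of the kind of loop described
above. It is not the application's code: the function name clear_flagged
and its parameters are hypothetical, and the assembly mentioned in the
comments is only what a compiler might plausibly emit for x86-64, not
verified output from this patch.

```c
#include <stddef.h>

/*
 * Hypothetical example: conditionally write zero into an array of short.
 *
 * Without the split, the store may be emitted as a 16-bit immediate move
 * to memory, e.g.
 *     movw $0, (%rdi,%rcx,2)
 * The 66h operand-size prefix changes the length of the immediate (imm16
 * instead of imm32), which is a length-changing prefix (LCP) and can stall
 * the predecoder on Core 2 / Core i7.
 *
 * With the split (when a scratch register is free), the zero is first
 * placed in a register and then stored, e.g.
 *     xorl %eax, %eax
 *     movw %ax, (%rdi,%rcx,2)
 * The store no longer carries an immediate, so its length does not depend
 * on the prefix.
 */
size_t clear_flagged(short *cells, const unsigned char *flags, size_t n)
{
    size_t cleared = 0;

    for (size_t i = 0; i < n; i++) {
        if (flags[i]) {
            cells[i] = 0;   /* HImode immediate store: the LCP candidate */
            cleared++;
        }
    }
    return cleared;         /* number of elements that were zeroed */
}
```

Compiling such a function with -O2 and inspecting the output of -S for
movw with an immediate operand is one way to check whether the split
happened for a given -mtune setting.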

Thanks,
Teresa

>
> Thanks.
>
> --
> H.J.



-- 
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

