[PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

H.J. Lu hjl.tools@gmail.com
Tue Jun 14 13:24:00 GMT 2011

On Tue, Jun 14, 2011 at 3:16 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jun 14, 2011 at 12:13:47PM +0200, Richard Guenther wrote:
>> On Tue, Jun 14, 2011 at 1:59 AM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
>> > The patch ( http://gcc.gnu.org/ml/gcc-patches/2011-02/txt00059.txt ) introduces splitting of avx256 unaligned loads.
>> > However, we found that it causes significant regressions for cpu2006 ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089 ).
>> >
>> > In this work, we introduce a tune option that enables splitting of unaligned loads by default only on CPUs for which such splitting
>> > is beneficial.
>> >
>> > The patch passed bootstrapping and regression tests on x86_64-unknown-linux-gnu system.
>> >
>> > Is it OK to commit?
>> It probably should go to the 4.6 branch as well.  Note that I find the
>> why not call it simply X86_TUNE_AVX256_SPLIT_UNALIGNED_LOAD?
> I also wonder what we should do for -mtune=generic.  Should we split or not?
> How big improvement is it on Intel chips, how big degradation does it
> cause on AMD chips (I assume no other chip maker currently supports AVX)?

Simply turning off the 32-byte unaligned load split isn't an appropriate
solution, since doing so introduces performance regressions on Intel
Sandy Bridge processors.

I am proposing a different approach so that we can improve
-mtune=generic performance on current Intel and AMD processors.

The current default GCC tuning, -mtune=generic, was implemented in 2005
for Intel Pentium 4, Core 2 and AMD K8 processors.  Many of its
optimization choices are no longer applicable to current Intel or AMD
processors.

We should choose a set of optimization choices for -mtune=generic,
including the 32-byte unaligned load split, that suits current Intel
and AMD processors.  This should improve performance with no
performance regressions.
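
To make the per-CPU default concrete, here is a minimal sketch of how
such a tuning knob could be modeled.  It is not code from the patch:
the processor list, the mask name and the helper
split_unaligned_load_p() are hypothetical, and the mask value simply
follows the quoted patch (split on Sandy Bridge, not on bdver1 or
generic).  Whether the generic bit should be set is exactly the open
question above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical processor classes for this sketch only.  */
enum processor { PROC_GENERIC, PROC_SANDYBRIDGE, PROC_BDVER1, PROC_COUNT };

/* One bit per processor class.  */
#define m_GENERIC      (1u << PROC_GENERIC)
#define m_SANDYBRIDGE  (1u << PROC_SANDYBRIDGE)
#define m_BDVER1       (1u << PROC_BDVER1)

/* Processors on which 32-byte unaligned loads are split by default.
   Assumption: following the quoted patch, bdver1 and generic are
   left out.  */
static const unsigned int tune_avx256_split_unaligned_load = m_SANDYBRIDGE;

/* Decide whether to split.  USER_CHOICE models an explicit
   command-line setting: -1 means unset, 0 means off, 1 means on.
   An explicit setting always wins over the per-CPU default.  */
static bool
split_unaligned_load_p (enum processor tune, int user_choice)
{
  assert ((unsigned int) tune < PROC_COUNT);
  if (user_choice >= 0)
    return user_choice != 0;
  return (tune_avx256_split_unaligned_load & (1u << tune)) != 0;
}
```

With this layout, changing the default for one CPU is a one-bit change
to the mask, and an explicit user setting is still honored on every
CPU.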

