[PATCH, rs6000, testsuite] Changes for unaligned vector load/store support on POWER8
David Edelsohn
dje.gcc@gmail.com
Thu Jan 29 20:30:00 GMT 2015
On Fri, Jan 23, 2015 at 12:23 PM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> The POWER8 processor greatly improves performance of unaligned vector
> loads and stores. Except for certain corner cases the compiler can't
> readily track, an unaligned vector load or store performs equivalently
> to an aligned one.
>
> To exploit this in the auto-vectorizer requires two changes. The simple
> one is to change the cost model to reflect the cheaper cost for POWER8
> versus previous processors. Additionally, we want to avoid generating
> the masked-load sequence (load/load/lvsl/vperm) used to force alignment
> of unaligned loads. Of course we must be careful to still use this
> sequence if -mno-vsx is selected for POWER8 for some reason.
>
> (Note that POWER7 supported unaligned vector memory references, but for
> best performance we have chosen to use the masked-load sequence for that
> processor. This is no longer optimal for POWER8.)
>
> The code changes in the rs6000 back end are simple enough, but
> unfortunately there is quite a bit of test case fallout. There are two
> predicates in target-supports.exp that need adjustment:
> * vect_no_align, which returns 1 iff the target plus current options
> does not support a vector alignment mechanism; and
> * vect_hw_misalign, which returns 1 iff the target supports a
> misaligned vector access.
>
> Unlike previous processors, for POWER8+VSX we want both of these
> predicates to return 1. In the former case, it isn't that we don't
> support a vector alignment mechanism, but under no circumstances do we
> want to use it when we have a misaligned vector access instruction that
> performs well.
>
> As a result of these changes, many loops will now auto-vectorize on P8
> that would not on P7. Unfortunately, this causes many tests to fail,
> even with the changes to target-supports.exp. The primary reason is
> that the tests are testing vect_no_align in many places, where the
> correct thing to test is vect_no_align && !vect_hw_misalign. That is,
> the test condition should fire not just when there isn't a vector
> alignment mechanism, but only when the target also doesn't support a
> direct misaligned vector memory access.
>
> The reason this "shortcut" has worked up till now is that the set of
> targets for which vect_no_align and vect_hw_misalign both return 1 has
> been empty. Thus vect_no_align has been functionally equivalent to
> vect_no_align && !vect_hw_misalign. Happily, this means it is also safe
> to make that substitution in the failing tests without affecting other
> targets.
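To make the argument above concrete, here is a minimal sketch in C that
enumerates the predicate combinations. The function names are hypothetical
and only model the two test conditions as booleans; they are not part of
target-supports.exp.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdbool.h>

/* Condition the tests used before the patch: only vect_no_align. */
bool old_condition(bool vect_no_align, bool vect_hw_misalign)
{
  (void) vect_hw_misalign;  /* the old condition ignores this predicate */
  return vect_no_align;
}

/* Condition proposed in the patch. */
bool new_condition(bool vect_no_align, bool vect_hw_misalign)
{
  return vect_no_align && !vect_hw_misalign;
}

/* Returns true if the two conditions agree on every combination other
   than (1, 1), the combination no target produced before POWER8.  */
bool agree_when_both_never_set(void)
{
  for (int na = 0; na <= 1; na++)
    for (int hm = 0; hm <= 1; hm++)
      {
        if (na && hm)
          continue;  /* previously empty set of targets */
        if (old_condition(na, hm) != new_condition(na, hm))
          return false;
      }
  return true;
}
```

Checking assert(agree_when_both_never_set()) confirms the substitution is
neutral for existing targets; the two conditions differ only for the (1, 1)
case, which is the POWER8+VSX case the patch introduces.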
>
> So, this patch contains three parts:
> * Changes to rs6000.c and vector.md for the vectorization support;
> * Changes to target-supports.exp to reflect POWER8's characteristics;
> and
> * Numerous changes to fix test cases to make them pass/fail correctly
> for POWER8.
>
> I've tested this on POWER8 BE, POWER8 LE, and POWER7 BE, with no
> regressions. A handful of existing POWER8 failures are also corrected
> as a happy side effect.
>
> Since we're in stage 4, I obviously need to hold off till the next
> release, but pending that, will this be ok for trunk? After it burns in
> I would like to backport it to 4.8, 4.9, and 5.
>
> Thanks!
>
> Bill
>
>
> [gcc]
>
> 2015-01-23 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
>
> * config/rs6000/rs6000.c (rs6000_builtin_mask_for_load): Return 0
> for POWER8 so that the vectorizer will use direct unaligned loads.
> (rs6000_builtin_support_vector_misalignment): Always return true
> for VSX + POWER8.
> (rs6000_builtin_vectorization_cost): Cost of unaligned loads and
> stores on VSX + POWER8 is almost always the same as the cost of an
> aligned load or store, so model it that way.
The processor test should be centralized in option_override instead of
explicitly testing POWER8. We don't want to have to hunt for and
update multiple, explicit processor tuning references if a potential
future processor with the same characteristics is supported.
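As a rough illustration of what I mean, here is a minimal, self-contained
sketch in C. All names (struct target_opts, efficient_unaligned_vsx, the
sketch_* functions) and the cost numbers are hypothetical stand-ins, not the
actual rs6000 data structures or hooks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the processor selection and option state. */
enum cpu_kind { CPU_POWER7, CPU_POWER8 };

struct target_opts {
  enum cpu_kind cpu;
  bool vsx;
  bool efficient_unaligned_vsx;  /* derived flag, computed in one place */
};

/* The only place that names a specific processor.  A future processor
   with the same characteristics needs a change here and nowhere else.  */
void sketch_option_override(struct target_opts *opts)
{
  assert(opts != NULL);
  opts->efficient_unaligned_vsx = opts->vsx && opts->cpu == CPU_POWER8;
}

/* Consumers test the derived flag rather than the processor.  */
bool sketch_use_mask_for_load(const struct target_opts *opts)
{
  return !opts->efficient_unaligned_vsx;
}

/* Illustrative cost numbers only: unaligned costs the same as aligned
   (1) when the flag is set, and more (2) otherwise.  */
int sketch_unaligned_access_cost(const struct target_opts *opts)
{
  return opts->efficient_unaligned_vsx ? 1 : 2;
}
```

With this shape, POWER8 with -mno-vsx leaves the flag clear, so the
masked-load sequence is still used in that configuration.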
> * config/rs6000/vector.md (movmisalign<mode>): Misaligned loads
> and stores are always permissible for VSX + POWER8.
Why test POWER8 separately in this pattern instead of setting
TARGET_ALLOW_MOVMISALIGN for POWER8, if the user did not explicitly
invoke the option?
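Something along these lines, sketched in C under assumptions: the struct and
function names below are hypothetical, and the fields only model "did the
user pass the option" and "does the target have efficient unaligned VSX
accesses". This is not the real option-handling code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the option state. */
struct misalign_opts {
  bool user_set_movmisalign;  /* true if the user passed the option explicitly */
  bool allow_movmisalign;     /* models TARGET_ALLOW_MOVMISALIGN */
  bool efficient_unaligned;   /* target has fast unaligned vector accesses */
};

/* Turn the option on by default for targets with efficient unaligned
   accesses, but never override an explicit user choice.  */
void sketch_default_movmisalign(struct misalign_opts *o)
{
  assert(o != NULL);
  if (!o->user_set_movmisalign && o->efficient_unaligned)
    o->allow_movmisalign = true;
}

/* The pattern condition then only needs to test the option.  */
bool sketch_movmisalign_ok(const struct misalign_opts *o)
{
  return o->allow_movmisalign;
}
```

The pattern condition stays free of processor tests, and an explicit user
setting to disable the option still wins.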
>
> [gcc/testsuite]
>
> 2015-01-23 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
>
> * gcc.dg/vect/bb-slp-24.c: Exclude test for POWER8.
> * gcc.dg/vect/bb-slp-25.c: Likewise.
> * gcc.dg/vect/bb-slp-29.c: Likewise.
> * gcc.dg/vect/bb-slp-32.c: Replace vect_no_align with
> vect_no_align && { ! vect_hw_misalign }.
> * gcc.dg/vect/bb-slp-9.c: Likewise.
> * gcc.dg/vect/costmodel/ppc/costmodel-slp-33.c: Exclude test for
> vect_hw_misalign.
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Likewise.
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust tests to
> account for POWER8, where peeling for alignment is not needed.
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-outer-fir.c: Replace
> vect_no_align with vect_no_align && { ! vect_hw_misalign }.
> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: Likewise.
> * gcc.dg/vect/no-scevccp-outer-6-global.c: Likewise.
> * gcc.dg/vect/no-scevccp-outer-6.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-43.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-57.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-61.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise.
> * gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
> * gcc.dg/vect/pr16105.c: Likewise.
> * gcc.dg/vect/pr20122.c: Likewise.
> * gcc.dg/vect/pr33804.c: Likewise.
> * gcc.dg/vect/pr33953.c: Likewise.
> * gcc.dg/vect/pr56787.c: Likewise.
> * gcc.dg/vect/pr58508.c: Likewise.
> * gcc.dg/vect/slp-25.c: Likewise.
> * gcc.dg/vect/vect-105-bit-array.c: Likewise.
> * gcc.dg/vect/vect-105.c: Likewise.
> * gcc.dg/vect/vect-27.c: Likewise.
> * gcc.dg/vect/vect-29.c: Likewise.
> * gcc.dg/vect/vect-33.c: Exclude unaligned access test for
> POWER8.
> * gcc.dg/vect/vect-42.c: Replace vect_no_align with vect_no_align
> && { ! vect_hw_misalign }.
> * gcc.dg/vect/vect-44.c: Likewise.
> * gcc.dg/vect/vect-48.c: Likewise.
> * gcc.dg/vect/vect-50.c: Likewise.
> * gcc.dg/vect/vect-52.c: Likewise.
> * gcc.dg/vect/vect-56.c: Likewise.
> * gcc.dg/vect/vect-60.c: Likewise.
> * gcc.dg/vect/vect-72.c: Likewise.
> * gcc.dg/vect/vect-75-big-array.c: Likewise.
> * gcc.dg/vect/vect-75.c: Likewise.
> * gcc.dg/vect/vect-77-alignchecks.c: Likewise.
> * gcc.dg/vect/vect-77-global.c: Likewise.
> * gcc.dg/vect/vect-78-alignchecks.c: Likewise.
> * gcc.dg/vect/vect-78-global.c: Likewise.
> * gcc.dg/vect/vect-93.c: Likewise.
> * gcc.dg/vect/vect-95.c: Likewise.
> * gcc.dg/vect/vect-96.c: Likewise.
> * gcc.dg/vect/vect-cond-1.c: Likewise.
> * gcc.dg/vect/vect-cond-3.c: Likewise.
> * gcc.dg/vect/vect-cond-4.c: Likewise.
> * gcc.dg/vect/vect-cselim-1.c: Likewise.
> * gcc.dg/vect/vect-multitypes-1.c: Likewise.
> * gcc.dg/vect/vect-multitypes-3.c: Likewise.
> * gcc.dg/vect/vect-multitypes-4.c: Likewise.
> * gcc.dg/vect/vect-multitypes-6.c: Likewise.
> * gcc.dg/vect/vect-nest-cycle-1.c: Likewise.
> * gcc.dg/vect/vect-nest-cycle-2.c: Likewise.
> * gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
> * gcc.dg/vect/vect-outer-3a.c: Likewise.
> * gcc.dg/vect/vect-outer-5.c: Likewise.
> * gcc.dg/vect/vect-outer-fir-big-array.c: Likewise.
> * gcc.dg/vect/vect-outer-fir-lb-big-array.c: Likewise.
> * gcc.dg/vect/vect-outer-fir-lb.c: Likewise.
> * gcc.dg/vect/vect-outer-fir.c: Likewise.
> * gcc.dg/vect/vect-peel-3.c: Likewise.
> * gcc.dg/vect/vect-peel-4.c: Likewise.
> * gcc.dg/vect/vect-pre-interact.c: Likewise.
> * gcc.target/powerpc/vsx-vectorize-2.c: Exclude test for POWER8.
> * gcc.target/powerpc/vsx-vectorize-4.c: Likewise.
> * gcc.target/powerpc/vsx-vectorize-6.c: Likewise.
> * gcc.target/powerpc/vsx-vectorize-7.c: Likewise.
> * gfortran.dg/vect/vect-2.f90: Replace vect_no_align with
> vect_no_align && { ! vect_hw_misalign }.
> * gfortran.dg/vect/vect-3.f90: Likewise.
> * gfortran.dg/vect/vect-4.f90: Likewise.
> * gfortran.dg/vect/vect-5.f90: Likewise.
> * lib/target-supports.exp (check_effective_target_vect_no_align):
> Return 1 for POWER8.
> (check_effective_target_vect_hw_misalign): Return 1 for POWER8.
This is a reasonable change, but please ask a vect maintainer like
Richi or a testsuite maintainer to approve.
Thanks, David