[PATCH v2] rs6000: Optimize __builtin_shuffle when it's used to zero the upper bits [PR102868]
David Edelsohn
dje.gcc@gmail.com
Thu Oct 28 15:00:30 GMT 2021
On Thu, Oct 28, 2021 at 1:39 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> On 2021/10/27 21:24, David Edelsohn wrote:
> > On Sun, Oct 24, 2021 at 10:51 PM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
> >>
> >> If the second operand of __builtin_shuffle is a constant zero vector
> >> and the mask matches one of a few specific patterns, the shuffle can
> >> be expanded to vspltisw+xxpermdi instead of lxv.
> >>
> >> gcc/ChangeLog:
> >>
> >> * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Add
> >> patterns match and emit for VSX xxpermdi.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.target/powerpc/pr102868.c: New test.
> >> ---
> >> gcc/config/rs6000/rs6000.c | 47 ++++++++++++++++--
> >> gcc/testsuite/gcc.target/powerpc/pr102868.c | 53 +++++++++++++++++++++
> >> 2 files changed, 97 insertions(+), 3 deletions(-)
> >> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102868.c
> >>
> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> >> index d0730253bcc..5d802c1fa96 100644
> >> --- a/gcc/config/rs6000/rs6000.c
> >> +++ b/gcc/config/rs6000/rs6000.c
> >> @@ -23046,7 +23046,23 @@ altivec_expand_vec_perm_const (rtx target, rtx op0, rtx op1,
> >> {OPTION_MASK_P8_VECTOR,
> >> BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
> >> : CODE_FOR_p8_vmrgew_v4sf_direct,
> >> - {4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31}}};
> >> + {4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31}},
> >> + {OPTION_MASK_VSX,
> >> + (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> + : CODE_FOR_vsx_xxpermdi_v16qi),
> >> + {0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23}},
> >> + {OPTION_MASK_VSX,
> >> + (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> + : CODE_FOR_vsx_xxpermdi_v16qi),
> >> + {8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23}},
> >> + {OPTION_MASK_VSX,
> >> + (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> + : CODE_FOR_vsx_xxpermdi_v16qi),
> >> + {0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31}},
> >> + {OPTION_MASK_VSX,
> >> + (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> + : CODE_FOR_vsx_xxpermdi_v16qi),
> >> + {8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31}}};
> >
> > If the insn_code is the same for big endian and little endian, why
> > does the new code test BYTES_BIG_ENDIAN to set the same value
> > (CODE_FOR_vsx_xxpermdi_v16qi)?
> >
>
> Thanks for the catch. I have updated the patch as below:
>
> [PATCH v2] rs6000: Optimize __builtin_shuffle when it's used to zero the upper bits [PR102868]
>
> If the second operand of __builtin_shuffle is a constant zero vector
> and the mask matches one of a few specific patterns, the shuffle can
> be expanded to vspltisw+xxpermdi instead of lxv.
>
> gcc/ChangeLog:
>
> * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Add
> patterns match and emit for VSX xxpermdi.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/powerpc/pr102868.c: New test.
Okay.
Thanks, David
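
For readers following the selector tables in the quoted diff, here is a
minimal scalar sketch in portable C of how such a byte selector can be read.
It assumes the usual two-operand permute convention: selector values 0..15
pick bytes of the first operand and 16..31 pick bytes of the second. The
names perm_bytes, xxpermdi_sel and shuffle_with_zero are hypothetical and not
part of the patch; the model ignores endianness and is not how rs6000.c
performs the match.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Scalar model of a two-input byte permute (an assumption for
   illustration): selector values 0..15 pick bytes of op0, 16..31 pick
   bytes of op1.  */
static void
perm_bytes (uint8_t out[VEC_BYTES], const uint8_t op0[VEC_BYTES],
            const uint8_t op1[VEC_BYTES], const uint8_t sel[VEC_BYTES])
{
  uint8_t tmp[VEC_BYTES];   /* Temporary so out may alias an input.  */
  for (int i = 0; i < VEC_BYTES; i++)
    {
      assert (sel[i] < 2 * VEC_BYTES);
      tmp[i] = sel[i] < VEC_BYTES ? op0[sel[i]] : op1[sel[i] - VEC_BYTES];
    }
  memcpy (out, tmp, VEC_BYTES);
}

/* The four selectors copied from the table added by the quoted diff.  */
static const uint8_t xxpermdi_sel[4][VEC_BYTES] = {
  {0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23},
  {8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23},
  {0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31},
  {8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31},
};

/* Hypothetical helper: apply selector number `which` with the second
   operand fixed to an all-zero vector, as in the case the patch targets.  */
static void
shuffle_with_zero (uint8_t out[VEC_BYTES], const uint8_t op0[VEC_BYTES],
                   int which)
{
  const uint8_t zero[VEC_BYTES] = {0};
  assert (which >= 0 && which < 4);
  perm_bytes (out, op0, zero, xxpermdi_sel[which]);
}
```

With op0 holding the bytes 1..16, selector 0 yields bytes 1..8 followed by
eight zero bytes, and selector 1 yields bytes 9..16 followed by eight zero
bytes. Each of the four selectors moves whole 8-byte halves, which is what
makes a doubleword permute such as xxpermdi a candidate for them.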