This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH,rs6000] Add pass to optimize away xxpermdi's from vector computations
- From: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
- To: David Edelsohn <dje dot gcc at gmail dot com>
- Cc: Steven Bosscher <stevenb dot gcc at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sun, 17 Aug 2014 19:39:09 -0500
- Subject: Re: [PATCH,rs6000] Add pass to optimize away xxpermdi's from vector computations
- References: <1407971685 dot 3063 dot 204 dot camel at gnopaine> <CAGWvnykjqhoHBXicOt3_6Gc5CjQOCYS-Q2J4=Jkau+=nBeRhhA at mail dot gmail dot com>
On Sun, 2014-08-17 at 14:52 -0400, David Edelsohn wrote:
> On Wed, Aug 13, 2014 at 7:14 PM, Bill Schmidt
> <wschmidt@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > This patch adds a PowerPC-specific pass just prior to the first cse RTL
> > pass. The pass runs only when generating little-endian code for Power8
> > with VSX enabled, and for -O1 and up. For this particular subtarget,
> > the use of the big-endian-biased vector load and store instructions
> > requires permutations to order vector elements for little endian. To
> > reduce the overhead of these permutations, this pass looks for
> > computations for which the exact lanes in which operations are
> > performed do not matter, so long as the results are returned to
> > storage in the proper order. For such computations we can remove the
> > xxpermdi's associated with the vector loads and stores.
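As an illustration of the lane-insensitivity described in the quoted text above, here is a minimal sketch in plain C. It is only a model, not code from the patch: the vec2 type and the raw_load, raw_store, swap_dw, and add_lanes helpers are hypothetical stand-ins for a VSX register, the big-endian-biased load and store, the xxpermdi doubleword swap, and an element-wise vector add. It shows that for an element-wise operation, dropping the swaps on both the loads and the store leaves the stored result unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register as two doublewords. */
typedef struct { uint64_t dw[2]; } vec2;

/* Models the doubleword swap that xxpermdi performs. */
static vec2 swap_dw(vec2 v) { vec2 r = {{ v.dw[1], v.dw[0] }}; return r; }

/* Models the big-endian-biased load and store: on little endian the
   doublewords land in the register in reversed order.  */
static vec2 raw_load(const uint64_t *p) { vec2 r = {{ p[1], p[0] }}; return r; }
static void raw_store(uint64_t *p, vec2 v) { p[0] = v.dw[1]; p[1] = v.dw[0]; }

/* Element-wise add: each lane is computed independently of the others. */
static vec2 add_lanes(vec2 a, vec2 b)
{
  vec2 r = {{ a.dw[0] + b.dw[0], a.dw[1] + b.dw[1] }};
  return r;
}

/* Every load is followed by a swap, and the store is preceded by one. */
void add_with_swaps(const uint64_t *a, const uint64_t *b, uint64_t *out)
{
  vec2 va = swap_dw(raw_load(a));
  vec2 vb = swap_dw(raw_load(b));
  raw_store(out, swap_dw(add_lanes(va, vb)));
}

/* Same computation with all three swaps removed. */
void add_without_swaps(const uint64_t *a, const uint64_t *b, uint64_t *out)
{
  raw_store(out, add_lanes(raw_load(a), raw_load(b)));
}

/* Checks that both versions store identical results. */
void check_lane_insensitive(void)
{
  uint64_t a[2] = { 1, 2 }, b[2] = { 10, 20 }, o1[2], o2[2];
  add_with_swaps(a, b, o1);
  add_without_swaps(a, b, o2);
  assert(memcmp(o1, o2, sizeof o1) == 0);
}
```

An operation that depends on which lane holds which element (extracting a specific element, for example) would not have this property in the model, so removing the swaps around it would need some adjustment or would not be valid.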
> >
> > This patch relies on another patch posted today that converts a struct
> > used by the web pass into a base class that this patch can subclass. If
> > it's determined that the other patch isn't appropriate, then this patch
> > will need modifications to duplicate the union-find logic.
> >
> > A complete description of the new pass appears in rs6000.c (search for
> > "Analyze vector computations"). That description also identifies some
> > remaining opportunities we can follow up with later.
> >
> > A number of new tests are added to verify that the pass works as
> > expected for some vectorized code samples.
> >
> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> > regressions. Is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > [gcc]
> >
> > 2014-08-13 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
> >
> > * config/rs6000/rs6000.c (context.h): New include.
> > (tree-pass.h): Likewise.
> > (make_pass_analyze_swaps): New decl.
> > (rs6000_option_override): Register pass_analyze_swaps.
> > (swap_web_entry): New subclass of web_entry_base (df.h).
> > (special_handling_values): New enum.
> > (union_defs): New function.
> > (union_uses): Likewise.
> > (insn_is_load_p): Likewise.
> > (insn_is_store_p): Likewise.
> > (insn_is_swap_p): Likewise.
> > (rtx_is_swappable_p): Likewise.
> > (insn_is_swappable_p): Likewise.
> > (chain_purpose): New enum.
> > (chain_contains_only_swaps): New function.
> > (mark_swaps_for_removal): Likewise.
> > (swap_const_vector_halves): Likewise.
> > (adjust_subreg_index): Likewise.
> > (permute_load): Likewise.
> > (permute_store): Likewise.
> > (handle_special_swappables): Likewise.
> > (replace_swap_with_copy): Likewise.
> > (dump_swap_insn_table): Likewise.
> > (rs6000_analyze_swaps): Likewise.
> > (pass_data_analyze_swaps): New pass_data.
> > (pass_analyze_swaps): New rtl_opt_pass.
> > (make_pass_analyze_swaps): New function.
> > * config/rs6000/rs6000.opt (moptimize-swaps): New option.
> >
> > [gcc/testsuite]
> >
> > 2014-08-13 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
> >
> > * gcc.target/powerpc/swaps-p8-1.c: New test.
> > * gcc.target/powerpc/swaps-p8-2.c: New test.
> > * gcc.target/powerpc/swaps-p8-3.c: New test.
> > * gcc.target/powerpc/swaps-p8-4.c: New test.
> > * gcc.target/powerpc/swaps-p8-5.c: New test.
> > * gcc.target/powerpc/swaps-p8-6.c: New test.
> > * gcc.target/powerpc/swaps-p8-7.c: New test.
> > * gcc.target/powerpc/swaps-p8-8.c: New test.
> > * gcc.target/powerpc/swaps-p8-9.c: New test.
> > * gcc.target/powerpc/swaps-p8-10.c: New test.
> > * gcc.target/powerpc/swaps-p8-11.c: New test.
> > * gcc.target/powerpc/swaps-p8-12.c: New test.
>
> This looks okay, although I was hoping that other developers with more
> DF and web experience would double check.

Thanks for the review!

> Why are you specifically gating the pass on POWER8?

The problem is introduced in POWER8 (since LE isn't supported earlier),
and I hope that this will no longer be necessary for -mcpu=power9. If
for some reason this doesn't turn out to be the case, we'll need to make
a change at that time, I suppose...
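
For reference, the conditions under which the pass runs, as stated in the patch description, can be summarized as a simple predicate. This is a rough sketch only: struct target_state and should_run_swap_pass are hypothetical names for illustration and are not the actual GCC variables or gate function, and treating -moptimize-swaps as an on/off switch for the pass is an assumption based on the option name in the ChangeLog.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for compiler state; not GCC's real variables. */
struct target_state {
  int  opt_level;       /* 0 for -O0, 1 for -O1, and so on */
  bool little_endian;   /* generating little-endian code */
  bool vsx_enabled;     /* VSX instructions available */
  bool cpu_is_power8;   /* targeting Power8 */
  bool optimize_swaps;  /* assumed meaning of -moptimize-swaps */
};

/* Returns true when all the conditions from the description hold:
   little-endian Power8 code with VSX enabled, at -O1 and up.  */
static bool should_run_swap_pass(const struct target_state *t)
{
  return t->opt_level >= 1
         && t->little_endian
         && t->vsx_enabled
         && t->cpu_is_power8
         && t->optimize_swaps;
}
```

If a later processor turns out to need the same treatment, only the CPU condition in a predicate like this would have to be relaxed.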

Thanks,
Bill
>
> Thanks, David
>