This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, powerpc] PR 58160 -- fix power8 gpr load fusion to be safer
- From: David Edelsohn <dje dot gcc at gmail dot com>
- To: Michael Meissner <meissner at linux dot vnet dot ibm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 15 Aug 2013 20:10:13 -0400
- Subject: Re: [PATCH, powerpc] PR 58160 -- fix power8 gpr load fusion to be safer
- References: <20130814225526 dot GA7648 at ibm-tiger dot the-meissners dot org>
On Wed, Aug 14, 2013 at 6:55 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> In running the spec 2006 testsuite with several options, we discovered that
> -mtune=power8 (which turns on the -mpower8-fusion option) has a segmentation
> fault in two benchmarks:
>
> 403.gcc when built with -O2 -mcpu=power7 -mtune=power8 -m32
> 435.gromacs when built with -O3 -funroll-loops -mcpu=power7 -mtune=power8 -m32
>
> In the gcc case, I tracked it down to the fusion op in the function
> strength_reduce in the file loop.c. It was wanting to use the result of the
> addis instruction after the loop.
>
> It wanted to optimize:
> lis 10,loop_dump_stream@ha
> lwz 6,loop_dump_stream@l(6)
> stw 10,24(1)
>
> to:
> lis 6,loop_dump_stream@ha
> lwz 6,loop_dump_stream@l(6)
> stw 10,24(1)
>
> The problem is fusion was implemented using define_peephole, and the register
> live notes are not correct when peepholes are done.
>
> I have written the following patch, and bootstrapped the compiler using
> BOOT_CFLAGS='-g -O2 -mcpu=power7 -mtune=power8' to stress the code. There were
> no regressions in make check. Are these patches ok to install in the tree?
>
> I have the code for the function that causes the problem narrowed down to the
> function that generates the bad code. However, I'm worried that this test will
> be very fragile, and it will need to be updated as the compiler changes.
>
> The compiler should be fixed so that it won't be tempted to spill the ELF_HIGH
> for the addis on the stack, so that in the future, it could safely fusion the
> addis and load operation.
>
> [gcc]
> 2013-08-14 Michael Meissner <meissner@linux.vnet.ibm.com>
>
> PR target/58160
> * config/rs6000/predicates.md (fusion_gpr_mem_load): Allow the
> memory rtx to contain ZERO_EXTEND and SIGN_EXTEND.
>
> * config/rs6000/rs6000-protos.h (fusion_gpr_load_p): Pass operands
> array instead of each individual operand as a separate argument.
> (emit_fusion_gpr_load): Likewise.
> (expand_fusion_gpr_load): Add new function declaration.
>
> * config/rs6000/rs6000.c (fusion_gpr_load_p): Change the calling
> signature to have the operands passed as an array, instead of as
> separate arguments. Allow ZERO_EXTEND to be in the memory
> address, and also SIGN_EXTEND if -mpower8-fusion-sign. Do not
> depend on the register live/dead flags when peepholes are run.
> (expand_fusion_gpr_load): New function to be called from the
> peephole2 pass, to change the register that addis sets to be the
> target register.
> (emit_fusion_gpr_load): Change the calling signature to have the
> operands passed as an array, instead of as separate arguments.
> Allow ZERO_EXTEND to be in the memory address, and also
> SIGN_EXTEND if -mpower8-fusion-sign.
>
> * config/rs6000/rs6000.md (UNSPEC_FUSION_GPR): Delete unused
> unspec enumeration.
> (power8 fusion peephole/peephole2): Rework the fusion peepholes to
> adjust the register addis loads up in the peephole2 pass. Do not
> depend on the register live/dead state when the peephole pass is
> done.
This is okay.
Thanks, David