This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH, powerpc] PR 58160 -- fix power8 gpr load fusion to be safer


In running the spec 2006 testsuite with several options, we discovered that
-mtune=power8 (which turns on the -mpower8-fusion option) has a segmentation
fault in two benchmarks:

403.gcc when built with -O2 -mcpu=power7 -mtune=power8 -m32
435.gromacs when built with -O3 -funroll-loops -mcpu=power7 -mtune=power8 -m32

In the gcc case, I tracked it down to the fusion op in the function
strength_reduce in the file loop.c.  It was wanting to use the result of the
addis instruction after the loop.

It wanted to optimize:
        lis 10,loop_dump_stream@ha
        lwz 6,loop_dump_stream@l(6)
        stw 10,24(1)

to:
        lis 6,loop_dump_stream@ha
        lwz 6,loop_dump_stream@l(6)
        stw 10,24(1)

The problem is fusion was implemented using define_peephole, and the register
live notes are not correct when peepholes are done.

I have written the following patch, and bootstrapped the compiler using
BOOT_CFLAGS='-g -O2 -mcpu=power7 -mtune=power8' to stress the code.  There were
no regressions in make check.  Are these patches ok to install in the tree?

I have the code for the function that causes the problem narrowed down to the
function that generates the bad code.  However, I'm worried that this test will
be very fragile, and it will need to be updated as the compiler changes.

The compiler should be fixed so that it won't be tempted to spill the ELF_HIGH
for the addis on the stack, so that in the future, it could safely fusion the
addis and load operation.

[gcc]
2013-08-14  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/58160
	* config/rs6000/predicates.md (fusion_gpr_mem_load): Allow the
	memory rtx to contain ZERO_EXTEND and SIGN_EXTEND.

	* config/rs6000/rs6000-protos.h (fusion_gpr_load_p): Pass operands
	array instead of each individual operand as a separate argument.
	(emit_fusion_gpr_load): Likewise.
	(expand_fusion_gpr_load): Add new function declaration.

	* config/rs6000/rs6000.c (fusion_gpr_load_p): Change the calling
	signature to have the operands passed as an array, instead of as
	separate arguments.  Allow ZERO_EXTEND to be in the memory
	address, and also SIGN_EXTEND if -mpower8-fusion-sign.  Do not
	depend on the register live/dead flags when peepholes are run.
	(expand_fusion_gpr_load): New function to be called from the
	peephole2 pass, to change the register that addis sets to be the
	target register.
	(emit_fusion_gpr_load): Change the calling signature to have the
	operands passed as an array, instead of as separate arguments.
	Allow ZERO_EXTEND to be in the memory address, and also
	SIGN_EXTEND if -mpower8-fusion-sign.

	* config/rs6000/rs6000.md (UNSPEC_FUSION_GPR): Delete unused
	unspec enumeration.
	(power8 fusion peephole/peephole2): Rework the fusion peepholes to
	adjust the register addis loads up in the peephole2 pass.  Do not
	depend on the register live/dead state when the peephole pass is
	done.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: pr58160.patch01b
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]