patch for elimination to SP when it is changed in RTL (PR57293)
Vladimir Makarov
vmakarov@redhat.com
Fri Nov 29 11:10:00 GMT 2013
On 11/28/2013, 7:51 PM, H.J. Lu wrote:
> On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>> The following patch fixes PR57293
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293
>>
>> It is actually an implementation of missed LRA functionality in reg
>> elimination. Before the patch any explicit change of stack pointer in
>> RTL resulted in necessity to use the frame pointer.
>>
>> The patch has practically no effect on generic tuning of x86/x86-64.
>> But it has a dramatic effect on code performance for other tunings
>> like corei7 which don't use incoming args accumulation. The maximum
>> SPEC2000 improvement 2.5% is achieved on x86 SPECInt2000. But
>> SPECFP2000 rate also has improvement about 1% on x86 and x86-64. Too
>> bad that I did not implement it at the first place. The results would
>> have been even much better ones reported on 2012 GNU Cauldron as I
>> also used -mtune=corei7 that time.
>>
>> The patch was bootstrapped and tested on x86-64/x86 and ppc.
>>
>> Committed as rev. 205498.
>>
>> 2013-11-28 Vladimir Makarov<vmakarov@redhat.com>
>>
>> PR target/57293
>> * ira.h (ira_setup_eliminable_regset): Remove parameter.
>> * ira.c (ira_setup_eliminable_regset): Ditto. Add
>> SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
>> Don't call lra_init_elimination.
>> (ira): Call ira_setup_eliminable_regset without arguments.
>> * loop-invariant.c (calculate_loop_reg_pressure): Remove argument
>> from ira_setup_eliminable_regset call.
>> * gcse.c (calculate_bb_reg_pressure): Ditto.
>> * haifa-sched.c (sched_init): Ditto.
>> * lra.h (lra_init_elimination): Remove the prototype.
>> * lra-int.h (lra_insn_recog_data): New member sp_offset. Move
>> used_insn_alternative upper.
>> (lra_eliminate_regs_1): Add one more parameter.
>> (lra-eliminate): Ditto.
>> * lra.c (lra_invalidate_insn_data): Set sp_offset.
>> (setup_sp_offset): New.
>> (lra_process_new_insns): Call setup_sp_offset.
>> (lra): Add argument to lra_eliminate calls.
>> * lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
>> (get_equiv_with_elimination): New.
>> (process_addr_reg): Call get_equiv_with_elimination instead of
>> get_equiv_substitution.
>> (equiv_address_substitution): Ditto.
>> (loc_equivalence_change_p): Ditto.
>> (loc_equivalence_callback, lra_constraints): Ditto.
>> (curr_insn_transform): Ditto. Print the sp offset
>> (process_alt_operands): Prevent stack pointer reloads.
>> (lra_constraints): Remove one argument from lra_eliminate call.
>> Move it up. Mark used hard regs bfore it. Use
>> get_equiv_with_elimination instead of get_equiv_substitution.
>> * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
>> assert for param values combination. Use sp offset. Add argument
>> to lra_eliminate_regs_1 calls.
>> (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
>> (curr_sp_change): New static var.
>> (mark_not_eliminable): Add parameter. Update curr_sp_change.
>> Don't prevent elimination to sp if we can calculate its change.
>> Pass the argument to mark_not_eliminable calls.
>> (eliminate_regs_in_insn): Add a parameter. Use sp offset. Add
>> argument to lra_eliminate_regs_1 call.
>> (update_reg_eliminate): Move calculation of hard regs for spill
>> lower. Switch off lra_in_progress temporarily to generate regs
>> involved into elimination.
>> (lra_init_elimination): Rename to init_elimination. Make it
>> static. Set up insn sp offset, check the offsets at the end of
>> BBs.
>> (process_insn_for_elimination): Add parameter. Pass its value to
>> eliminate_regs_in_insn.
>> (lra_eliminate): : Add parameter. Pass its value to
>> process_insn_for_elimination. Add assert for param values
>> combination. Call init_elimination. Don't update offsets in
>> equivalence substitutions.
>> * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
>> for created stack slot.
>> (remove_pseudos): Call lra_eliminate_regs_1 before changing memory
>> onto stack slot.
>>
>
> Hi Vladimir,
>
> Thanks for your hard work. I noticed a few regressions
> on x86-64:
>
> FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin
> -flto-partition=none line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin
> -flto-partition=none line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 z == 8
> FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer
> -funroll-all-loops -finline-functions line 21 i == v + 1
> FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer
> -funroll-loops line 21 i == v + 1
> FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
>
Well, I guess it can happen when you don't use frame-pointer anymore.
> Besides guality failures, there is
>
> [hjl@gnu-6 gcc]$ make check-gcc
> RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}'
> i386.exp=pr9771-1.c"
> make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
> test -d plugin || mkdir plugin
> test -d testsuite || mkdir testsuite
> test -d testsuite/gcc || mkdir testsuite/gcc
> (rootme=`${PWDCMD-pwd}`; export rootme; \
> srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \
> cd testsuite/gcc; \
> rm -f tmp-site.exp; \
> sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \
> < ../../site.exp > tmp-site.exp; \
> /bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \
> EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo
> ${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ;
> \
> if [ -f ${rootme}/../expect/expect ] ; then \
> TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \
> export TCL_LIBRARY ; fi ; \
> runtestflags= ; \
> if [ -n "" ] ; then \
> runtestflags=""; \
> elif [ -n "" ] ; then \
> parts="`echo ' ' \
> | sed 's/=[^ ]* / /g'`"; \
> for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \
> part=`basename $part` ; \
> case " $parts $runtestflags " in \
> *" $part "*) ;; \
> *) runtestflags="$runtestflags $part" ;; \
> esac ; \
> done ; \
> fi ; \
> `if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo
> ${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc
> --target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c
> $runtestflags)
> WARNING: Couldn't find the global config file.
> Test Run By hjl on Thu Nov 28 16:50:05 2013
> Native configuration is x86_64-unknown-linux-gnu
>
> === gcc tests ===
>
> Schedule of variations:
> unix/-m32 -march=corei7
>
> Running target unix/-m32 -march=corei7
> Using /usr/share/dejagnu/baseboards/unix.exp as board description file
> for target.
> Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
> Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as
> tool-and-target-specific interface file.
> Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ...
> FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
> FAIL: gcc.target/i386/pr9771-1.c (test for excess errors)
>
May be it is wrong to use this test with this tuning as reload also
fails on this test when -march=corei7 is used. Or most probably the
problem is somewhere else. The patch just triggered it.
> === gcc Summary ===
>
> # of expected passes 10
> # of unexpected failures 2
> # of unresolved testcases 1
>
> "-m32 -march=corei7" triggers the ICE.
>
>
More information about the Gcc-patches
mailing list