patch for elimination to SP when it is changed in RTL (PR57293)

Vladimir Makarov vmakarov@redhat.com
Fri Nov 29 11:10:00 GMT 2013


On 11/28/2013, 7:51 PM, H.J. Lu wrote:
> On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>    The following patch fixes PR57293
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293
>>
>>    It is actually an implementation of missed LRA functionality in reg
>> elimination.  Before the patch any explicit change of stack pointer in
>> RTL resulted in necessity to use the frame pointer.
>>
>>    The patch has practically no effect on generic tuning of x86/x86-64.
>> But it has a dramatic effect on code performance for other tunings
>> like corei7 which don't use incoming args accumulation.  The maximum
>> SPEC2000 improvement 2.5% is achieved on x86 SPECInt2000.  But
>> SPECFP2000 rate also has improvement about 1% on x86 and x86-64.  Too
>> bad that I did not implement it at the first place.  The results would
>> have been even much better ones reported on 2012 GNU Cauldron as I
>> also used -mtune=corei7 that time.
>>
>> The patch was bootstrapped and tested on x86-64/x86 and ppc.
>>
>> Committed as rev. 205498.
>>
>>   2013-11-28  Vladimir Makarov<vmakarov@redhat.com>
>>
>>          PR target/57293
>>          * ira.h (ira_setup_eliminable_regset): Remove parameter.
>>          * ira.c (ira_setup_eliminable_regset): Ditto.  Add
>>          SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
>>          Don't call lra_init_elimination.
>>          (ira): Call ira_setup_eliminable_regset without arguments.
>>          * loop-invariant.c (calculate_loop_reg_pressure): Remove argument
>>          from ira_setup_eliminable_regset call.
>>          * gcse.c (calculate_bb_reg_pressure): Ditto.
>>          * haifa-sched.c (sched_init): Ditto.
>>          * lra.h (lra_init_elimination): Remove the prototype.
>>          * lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
>>          used_insn_alternative upper.
>>          (lra_eliminate_regs_1): Add one more parameter.
>>          (lra-eliminate): Ditto.
>>          * lra.c (lra_invalidate_insn_data): Set sp_offset.
>>          (setup_sp_offset): New.
>>          (lra_process_new_insns): Call setup_sp_offset.
>>          (lra): Add argument to lra_eliminate calls.
>>          * lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
>>          (get_equiv_with_elimination): New.
>>          (process_addr_reg): Call get_equiv_with_elimination instead of
>>          get_equiv_substitution.
>>          (equiv_address_substitution): Ditto.
>>          (loc_equivalence_change_p): Ditto.
>>          (loc_equivalence_callback, lra_constraints): Ditto.
>>          (curr_insn_transform): Ditto.  Print the sp offset
>>          (process_alt_operands): Prevent stack pointer reloads.
>>          (lra_constraints): Remove one argument from lra_eliminate call.
>>          Move it up.  Mark used hard regs bfore it.  Use
>>          get_equiv_with_elimination instead of get_equiv_substitution.
>>          * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
>>          assert for param values combination.  Use sp offset.  Add argument
>>          to lra_eliminate_regs_1 calls.
>>          (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
>>          (curr_sp_change): New static var.
>>          (mark_not_eliminable): Add parameter.  Update curr_sp_change.
>>          Don't prevent elimination to sp if we can calculate its change.
>>          Pass the argument to mark_not_eliminable calls.
>>          (eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
>>          argument to lra_eliminate_regs_1 call.
>>          (update_reg_eliminate): Move calculation of hard regs for spill
>>          lower.  Switch off lra_in_progress temporarily to generate regs
>>          involved into elimination.
>>          (lra_init_elimination): Rename to init_elimination.  Make it
>>          static.  Set up insn sp offset, check the offsets at the end of
>>          BBs.
>>          (process_insn_for_elimination): Add parameter.  Pass its value to
>>          eliminate_regs_in_insn.
>>          (lra_eliminate): : Add parameter.  Pass its value to
>>          process_insn_for_elimination.  Add assert for param values
>>          combination.  Call init_elimination.  Don't update offsets in
>>          equivalence substitutions.
>>          * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
>>          for created stack slot.
>>          (remove_pseudos): Call lra_eliminate_regs_1 before changing memory
>>          onto stack slot.
>>
>
> Hi Vladimir,
>
> Thanks for your hard work.   I noticed a few regressions
> on x86-64:
>
> FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 x == 36
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 x == 98
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 z == 8
> FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
> -funroll-all-loops -finline-functions  line 21 i == v + 1
> FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
> -funroll-loops  line 21 i == v + 1
> FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
>

Well, I guess it can happen when you don't use frame-pointer anymore.

> Besides guality failures, there is
>
> [hjl@gnu-6 gcc]$ make check-gcc
> RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}'
> i386.exp=pr9771-1.c"
> make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
> test -d plugin || mkdir plugin
> test -d testsuite || mkdir testsuite
> test -d testsuite/gcc || mkdir testsuite/gcc
> (rootme=`${PWDCMD-pwd}`; export rootme; \
> srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \
> cd testsuite/gcc; \
> rm -f tmp-site.exp; \
> sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \
>      < ../../site.exp > tmp-site.exp; \
> /bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \
> EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo
> ${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ;
> \
> if [ -f ${rootme}/../expect/expect ] ; then  \
>     TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \
>      export TCL_LIBRARY ; fi ; \
> runtestflags= ; \
> if [ -n "" ] ; then \
>    runtestflags=""; \
> elif [ -n "" ] ; then \
>    parts="`echo '  ' \
>        | sed 's/=[^ ]* / /g'`"; \
>    for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \
>      part=`basename $part` ; \
>      case " $parts $runtestflags " in \
>        *" $part "*) ;; \
>        *) runtestflags="$runtestflags $part" ;; \
>      esac ; \
>    done ; \
> fi ; \
> `if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo
> ${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc
> --target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c
> $runtestflags)
> WARNING: Couldn't find the global config file.
> Test Run By hjl on Thu Nov 28 16:50:05 2013
> Native configuration is x86_64-unknown-linux-gnu
>
>          === gcc tests ===
>
> Schedule of variations:
>      unix/-m32 -march=corei7
>
> Running target unix/-m32 -march=corei7
> Using /usr/share/dejagnu/baseboards/unix.exp as board description file
> for target.
> Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
> Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as
> tool-and-target-specific interface file.
> Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ...
> FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
> FAIL: gcc.target/i386/pr9771-1.c (test for excess errors)
>

May be it is wrong to use this test with this tuning as reload also 
fails on this test when -march=corei7 is used.  Or most probably the 
problem is somewhere else.  The patch just triggered it.


>          === gcc Summary ===
>
> # of expected passes        10
> # of unexpected failures    2
> # of unresolved testcases    1
>
> "-m32 -march=corei7" triggers the ICE.
>
>



More information about the Gcc-patches mailing list