This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: patch for elimination to SP when it is changed in RTL (PR57293)


On 11/28/2013, 7:51 PM, H.J. Lu wrote:
On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
   The following patch fixes PR57293

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293

   It is actually an implementation of missed LRA functionality in reg
elimination.  Before the patch any explicit change of stack pointer in
RTL resulted in necessity to use the frame pointer.

   The patch has practically no effect on generic tuning of x86/x86-64.
But it has a dramatic effect on code performance for other tunings
like corei7 which don't use incoming args accumulation.  The maximum
SPEC2000 improvement 2.5% is achieved on x86 SPECInt2000.  But
SPECFP2000 rate also has improvement about 1% on x86 and x86-64.  Too
bad that I did not implement it at the first place.  The results would
have been even much better ones reported on 2012 GNU Cauldron as I
also used -mtune=corei7 that time.

The patch was bootstrapped and tested on x86-64/x86 and ppc.

Committed as rev. 205498.

  2013-11-28  Vladimir Makarov<vmakarov@redhat.com>

         PR target/57293
         * ira.h (ira_setup_eliminable_regset): Remove parameter.
         * ira.c (ira_setup_eliminable_regset): Ditto.  Add
         SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
         Don't call lra_init_elimination.
         (ira): Call ira_setup_eliminable_regset without arguments.
         * loop-invariant.c (calculate_loop_reg_pressure): Remove argument
         from ira_setup_eliminable_regset call.
         * gcse.c (calculate_bb_reg_pressure): Ditto.
         * haifa-sched.c (sched_init): Ditto.
         * lra.h (lra_init_elimination): Remove the prototype.
         * lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
         used_insn_alternative upper.
         (lra_eliminate_regs_1): Add one more parameter.
         (lra-eliminate): Ditto.
         * lra.c (lra_invalidate_insn_data): Set sp_offset.
         (setup_sp_offset): New.
         (lra_process_new_insns): Call setup_sp_offset.
         (lra): Add argument to lra_eliminate calls.
         * lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
         (get_equiv_with_elimination): New.
         (process_addr_reg): Call get_equiv_with_elimination instead of
         get_equiv_substitution.
         (equiv_address_substitution): Ditto.
         (loc_equivalence_change_p): Ditto.
         (loc_equivalence_callback, lra_constraints): Ditto.
         (curr_insn_transform): Ditto.  Print the sp offset
         (process_alt_operands): Prevent stack pointer reloads.
         (lra_constraints): Remove one argument from lra_eliminate call.
         Move it up.  Mark used hard regs bfore it.  Use
         get_equiv_with_elimination instead of get_equiv_substitution.
         * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
         assert for param values combination.  Use sp offset.  Add argument
         to lra_eliminate_regs_1 calls.
         (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
         (curr_sp_change): New static var.
         (mark_not_eliminable): Add parameter.  Update curr_sp_change.
         Don't prevent elimination to sp if we can calculate its change.
         Pass the argument to mark_not_eliminable calls.
         (eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
         argument to lra_eliminate_regs_1 call.
         (update_reg_eliminate): Move calculation of hard regs for spill
         lower.  Switch off lra_in_progress temporarily to generate regs
         involved into elimination.
         (lra_init_elimination): Rename to init_elimination.  Make it
         static.  Set up insn sp offset, check the offsets at the end of
         BBs.
         (process_insn_for_elimination): Add parameter.  Pass its value to
         eliminate_regs_in_insn.
         (lra_eliminate): : Add parameter.  Pass its value to
         process_insn_for_elimination.  Add assert for param values
         combination.  Call init_elimination.  Don't update offsets in
         equivalence substitutions.
         * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
         for created stack slot.
         (remove_pseudos): Call lra_eliminate_regs_1 before changing memory
         onto stack slot.


Hi Vladimir,

Thanks for your hard work.   I noticed a few regressions
on x86-64:

FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 z == 8
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions  line 21 i == v + 1
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
-funroll-loops  line 21 i == v + 1
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)


Well, I guess it can happen when you don't use frame-pointer anymore.

Besides guality failures, there is

[hjl@gnu-6 gcc]$ make check-gcc
RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}'
i386.exp=pr9771-1.c"
make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
test -d plugin || mkdir plugin
test -d testsuite || mkdir testsuite
test -d testsuite/gcc || mkdir testsuite/gcc
(rootme=`${PWDCMD-pwd}`; export rootme; \
srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \
cd testsuite/gcc; \
rm -f tmp-site.exp; \
sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \
     < ../../site.exp > tmp-site.exp; \
/bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \
EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo
${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ;
\
if [ -f ${rootme}/../expect/expect ] ; then  \
    TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \
     export TCL_LIBRARY ; fi ; \
runtestflags= ; \
if [ -n "" ] ; then \
   runtestflags=""; \
elif [ -n "" ] ; then \
   parts="`echo '  ' \
       | sed 's/=[^ ]* / /g'`"; \
   for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \
     part=`basename $part` ; \
     case " $parts $runtestflags " in \
       *" $part "*) ;; \
       *) runtestflags="$runtestflags $part" ;; \
     esac ; \
   done ; \
fi ; \
`if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo
${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc
--target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c
$runtestflags)
WARNING: Couldn't find the global config file.
Test Run By hjl on Thu Nov 28 16:50:05 2013
Native configuration is x86_64-unknown-linux-gnu

         === gcc tests ===

Schedule of variations:
     unix/-m32 -march=corei7

Running target unix/-m32 -march=corei7
Using /usr/share/dejagnu/baseboards/unix.exp as board description file
for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ...
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
FAIL: gcc.target/i386/pr9771-1.c (test for excess errors)


May be it is wrong to use this test with this tuning as reload also fails on this test when -march=corei7 is used. Or most probably the problem is somewhere else. The patch just triggered it.


         === gcc Summary ===

# of expected passes        10
# of unexpected failures    2
# of unresolved testcases    1

"-m32 -march=corei7" triggers the ICE.




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]