This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Mostly rewrite genrecog


On Mon, Apr 27, 2015 at 6:20 PM, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> I think it's been the case for a while that parallel builds of GCC tend
> to serialise around the compilation of insn-recog.c, especially with
> higher --enable-checking settings.  This patch tries to speed that
> up by replacing most of genrecog with a new algorithm.
>
> I think the main problems with the current code are:
>
> 1. Vector architectures have added lots of new instructions that have
>    a similar shape and differ only in mode, code or unspec number.
>    The current algorithm doesn't have any way of factoring out those
>    similarities.
>
> 2. When matching a particular instruction, the current code examines
>    everything about a SET_DEST before moving on to the SET_SRC.  This has
>    two subproblems:
>
>    2a. The destination of a SET isn't very distinctive.  It's usually
>        just a register_operand, a memory_operand, a nonimmediate_operand
>        or a flags register.  We therefore tend to backtrack to the
>        SET_DEST a lot, oscillating between groups of instructions with
>        the same kind of destination.
>
>    2b. Backtracking through predicate checks is relatively expensive.
>        It would be good to narrow down the "shape" of the instruction
>        first and only then check the predicates.  (The backtracking is
>        expensive in terms of insn-recog.o compile time too, both because
>        we need to copy into argument registers and out of the result
>        register, and because it adds more sites where spills are needed.)
>
> 3. The code keeps one local variable per rtx depth, so it ends up
>    loading the same rtx many times over (mostly when backtracking).
>    This is very expensive in rtl-checking builds because each XEXP
>    includes a code check and a line-specific failure call.
>
>    In principle the idea of having one local variable per depth
>    is good.  But it was originally written that way when all optimisations
>    were done at the rtl level and I imagine each local variable mapped
>    to one pseudo register.  These days the statements that reload the
>    value needed on backtracking lead to many more SSA names and phi
>    statements than you'd get with just a single variable per position
>    (loaded once, so naturally SSA).  There is still the potential benefit
>    of avoiding having sibling rtxes live at once, but fixing (2) above
>    reduces that problem.
>
> Also, the code is all goto-based, which makes it rather hard to step through.
>
> The patch deals with these as follows:
>
> 1. Detect subpatterns that differ only by mode, code and/or integer
>    (e.g. unspec number) and split them out into a common routine.
>
> 2. Match the "shape" of the instruction first, in terms of codes,
>    integers and vector lengths, and only then check the modes, predicates
>    and dups.  When checking the shape, handle SET_SRCs before SET_DESTs.
>    In practice this seems to greatly reduce the amount of backtracking.
>
> 3. Have one local variable per rtx position.  I tested the patch with
>    and without the change and it helped a lot with rtl-checking builds
>    without seeming to affect release builds much either way.
>
> As far as debuggability goes, the new code avoids gotos and just
> uses "natural" control flow.
>
> The headline stat is that a stage 3 --enable-checking=yes,rtl,df
> build of insn-recog.c on my box goes from 7m43s to 2m2s (using the
> same stage 2 compiler).  The corresponding --enable-checking=release
> change is from 49s to 24s (less impressive, as expected).
>
> The patch seems to speed up recog.  E.g. the time taken to build
> fold-const.ii goes from 6.74s before the patch to 6.69s after it;
> not a big speed-up, but reproducible.
>
> Here's a comparison of the number of lines of code in insn-recog.c
> before and after the patch on one target per config/ CPU:
>
> aarch64-linux-gnueabi                           115526    38169 :   33.04%
> alpha-linux-gnu                                  24479    10740 :   43.87%
> arm-linux-gnueabi                               169208    67759 :   40.04%
> avr-rtems                                        55647    22127 :   39.76%
> bfin-elf                                         13928     6498 :   46.65%
> c6x-elf                                          29928    13324 :   44.52%
> cr16-elf                                          2650     1419 :   53.55%
> cris-elf                                         18669     7257 :   38.87%
> epiphany-elf                                     19308     6131 :   31.75%
> fr30-elf                                          2204     1112 :   50.45%
> frv-linux-gnu                                    13541     5950 :   43.94%
> h8300-elf                                        19584     9327 :   47.63%
> hppa64-hp-hpux11.23                              18299     8549 :   46.72%
> ia64-linux-gnu                                   37629    17101 :   45.45%
> iq2000-elf                                        2752     1609 :   58.47%
> lm32-elf                                          1536      872 :   56.77%
> m32c-elf                                         10040     4145 :   41.28%
> m32r-elf                                          4436     2307 :   52.01%
> m68k-linux-gnu                                   15739     8147 :   51.76%
> mcore-elf                                         4816     2577 :   53.51%
> mep-elf                                          67335    15929 :   23.66%
> microblaze-elf                                    2656     1587 :   59.75%
> mips-linux-gnu                                   54543    24186 :   44.34%
> mmix                                              2597     1487 :   57.26%
> mn10300-elf                                       6384     3294 :   51.60%
> moxie-elf                                         1311      659 :   50.27%
> msp430-elf                                        6054     2382 :   39.35%
> nds32le-elf                                       5953     3152 :   52.95%
> nios2-linux-gnu                                   3735     2143 :   57.38%
> pdp11                                             2137     1157 :   54.14%
> powerpc-eabispe                                 109322    40582 :   37.12%
> powerpc-linux-gnu                                82976    32192 :   38.80%
> rl78-elf                                          5321     2432 :   45.71%
> rx-elf                                           14454     7534 :   52.12%
> s390-linux-gnu                                   48487    20454 :   42.18%
> sh-linux-gnu                                    104087    41820 :   40.18%
> sparc-linux-gnu                                  21912    10509 :   47.96%
> spu-elf                                          19937     8182 :   41.04%
> tilegx-elf                                       15412     6970 :   45.22%
> tilepro-elf                                      11998     5479 :   45.67%
> v850-elf                                          8725     4438 :   50.87%
> vax-netbsdelf                                     4537     2410 :   53.12%
> visium-elf                                       15190     7224 :   47.56%
> x86_64-darwin                                   346396   119593 :   34.52%
> xstormy16-elf                                     4660     2229 :   47.83%
> xtensa-elf                                        2799     1514 :   54.09%
>
> Here's the loadable size of insn-recog.o in an --enable-checking=release
> build on an x86_64-linux-gnu box:
>
> aarch64-linux-gnueabi                           443955   298026 :   67.13%
> alpha-linux-gnu                                  97194    80893 :   83.23%
> arm-linux-gnueabi                               782325   632248 :   80.82%
> avr-rtems                                       226827   159763 :   70.43%
> bfin-elf                                         52563    42376 :   80.62%
> c6x-elf                                         112512    90142 :   80.12%
> cr16-elf                                         10446    10006 :   95.79%
> cris-elf                                         74771    52634 :   70.39%
> epiphany-elf                                     87577    52284 :   59.70%
> fr30-elf                                          8041     7713 :   95.92%
> frv-linux-gnu                                    53699    47543 :   88.54%
> h8300-elf                                        70708    66274 :   93.73%
> hppa64-hp-hpux11.23                              71597    57484 :   80.29%
> ia64-linux-gnu                                  147286   130632 :   88.69%
> iq2000-elf                                       11002    11774 :  107.02%
> lm32-elf                                          5894     5798 :   98.37%
> m32c-elf                                         36563    28855 :   78.92%
> m32r-elf                                         17252    16910 :   98.02%
> m68k-linux-gnu                                   58248    59781 :  102.63%
> mcore-elf                                        17708    18948 :  107.00%
> mep-elf                                         314466   150771 :   47.95%
> microblaze-elf                                   10257    10534 :  102.70%
> mips-linux-gnu                                  230991   191155 :   82.75%
> mmix                                             10782    10678 :   99.04%
> mn10300-elf                                      24035    22802 :   94.87%
> moxie-elf                                         4622     4198 :   90.83%
> msp430-elf                                       21707    16539 :   76.19%
> nds32le-elf                                      22041    19444 :   88.22%
> nios2-linux-gnu                                  15595    13238 :   84.89%
> pdp11                                             7630     8254 :  108.18%
> powerpc-eabispe                                 430816   308481 :   71.60%
> powerpc-linux-gnu                               317738   248534 :   78.22%
> rl78-elf                                         18904    16329 :   86.38%
> rx-elf                                           55015    56632 :  102.94%
> s390-linux-gnu                                  190584   148961 :   78.16%
> sh-linux-gnu                                    408446   307927 :   75.39%
> sparc-linux-gnu                                  91016    80640 :   88.60%
> spu-elf                                          80387    69151 :   86.02%
> tilegx-elf                                       63815    49977 :   78.32%
> tilepro-elf                                      51763    39252 :   75.83%
> v850-elf                                         32812    28462 :   86.74%
> vax-netbsdelf                                    18350    18259 :   99.50%
> visium-elf                                       56872    46790 :   82.27%
> x86_64-darwin                                  1306182   883169 :   67.61%
> xstormy16-elf                                    17044    14430 :   84.66%
> xtensa-elf                                       10780     9678 :   89.78%
>
> The same for --enable-checking=yes,rtl:
>
> aarch64-linux-gnueabi                          1790272   507488 :   28.35%
> alpha-linux-gnu                                 440058   193826 :   44.05%
> arm-linux-gnueabi                              2845568  1299337 :   45.66%
> avr-rtems                                       885672   294387 :   33.24%
> bfin-elf                                        280606   142836 :   50.90%
> c6x-elf                                         486345   259256 :   53.31%
> cr16-elf                                         46626    20044 :   42.99%
> cris-elf                                        426813   144414 :   33.84%
> epiphany-elf                                    353078   122166 :   34.60%
> fr30-elf                                         40414    21042 :   52.07%
> frv-linux-gnu                                   259550   111335 :   42.90%
> h8300-elf                                       355199   158411 :   44.60%
> hppa64-hp-hpux11.23                             340584   149009 :   43.75%
> ia64-linux-gnu                                  661364   293710 :   44.41%
> iq2000-elf                                       41123    26709 :   64.95%
> lm32-elf                                         20370    14781 :   72.56%
> m32c-elf                                        174344    62000 :   35.56%
> m32r-elf                                         74357    41773 :   56.18%
> m68k-linux-gnu                                  275733   117445 :   42.59%
> mcore-elf                                        85180    48018 :   56.37%
> mep-elf                                        1450168   376020 :   25.93%
> microblaze-elf                                   44189    26295 :   59.51%
> mips-linux-gnu                                  876650   375753 :   42.86%
> mmix                                             49882    25363 :   50.85%
> mn10300-elf                                     128148    66768 :   52.10%
> moxie-elf                                        23388     9011 :   38.53%
> msp430-elf                                      114200    34426 :   30.15%
> nds32le-elf                                     101416    73677 :   72.65%
> nios2-linux-gnu                                  58799    29825 :   50.72%
> pdp11                                            32836    18557 :   56.51%
> powerpc-eabispe                                1976098   626942 :   31.73%
> powerpc-linux-gnu                              1510652   526841 :   34.88%
> rl78-elf                                         93675    40538 :   43.28%
> rx-elf                                          279748   137284 :   49.07%
> s390-linux-gnu                                  857009   316494 :   36.93%
> sh-linux-gnu                                   2154337   806571 :   37.44%
> sparc-linux-gnu                                 367682   169019 :   45.97%
> spu-elf                                         341945   135629 :   39.66%
> tilegx-elf                                      235480   112103 :   47.61%
> tilepro-elf                                     246231   104137 :   42.29%
> v850-elf                                        158028    72875 :   46.12%
> vax-netbsdelf                                    85057    37578 :   44.18%
> visium-elf                                      257148   103331 :   40.18%
> x86_64-darwin                                  5514235  1721777 :   31.22%
> xstormy16-elf                                    83456    46128 :   55.27%
> xtensa-elf                                       52652    29880 :   56.75%
>
> Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-none-eabi.
> Also tested by building the testsuite for each of the targets above
> and making sure there were no assembly differences.  Made sure that no
> objects in spec2k6 changed for aarch64-linux-gnu (except for perlbench
> perl.o and cactusADM datestamp.o, where the differences are timestamps).
> OK to install?
>
> Thanks,
> Richard
>
> PS. I've attached the new genrecog.c since the diff version is unreadable.
>
>
> gcc/
>         * Makefile.in (build/genrecog.o): Depend on inchash.h.
>         (build/genrecog$(build_exeext): Depend on build/hash-table.o and
>         build/inchash.o
>         * genrecog.c: Rewrite most of the code except for the third page.
>
> Index: gcc/Makefile.in
> ===================================================================
> --- gcc/Makefile.in     2015-04-27 10:42:57.783191573 +0100
> +++ gcc/Makefile.in     2015-04-27 10:43:42.878643078 +0100
> @@ -2527,7 +2527,8 @@ build/genpeep.o : genpeep.c $(RTL_BASE_H
>  build/genpreds.o : genpreds.c $(RTL_BASE_H) $(BCONFIG_H) $(SYSTEM_H)   \
>    coretypes.h $(GTM_H) errors.h $(READ_MD_H) gensupport.h $(OBSTACK_H)
>  build/genrecog.o : genrecog.c $(RTL_BASE_H) $(BCONFIG_H) $(SYSTEM_H)   \
> -  coretypes.h $(GTM_H) errors.h $(READ_MD_H) gensupport.h
> +  coretypes.h $(GTM_H) errors.h $(READ_MD_H) gensupport.h              \
> +  $(HASH_TABLE_H) inchash.h
>  build/genhooks.o : genhooks.c $(TARGET_DEF) $(C_TARGET_DEF)            \
>    $(COMMON_TARGET_DEF) $(BCONFIG_H) $(SYSTEM_H) errors.h
>  build/genmddump.o : genmddump.c $(RTL_BASE_H) $(BCONFIG_H) $(SYSTEM_H) \
> @@ -2559,6 +2560,8 @@ genprog = $(genprogerr) check checksum c
>  # These programs need libs over and above what they get from the above list.
>  build/genautomata$(build_exeext) : BUILD_LIBS += -lm
>
> +build/genrecog$(build_exeext) : build/hash-table.o build/inchash.o
> +
>  # For stage1 and when cross-compiling use the build libcpp which is
>  # built with NLS disabled.  For stage2+ use the host library and
>  # its dependencies.
>
Hi Richard,
I noticed that this patch caused ICE for gcc.target/arm/mmx-2.c on
arm-none-linux-gnueabi.  Could you please have a look at it?

The log message is as below,
/projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c: In function 'foo':
/projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:166:1:
error: unrecognizable insn:
(insn 541 540 542 2 (set (reg:V4HI 512 [ D.4809 ])
        (vec_merge:V4HI (vec_select:V4HI (reg:V4HI 510 [ D.4806 ])
                (parallel [
                        (const_int 2 [0x2])
                        (const_int 0 [0])
                        (const_int 3 [0x3])
                        (const_int 1 [0x1])
                    ]))
            (vec_select:V4HI (reg:V4HI 511 [ D.4806 ])
                (parallel [
                        (const_int 0 [0])
                        (const_int 2 [0x2])
                        (const_int 1 [0x1])
                        (const_int 3 [0x3])
                    ]))
            (const_int 5 [0x5])))
/projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:159 -1
     (nil))
/projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:166:1:
internal compiler error: in extract_insn, at recog.c:2341
0xa42d2a _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
    /projects/.../src/gcc/gcc/rtl-error.c:110
0xa42d59 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
    /projects/.../src/gcc/gcc/rtl-error.c:118
0xa15ff7 extract_insn(rtx_insn*)
    /projects/.../src/gcc/gcc/recog.c:2341
0x7ffb42 instantiate_virtual_regs_in_insn
    /projects/.../src/gcc/gcc/function.c:1598
0x7ffb42 instantiate_virtual_regs
    /projects/.../src/gcc/gcc/function.c:1966
0x7ffb42 execute
    /projects/.../src/gcc/gcc/function.c:2015
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

GCC is configured with

gcc/configure --target=arm-none-linux-gnueabi --prefix=
--with-sysroot=... --enable-shared --disable-libsanitizer
--disable-libssp --disable-libmudflap
--with-plugin-ld=arm-none-linux-gnueabi-ld --enable-checking=yes
--enable-languages=c,c++,fortran --with-gmp=... --with-mpfr=...
--with-mpc=... --with-isl=... --with-cloog=... --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a

Thanks,
bin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]