This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC, ARM] later split of symbol_refs


On 27/06/12 15:58, Dmitry Melnik wrote:
> Hi,
> 
> We'd like to draw attention to CodeSourcery's patch for the ARM backend,
> from which GCC mainline can gain 4% on SPEC2K INT:
> http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch
> (the patch is also attached).
> 
> Originally, we noticed that GNU Go runs 6% faster on cortex-a8 with
> -fno-gcse.  After profiling, we found that the slowdown is most likely
> caused by cache misses when accessing global variables.  GCC generates ldr
> instructions for them, which can be avoided by emitting a movt/movw
> pair in such cases.  The RTL expressions for these instructions are high
> and lo_sum.  Currently, symbol_ref is expanded as high and lo_sum, but
> cprop1 then decides that this is redundant and merges them into one load insn.
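
For readers unfamiliar with the movt/movw pair mentioned above, here is a minimal sketch of how the two instructions build a 32-bit address from two 16-bit immediates, so no load from memory is needed. It is a plain Python model of the register arithmetic only; the function names and the example address are hypothetical and are not taken from the patch or from GCC.

```python
MASK16 = 0xFFFF


def movw(imm16):
    """Model MOVW: write a 16-bit immediate and zero the upper half."""
    return imm16 & MASK16


def movt(reg, imm16):
    """Model MOVT: replace the upper 16 bits, keep the lower 16 bits."""
    return (reg & MASK16) | ((imm16 & MASK16) << 16)


def materialize(addr):
    """Build a 32-bit address the way a movw/movt pair would."""
    reg = movw(addr & MASK16)                 # lower 16 bits of the symbol
    return movt(reg, (addr >> 16) & MASK16)   # upper 16 bits of the symbol


if __name__ == "__main__":
    addr = 0x2001ABCD  # hypothetical address of a global variable
    print(hex(materialize(addr)))
```

Both halves are encoded in the instruction stream, which is why the pair avoids the data-side memory access that an ldr of the address would make.
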
> 
> The problem was also found by the Linaro community:
> https://bugs.launchpad.net/gcc-linaro/+bug/886124 .
> There is also a patch from CodeSourcery (attached), which was ported to
> Linaro GCC 4.5 but is missing from later Linaro releases.
> This patch splits symbol_refs at a later stage (after cprop)
> instead of generating movt/movw at expand.
> 
> It fixed our test case on GNU Go.  We also tested it on SPEC2K INT (ref)
> with a GCC 4.8 snapshot from May 12, 2012 on cortex-a9 with -O2 and -mthumb:
> 
>              Base      Base      Base      Peak      Peak      Peak
> Benchmarks   Ref Time  Run Time  Ratio     Ref Time  Run Time  Ratio    Change
> -----------  --------  --------  --------  --------  --------  ------  -------
> 164.gzip     1400      492       284       1400      497       282      -0.70%
> 175.vpr      1400      433       323       1400      458       306      -5.26%
> 176.gcc      1100      203       542       1100      198       557       2.77%
> 181.mcf      1800      529       340       1800      528       341       0.29%
> 186.crafty   1000      261       383       1000      256       391       2.09%
> 197.parser   1800      709       254       1800      701       257       1.18%
> 252.eon      1300      219       594       1300      202       644       8.42%
> 253.perlbmk  1800      389       463       1800      367       490       5.83%
> 254.gap      1100      259       425       1100      236       467       9.88%
> 255.vortex   1900      498       382       1900      442       430      12.57%
> 256.bzip2    1500      452       332       1500      424       354       6.63%
> 300.twolf    3000      916       328       3000      853       352       7.32%
> SPECint_base2000                 376
> SPECint2000                                                    391       3.99%
> 
> 
> SPEC2K INT improves by 4% (up to 12.6% on vortex; the vpr slowdown is
> likely due to high variance on this test).
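
As a rough cross-check of the figures above, the sketch below recomputes the per-benchmark change and the overall scores from the ratios in the table. It assumes the usual SPEC CPU2000 convention that the overall score is the geometric mean of the per-benchmark ratios; the helper names are made up for illustration. Because the table lists rounded ratios, the recomputed means agree with the reported 376 and 391 only to within about one point.

```python
import math


def percent_change(base, peak):
    """Relative change of the peak ratio over the base ratio, in percent."""
    return 100.0 * (peak / base - 1.0)


def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratios copied from the table, in benchmark order (gzip ... twolf).
BASE = [284, 323, 542, 340, 383, 254, 594, 463, 425, 382, 332, 328]
PEAK = [282, 306, 557, 341, 391, 257, 644, 490, 467, 430, 354, 352]

if __name__ == "__main__":
    for base, peak in zip(BASE, PEAK):
        print(f"{base:4d} -> {peak:4d}  {percent_change(base, peak):+6.2f}%")
    print(f"base geomean ~ {geometric_mean(BASE):.1f}")
    print(f"peak geomean ~ {geometric_mean(PEAK):.1f}")
```
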
> 
> Similarly, there are gains of 3-4% without -mthumb on cortex-a9 and on 
> cortex-a8 (thumb2 and ARM modes).
> 
> This patch applies to current trunk and passes regtest
> on qemu-arm.
> Would it be good to have it in trunk?
> If everybody agrees, we can take care of committing it.
> 
> --
> Best regards,
>    Dmitry
> 
> 
> gcc-4.5-linaro-r99369.patch
> 

Please update the ChangeLog entry (it's not appropriate to mention
Sourcery G++) and add a comment as Steven has suggested.

Otherwise OK.

R.

