This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
This is a new pass that tries to replace structure references with scalars so that they can be exposed to the scalar optimizers. I believe that Muchnick or Morgan explain it in more detail.

The pass should address several PRs, including 12747, 12853, 12825, 6883, 6880 and 7061. The patch still doesn't fix *all* of them because some other cleanups are necessary that would've obfuscated this patch.

As an example, given:

    bar ()
    {
      struct Complex_i * factorpointer;
      struct Complex_i factor;

      factor.re = 123;;
      factor.im = 428;;
      Zi = factor;;
    }

SRA converts the above into:

    bar ()
    {
      int SR.2;
      int SR.1;
      struct Complex_i factor;
      struct Complex_i * factorpointer;

      SR.1 = 123;
      SR.2 = 428;
      Zi.re = SR.1;
      Zi.im = SR.2;
    }

So, now the scalar optimizers can propagate '123' and '428' into 'Zi' and also completely remove references to 'factor'.

For PR 12747, we are still missing a few scalarizations because there is a backend quirk related to removing the TREE_ADDRESSABLE bit from structures, which blocks SRA from scalarizing some structures. In this case, we have this input code:

    void copy (BitVector & DEST, BitVector & SRC, unsigned I)
    {
      DEST[I] = SRC[I];
    }

which after scalarization is optimized into

    void copy(BitVector&, BitVector&, unsigned int) (DEST, SRC, I)
    {
      <D1565>.m_bv = SRC;
      <D1565>.m_idx = I;
      <D1561>.m_bv = <D1565>.m_bv;
      <D1561>.m_idx = I;
      <D1572>.m_bv = DEST;
      <D1572>.m_idx = I;
      <D1560>.m_bv = <D1572>.m_bv;
      <D1560>.m_idx = I;
      T.3 = getBit (<D1561>.m_bv, I);
      <D1587> = (int)T.3;
      setBit (<D1560>.m_bv, <D1560>.m_idx, (int)(bool)<D1587>);
    }

After I get TREE_ADDRESSABLE fixed for the remaining structures, we will get:

    void copy(BitVector&, BitVector&, unsigned int) (DEST, SRC, I)
    {
      T.3 = getBit (SRC, I);
      <D1587> = (int)T.3;
      setBit (DEST, I, (int)(bool)<D1587>);
    }

which is pretty good.

I've done some timings and SRA adds about 0.13 seconds to the cc1/cc1plus components.
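To make the first example above concrete, here is a hand-written C sketch of the same transformation. It is only an illustration, not output from the pass: the definition of struct Complex_i is an assumption (the dump does not show it), sr_1 and sr_2 stand in for the compiler temporaries SR.1 and SR.2, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed layout: the dump only shows the fields 're' and 'im'. */
struct Complex_i { int re; int im; };

/* Global destination, as in the example. */
struct Complex_i Zi;

/* Before: the aggregate 'factor' is filled in and copied as a whole. */
void bar_original(void)
{
    struct Complex_i factor;
    factor.re = 123;
    factor.im = 428;
    Zi = factor;
}

/* After: each field lives in its own scalar (sr_1, sr_2 are stand-ins
   for SR.1 and SR.2), and the copy becomes field-by-field stores. */
void bar_scalarized(void)
{
    int sr_1 = 123;
    int sr_2 = 428;
    Zi.re = sr_1;
    Zi.im = sr_2;
}

/* Hypothetical helper: checks that both versions store the same values. */
void check_bar_versions(void)
{
    struct Complex_i a, b;

    Zi.re = Zi.im = 0;
    bar_original();
    a = Zi;

    Zi.re = Zi.im = 0;
    bar_scalarized();
    b = Zi;

    assert(a.re == b.re && a.im == b.im);
}
```

In the second version nothing refers to an aggregate local any more, so ordinary constant propagation can turn the stores into Zi.re = 123 and Zi.im = 428 directly.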
In SPEC it's either neutral or it improves things a little:

                    No SRA    SRA
    164.gzip         676      690     +2%
    175.vpr          415      414     -0.24%
    181.mcf          412      412      0%
    186.crafty       614      615      0%
    197.parser       553      560     +1.2%
    253.perlbmk      787      811     +3%
    254.gap          726      726      0%
    255.vortex       856      855     -0.1%
    256.bzip2        529      531      0%
    300.twolf        532      531      0%

    Est. SPECint_base2000     593
    Est. SPECint2000          596

I expect SRA to reduce virtual operands somewhat, but it creates more scalar assignments, so it should balance out. Something tells me register pressure may be a problem, but I haven't noticed anything too nasty.

In any case, this is not the complete fix: there are structures that could be scalarized that currently aren't. I'd like folks interested in this to check it out with their favourite programs and let me know how things go.

Bootstrapped and tested on alpha, x86, amd64 and ia64.

Diego.
Attachment:
20031120-sra.diff.gz
Description: GNU Zip compressed data