This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
g++ optimizer and numerical computations
- To: Marc Espie <Marc dot Espie at liafa dot jussieu dot fr>
- Subject: g++ optimizer and numerical computations
- From: Gabriel Dos Reis <Gabriel dot Dos-Reis at dptmaths dot ens-cachan dot fr>
- Date: Tue, 17 Feb 1998 02:37:49 +0100 (MET)
- Cc: egcs at cygnus dot com
- References: <19980216180601.14502@liafa1.liafa.jussieu.fr>
>>>>> «Marc», Marc Espie <Marc.Espie@liafa.jussieu.fr> wrote:
Marc> I'm trying to design a refcnted vector type for numerical computations,
Marc> and benching that on the 980205 egcs snapshot with haifa-enabled on an
Marc> alpha. If I'm not mistaken in my analysis, the results make me less
Marc> than happy.
[code deleted]
Marc> The SIZE (0) ensures the compiler won't just unroll the whole loop.
EGCS won't unroll loops unless you pass the -funroll-loops option.
Marc> The two interesting points are labelled (1) and (2).
Marc> (1) is the vector copy, which just does a cnt++ on the InVector value.
Marc> (2) implements copy on write semantics: w[i] is used as an lvalue, hence
Marc> the int &Vector::operator [](size_t i) must be selected, and if cnt > 1,
Marc> the copy does occur.
Marc> All operators are defined inline to make it possible for the compiler to
Marc> notice that modifications to cnt don't occur elsewhere.
Marc> With that code, g++ -O9 spouts the following assembler output, where
Marc> the code we want occurs between $L515 and $L516.
Marc> - the refcnt test occurs inside the loop. It has not been moved outside
Marc> the loop by the compiler.
I compiled your code with the SunPro Workshop compiler on a
SPARCStation-20 with -O4. The output was worse than EGCS's: each call to
Vector::operator[] () results in an actual function call in which the
refcnt is tested, and that call sits inside the loop.
Marc> - the exceptional case (refcnt != 1) is inside the loop, and the normal
Marc> case (refcnt == 1) is coded as a forward conditional branch, predicted
Marc> to fall through (alpha architecture handbook).
Marc> Did I miss something, or is the optimization truly as bad as it looks ?
It is not as bad as you're saying.
Marc> Any optimization options I missed ?
This problem is partly due to the data being accessed through a pointer,
which implies possible aliasing...
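To make the aliasing point concrete, here is a minimal sketch. Since the
original code is not quoted here, the Rep struct and the functions
make_rep, clone, release, at, fill_checked and fill_hoisted are all
hypothetical names of mine. The count is an int and the elements are
reached through an int*, so as far as the compiler can tell a store to an
element might change the count, and it may feel obliged to repeat the
test. Doing the test once by hand and keeping the pointer and size in
locals sidesteps the question. Whether a given compiler hoists the test
on its own is something I would check in the assembler output rather
than assume.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical representation: the original Vector code was not quoted.
struct Rep {
    int cnt;           // reference count
    std::size_t size;  // number of elements
    int* data;         // element storage
};

Rep* make_rep(std::size_t n) {
    return new Rep{1, n, new int[n]()};  // zero-initialised, one owner
}

// Detach from a shared representation: returns a private copy.
Rep* clone(Rep* r) {
    Rep* c = make_rep(r->size);
    for (std::size_t i = 0; i < r->size; ++i) c->data[i] = r->data[i];
    --r->cnt;
    return c;
}

void release(Rep* r) {
    if (--r->cnt == 0) { delete[] r->data; delete r; }
}

// Copy-on-write element access: tests the count on every call.
inline int& at(Rep*& r, std::size_t i) {
    assert(i < r->size);
    if (r->cnt > 1) r = clone(r);
    return r->data[i];
}

// Naive loop: each store goes through an int*, which could point at
// r->cnt as far as the compiler knows, so the test may be repeated.
void fill_checked(Rep*& r, int value) {
    for (std::size_t i = 0; i < r->size; ++i) at(r, i) = value;
}

// Hand-hoisted loop: one test, then only locals inside the loop.
void fill_hoisted(Rep*& r, int value) {
    if (r->cnt > 1) r = clone(r);
    int* p = r->data;
    const std::size_t n = r->size;
    for (std::size_t i = 0; i < n; ++i) p[i] = value;
}
```

Both loops leave the data in the same state; the difference, if any,
is only in the code generated for the loop body.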
Marc> Any way to code the loop so that it will run faster ?
If you want to stick to reference counting, then the best way is probably
to implement reference semantics rather than the value semantics you are
using now, since each call to Vector::operator[] () results in a refcnt
test. Then provide a function that makes a hard copy when a true copy is
needed. This approach is taken in blitz.
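A minimal sketch of what I mean by reference semantics follows. The class
name RefVector and its members copy() and shares_with() are hypothetical,
and I use std::shared_ptr to hold the count purely for brevity; this is
not how blitz itself is written. Copying a RefVector shares the storage,
operator[] does no test at all, and a true copy has to be requested
explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical reference-semantics vector: copies share the same storage.
class RefVector {
public:
    explicit RefVector(std::size_t n)
        : rep_(std::make_shared<std::vector<int> >(n)) {}

    std::size_t size() const { return rep_->size(); }

    // No reference-count test here: writes are visible to every sharer.
    int& operator[](std::size_t i) {
        assert(i < rep_->size());
        return (*rep_)[i];
    }
    int operator[](std::size_t i) const {
        assert(i < rep_->size());
        return (*rep_)[i];
    }

    // Explicit hard copy: the result owns fresh, unshared storage.
    RefVector copy() const {
        RefVector r(0);
        r.rep_ = std::make_shared<std::vector<int> >(*rep_);
        return r;
    }

    // True when both objects refer to the same storage.
    bool shares_with(const RefVector& other) const {
        return rep_ == other.rep_;
    }

private:
    std::shared_ptr<std::vector<int> > rep_;
};
```

The price is that "RefVector w = v; w[0] = 5;" also changes v[0]; anyone
who wants an independent vector must write v.copy().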
Another way is to take an expression template based approach. Take a
look at
http://www.cmla.ens-cachan.fr/~dosreis/C++/Papers/sci_computing.ps
to get an idea...
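For a rough feel of the expression template idea, here is a minimal
sketch; it is not the code from the paper, and the names Expr, Sum and
Vec are hypothetical. operator+ builds a lightweight Sum object instead
of a temporary vector, and the assignment operator evaluates the whole
expression element by element in a single loop, with no reference count
involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base used only to tag types that are vector expressions.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Node representing l + r; nothing is computed until operator[] is called.
template <class L, class R>
class Sum : public Expr<Sum<L, R> > {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    int operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
private:
    const L& l_;  // operands live until the end of the full expression
    const R& r_;
};

class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, int v = 0) : data_(n, v) {}
    int operator[](std::size_t i) const { return data_[i]; }
    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Evaluates any expression in one pass, without temporaries.
    template <class E>
    Vec& operator=(const Expr<E>& e) {
        const E& x = e.self();
        assert(x.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = x[i];
        return *this;
    }
private:
    std::vector<int> data_;
};

// Builds the expression node; only types derived from Expr match.
template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With this, "a = b + c + d;" becomes one loop computing b[i] + c[i] + d[i]
for each i. The Sum nodes hold references, so they should be consumed
within the same statement rather than stored.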
-- Gaby