This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: surprising optimization of comparison operations for __int128_t


On 07/09/2010 09:11 AM, Mathieu Lacage wrote:

> The attached C++ testcase compares the performance behavior of
> __int128_t used directly vs __int128_t used through an overloaded
> operator <. The overloaded < operator appears faster than the raw
> __int128_t which I find really surprising so, I fear I am not
> measuring what I think I am measuring. Hints ?
> 
> [mathieu@mathieu-laptop benchmark-time]$ g++ --version
> g++ (GCC) 4.4.3 20100127 (Red Hat 4.4.3-4)
> [mathieu@mathieu-laptop benchmark-time]$ g++ -O3 test.cc
> # run raw __int128_t version
> [mathieu@mathieu-laptop benchmark-time]$ time -p ./a.out 100000002 a
> 16384
> 2
> real 0.60
> user 0.60
> sys 0.00
> # run operator < version
> [mathieu@mathieu-laptop benchmark-time]$ time -p ./a.out 100000002 test
> 16384
> 2
> real 0.40
> user 0.40
> sys 0.00

g++ seems to be generating a specialization of run_cmp() in the
__int128_t case, with the parameters a and b fixed at a=1 and b=2, in
an attempt to do some constant propagation.  This ought to help, but
unfortunately the back-end generates worse code for the specialized
case.

This isn't uncommon in optimizing compilers: you do something that
usually improves code quality, but occasionally makes things worse.
If you compile with -fdump-tree-optimized you'll see what is
happening.
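
For readers without the attachment, here is a rough sketch of the kind of
code being discussed. It is not the original testcase: the Wrapped struct,
the template form of run_cmp(), and the loop body are all assumptions made
for illustration. Only the name run_cmp(), the parameters a and b, and the
__int128_t type come from the thread.

```cpp
#include <cstdint>

// Hypothetical wrapper type (not from the original attachment): it holds
// an __int128_t and compares it through an overloaded operator<.
struct Wrapped {
    __int128_t v;
};

inline bool operator<(const Wrapped &a, const Wrapped &b) {
    return a.v < b.v;
}

// Hypothetical stand-in for run_cmp(): counts how many of n iterations
// find a < b. noinline keeps it a separate function, so any constant
// propagation into it has to happen by specializing the function itself.
template <typename T>
__attribute__((noinline))
std::uint64_t run_cmp(T a, T b, std::uint64_t n) {
    std::uint64_t hits = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        if (a < b) {
            ++hits;
        }
    }
    return hits;
}
```

Calling run_cmp<__int128_t>(1, 2, n) exercises the raw comparison, and
run_cmp(Wrapped{1}, Wrapped{2}, n) exercises the overloaded one. Building
a program around this with g++ -O3 -fdump-tree-optimized should write a
dump file next to the source in which any specialized copy of run_cmp()
is visible. Whether a given GCC version creates such a copy for this
sketch is not guaranteed.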

Andrew.

