This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: std::max/min optimization



Nathan Myers wrote:
> I have been experimenting with code sequences for computing max() and 
> min() on integer types.  What I've found is that an implementation like 
> 
>   int max(int a, int b) 
>     { int i = -(a > b); return (a & i)|(b & ~i); }
> 
> is about three times faster, on P3 (and probably moreso on P4) than
> the naive implementation as found in our library:

Your argumentation is flawed.  You compare apples and oranges.

You argue that, because of the architecture of the i686+ processors,
plain old i386 code is slow and your convoluted code is better.  But
this is not the comparison you must make.  You must compare it with the
code generated for these processors.  Precisely because mispredicted
branches are bad (and max has a 50% misprediction rate in general), the
designers added support to avoid them: conditional instructions.  More
concretely: conditional moves.  These instructions are used
automatically if you tell gcc to use them, which is only the case if you
do *not* generate plain old i386 code.  Add the -march=pentium4 option
or whatever is adequate.
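
One way to check this is to compile the plain function for both targets
and compare the generated assembly.  The following is only a sketch: the
file name max.cc and the function name plain_max are made up for
illustration, and whether a conditional move actually appears depends on
the gcc version and optimization level.

```cpp
// max.cc (hypothetical file name, used only for this illustration)
//
// Suggested experiment, assuming gcc on x86:
//   g++ -O2 -S -march=i386     max.cc   # no cmov available on this target
//   g++ -O2 -S -march=pentium4 max.cc   # cmov may be used
// On an x86-64 compiler, -m32 is also needed for these -march values.
// Then look for a "cmov" instruction in the resulting max.s.
#include <cassert>

// The straightforward formulation; the compiler chooses between a
// branch and a conditional move based on the target options.
int plain_max(int a, int b) { return a > b ? a : b; }
```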

What you could have said is that your code is a compromise.  It
generates sufficiently blended code to perform acceptably regardless of
the compiler options.  But it's certainly not optimal.  The plain version

  int max(int a, int b)
    { return a > b ? a : b; }

is about 15%+ faster on a P4 than your code (my entire test program,
including all the overhead, runs 10%+ faster with the simple code).
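
A comparison of this kind can be set up roughly as follows.  This is not
the test program referred to above, only a minimal sketch: the names
mask_max, plain_max, random_ints and time_pairwise are hypothetical, and
the numbers it yields depend heavily on the compiler options, on
inlining, and on whether the loop gets vectorized.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Mask-based variant from the quoted proposal: i is all ones if a > b,
// otherwise zero, so exactly one operand survives the masking.
int mask_max(int a, int b) { int i = -(a > b); return (a & i) | (b & ~i); }

// Plain variant; the compiler decides how to implement the selection.
int plain_max(int a, int b) { return a > b ? a : b; }

// Pseudo-random inputs, so that the outcome of a > b is hard to predict.
std::vector<int> random_ints(std::size_t n, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int> dist(-1000000, 1000000);
  std::vector<int> v(n);
  for (int& x : v) x = dist(gen);
  return v;
}

// Apply f to corresponding elements of a and b and return the elapsed
// time in seconds.  The sum is handed back through checksum so that the
// work cannot be optimized away and the variants can be cross-checked.
template <typename F>
double time_pairwise(F f, const std::vector<int>& a,
                     const std::vector<int>& b, long long& checksum) {
  auto start = std::chrono::steady_clock::now();
  long long sum = 0;
  for (std::size_t k = 0; k < a.size() && k < b.size(); ++k)
    sum += f(a[k], b[k]);
  auto stop = std::chrono::steady_clock::now();
  checksum = sum;
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_pairwise(mask_max, a, b, s1) and
time_pairwise(plain_max, a, b, s2) on the same inputs should give equal
checksums; the timings are only meaningful when both runs are built with
the same -march and optimization options.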

For this reason I would strongly recommend not making this change.  If
somebody wants optimally performing code, s/he should be able to get it.
This is not possible with the blended code.  And if the appropriate
compiler options are not used, there is obviously no interest in the
best performance.

- -- 
Ulrich Drepper, Red Hat, Inc., 444 Castro St, Mountain View, CA

