This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: Call for compiler help/advice: atomic builtins for v3
- From: Paolo Carlini <pcarlini at suse dot de>
- To: Mark Mitchell <mark at codesourcery dot com>
- Cc: gcc at gcc dot gnu dot org, libstdc++ at gcc dot gnu dot org, rth at redhat dot com, Ian Lance Taylor <ian at airs dot com>
- Date: Sun, 06 Nov 2005 19:56:04 +0100
- Subject: Re: Call for compiler help/advice: atomic builtins for v3
- References: <436DDC36.8070308@suse.de> <436E4DF0.3070004@codesourcery.com>
Hi Mark,
>I think this is a somewhat difficult problem because of the tension
>between performance and functionality. In particular, as you say, the
>code sequence you want to use varies by CPU.
>
>I don't think I have good answers; this email is just me musing out loud.
>
>You probably don't want to inline the assembly code equivalent of:
>
> if (cpu == i386) ...
> else if (cpu == i486) ...
> else if (cpu == i586) ...
> ...
>
>On the other hand, if you inline, say, the i486 variant, and then run on
>a i686, you may not get very good performance.
>
>So, the important thing is to weigh the cost of a function call plus
>run-time conditionals (when using a libgcc routine that would contain
>support for all the CPUs) against the benefit of getting the fastest
>code sequences on the current processors.
>
>
Actually, the situation is not that bad, as far as I can see: the worst
case is i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
target either cannot implement the builtin at all (leaving a trivial
fallback, using locks or offering no MT support at all) or can implement
it in no more than one non-trivial way. Then libgcc would contain at
most two versions: the trivial one, and another piece of assembly,
identical in principle to what the builtin expands to when the inline
version is actually desired.
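A minimal C++ sketch of that "at most two versions" layout follows, purely
as an illustration and not as actual libgcc code. The namespace libsketch
and the names add_locked, add_native, cpu_supports_native and
fetch_and_add are hypothetical. The probe is stubbed with the predefined
macro __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4, which current GCC documents as
defined when the target supports 4-byte atomic compare-and-swap; a real
library routine would have to query the CPU at run time instead.

```cpp
#include <cassert>
#include <mutex>

namespace libsketch {  // hypothetical stand-in for the library side

using add_fn = int (*)(volatile int*, int);

// Version 1: trivial fallback, serialized on one global lock.
inline int add_locked(volatile int* p, int v) {
    static std::mutex m;
    std::lock_guard<std::mutex> guard(m);
    int old = *p;
    *p = old + v;
    return old;
}

// Version 2: the same operation the compiler would expand inline.
inline int add_native(volatile int* p, int v) {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    return __sync_fetch_and_add(p, v);
#else
    return add_locked(p, v);  // this build cannot emit the instruction
#endif
}

// Hypothetical probe, stubbed at compile time for the sketch only.
inline bool cpu_supports_native() {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    return true;
#else
    return false;
#endif
}

// Entry point: picks one of the two versions once, on first use.
inline int fetch_and_add(volatile int* p, int v) {
    static const add_fn impl =
        cpu_supports_native() ? add_native : add_locked;
    assert(p != nullptr);
    return impl(p, v);
}

}  // namespace libsketch
```

Callers would only ever see fetch_and_add; which of the two versions runs
behind it is decided once, so the per-call cost is a function call plus an
indirect jump rather than a chain of CPU tests.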
>And in a workstation distribution you may be concerned about supporting
>multiple CPUs; if you're building for a specific hardware board, then
>you only care about the CPU actually on that board.
>
>What do you propose that the libgcc routine do for a CPU that cannot
>support the builtin at all? Just do a trivial implementation that is
>safe only for a single-CPU, single-threaded system?
>
>
Either that or a very low performance one, using locks. The issue is
still open, but we can resolve it rather easily, I think.
>I think that to satisfy everyone, you may need a configure option to
>decide between inlining support for a particular processor (for maximum
>performance when you know the target performance) and making a library
>call (when you don't).
>
>
Yes, let's consider for simplicity the obnoxious i686: if the user
doesn't pass any -march, then libgcc supplies either the fallback using
locks or, if the specific target (i486+) supports it, the non-trivial
implementation; if the user passes -march=i486 or later, then the
builtin is expanded inline by the compiler, with no use of libgcc at
all. Similarly for Sparc.
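Seen from the caller's side, the scheme could look roughly like the C++
sketch below. It is only an illustration under stated assumptions: the
namespace v3sketch and the names exchange_and_add and
library_fetch_and_add are hypothetical, and the library routine is
replaced by a non-atomic stub so that the example is self-contained. The
test on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 stands in for "the user passed
a suitable -march"; GCC documents that macro as defined when the selected
target supports 4-byte atomic compare-and-swap.

```cpp
#include <cassert>

namespace v3sketch {

// Hypothetical stand-in for an out-of-line library entry point.
// This stub is NOT atomic; it exists only so the sketch links.
inline int library_fetch_and_add(volatile int* p, int v) {
    assert(p != nullptr);
    int old = *p;
    *p = old + v;
    return old;
}

// What library code would call for reference counting and the like.
inline int exchange_and_add(volatile int* p, int v) {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    // The selected target has the instruction: expand the builtin inline.
    return __sync_fetch_and_add(p, v);
#else
    // No suitable -march given: defer to the out-of-line routine.
    return library_fetch_and_add(p, v);
#endif
}

}  // namespace v3sketch
```

With this shape the choice is made entirely at compile time, so code built
for a known processor pays no call overhead, while code built for the
lowest common denominator still works through the library call.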
Paolo.