This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: Call for compiler help/advice: atomic builtins for v3


Hi Mark,

>I think this is a somewhat difficult problem because of the tension
>between performance and functionality.  In particular, as you say, the
>code sequence you want to use varies by CPU.
>
>I don't think I have good answers; this email is just me musing out loud.
>
>You probably don't want to inline the assembly code equivalent of:
>
>  if (cpu == i386) ...
>  else if (cpu == i486) ...
>  else if (cpu == i586) ...
>  ...
>
>On the other hand, if you inline, say, the i486 variant, and then run on
>an i686, you may not get very good performance.
>
>So, the important thing is to weigh the cost of a function call plus
>run-time conditionals (when using a libgcc routine that would contain
>support for all the CPUs) against the benefit of getting the fastest
>code sequences on the current processors.
>  
>
Actually, the situation is not that bad, as far as I can see: the worst
cases are i386 vs i486+, and Old-Sparc vs New-Sparc. More generally, a
target either cannot implement the builtin at all (a trivial fallback
using locks, or no MT support at all) or can implement it in no more
than one non-trivial way. Then libgcc would contain at most 2 versions:
the trivial one, and another piece of assembly, identical in principle
to what the builtin is expanded to when the inline version is actually
desired.

>And in a workstation distribution you may be concerned about supporting
>multiple CPUs; if you're building for a specific hardware board, then
>you only care about the CPU actually on that board.
>
>What do you propose that the libgcc routine do for a CPU that cannot
>support the builtin at all?  Just do a trivial implementation that is
>safe only for a single-CPU, single-threaded system?
>  
>
Either that or a very low performance one, using locks. The issue is
still open, but we can resolve it rather easily, I think.
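
A minimal sketch of what such a lock-based fallback could look like,
assuming a fetch-and-add style primitive. The name
fallback_exchange_and_add and the use of std::mutex are illustrative
assumptions, not what libgcc actually contains:

```cpp
#include <mutex>

// Hypothetical lock-based fallback for a target with no usable atomic
// instruction: one global mutex serializes every update. This is correct
// under threads, but slow, since unrelated counters contend on one lock.
namespace {
std::mutex fallback_lock;  // assumed: a single global lock
}

// Adds val to *mem and returns the value *mem held before the addition.
int fallback_exchange_and_add(int* mem, int val) {
    std::lock_guard<std::mutex> guard(fallback_lock);
    int old = *mem;
    *mem = old + val;
    return old;
}
```

The trivial single-threaded variant would be the same body without the
lock.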

>I think that to satisfy everyone, you may need a configure option to
>decide between inlining support for a particular processor (for maximum
>performance when you know the target performance) and making a library
>call (when you don't).
>  
>
Yes, let's consider for simplicity the obnoxious i686: if the user
doesn't pass any -march, then either the fallback using locks or, if
the specific target (i486+) supports it, the non-trivial implementation
is picked from libgcc; if the user passes -march=i486 or later, then the
builtin is expanded inline by the compiler, with no use of libgcc at
all. Similarly for Sparc.
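
At the source level, this selection could look roughly like the sketch
below. It assumes GCC's __sync_fetch_and_add builtin and the predefined
macro __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4, both of which come from later
GCC releases rather than from this thread. The names exchange_and_add
and library_exchange_and_add are hypothetical, and the library stand-in
here is only the trivial single-threaded version:

```cpp
// Stand-in for the out-of-line library routine. This trivial body is
// safe only on a single-threaded system; a real fallback would use locks
// or a CPU-specific sequence chosen when the library was built.
int library_exchange_and_add(volatile int* mem, int val) {
    int old = *mem;
    *mem = old + val;
    return old;
}

// Adds val to *mem and returns the previous value.
inline int exchange_and_add(volatile int* mem, int val) {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    // The target selected by -march has a native atomic sequence:
    // the compiler expands the builtin inline, with no library call.
    return __sync_fetch_and_add(mem, val);
#else
    // No -march (or one that is too old): go through the library routine.
    return library_exchange_and_add(mem, val);
#endif
}
```

Callers only ever see exchange_and_add; whether it ends up as an inline
instruction sequence or a function call is decided by the flags the
translation unit was compiled with.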

Paolo.

