This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [cxx-mem-model] compare_exchange implementation


On 10/19/2011 01:55 PM, Richard Henderson wrote:
On 10/19/2011 09:26 AM, Andrew MacLeod wrote:
Well, they are explicitly 'compare_exchange_strong' and 'compare_exchange_weak' calls... so yes, they have 'hardcoded' one or the other :-)
We could alternatively do 2 separate builtins, but I didn't see the point.  But I'm ambivalent.
I'm just not keen on issuing errors this late.  If we 'know' it's always going to
be 0/1 because of other things the compiler does, we might as well simply do

weak = (opN == const1_rtx)

and leave it at that.

We issue all the memory model errors at this same location. A user *could* use this builtin and write

bool weak;
<...>
__atomic_compare_exchange (&mem, &expected, desired, weak, m1, m2);

The C++ wrappers will always call it with 0 or 1, but direct callers could do other things. The optimizers may well decide
that it's a 1 or 0 by the time we get to expansion. It's documented as requiring a compile-time constant.


We either report errors, or remove the possibility by making them separate routines. Or silently make it strong.
I suppose we could also check the value when we turn it from __atomic_compare_exchange to __atomic_compare_exchange_{1,2,4,8,16}... that's before going into SSA. I'm not sure it really buys a lot though.


Maybe they should just be separate routines...
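
For illustration, the "separate routines" option can be sketched at user level as two thin wrappers that each hardcode the weak flag, so no non-constant value can ever reach expansion. This is only a sketch: it uses the __atomic_compare_exchange_n form found in released GCC, which may differ from the builtin on the branch under discussion, and the names cas_strong, cas_weak and fetch_add_via_weak_cas are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: the weak argument is the literal constant false. */
bool cas_strong(int *mem, int *expected, int desired)
{
    return __atomic_compare_exchange_n(mem, expected, desired,
                                       false /* weak */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Hypothetical wrapper: the weak argument is the literal constant true. */
bool cas_weak(int *mem, int *expected, int desired)
{
    return __atomic_compare_exchange_n(mem, expected, desired,
                                       true /* weak */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* A weak CAS may fail spuriously, so callers normally retry in a loop.
   On failure the builtin stores the value it observed into 'old'. */
int fetch_add_via_weak_cas(int *mem, int delta)
{
    assert(mem != NULL);
    int old = __atomic_load_n(mem, __ATOMIC_RELAXED);
    while (!cas_weak(mem, &old, old + delta))
        ;   /* 'old' was refreshed; recompute old + delta and retry */
    return old;   /* value seen just before the successful update */
}
```

With wrappers like these, the question of what to do with a run-time 'weak' value does not arise for their callers, at the cost of one more entry point.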

I clearly misunderstood what you said then, I thought you said use
QImode :-P.   So, look at target and use that mode?  target may not
be set sometimes though, right? What do you use then?  QImode?
You never look at target, you look at

insn_data[icode].operand[opno].mode

if target is set and doesn't match that mode, then create a new pseudo
of the proper mode and let your caller do the conversion.

ah.

Worst case, if the native CAS is relaxed, the fences need to be
issued as such to implement

compare_exchange (T* ptr, T* old, new, success, failure)

     memory_fence (success)         // release modes
     val = *old;
     res = relaxed_CAS (ptr, val, new)
     if (res == val)                // the CAS succeeded
       {
         memory_fence (success);   // acquire modes
         return true;
       }
     *old = res;
     memory_fence (failure)
     return false;
If the target uses LL/SC, then native CAS is likely relaxed.  However,
we will already be emitting a number of jumps in order to implement
the LL/SC sequence and stuffing the fences into those jumps is going
to be much better than you trying to do it in the middle-end outside
of that sequence.  Really.


My point is, IF the fence has to go BEFORE 'val = *old' and AFTER '*old = res', it's impossible to do it within the CAS pattern, unless you also have it do the copies...


If, which is more likely as I think about it, that's not the case and the fences can look like they do in the second example (closer to the CAS than the copies), then the CAS can take care of both fences, which would be much more preferable and clean.
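
To make that arrangement concrete, here is a user-level C sketch with the fences placed directly around a relaxed CAS and the copies to and from 'old' kept outside them. It is an illustration only, not the RTL expansion: cas_with_fences and the helper names are hypothetical, the mapping from memory model to fence kind is my assumption, and it is not claimed to be a proven-equivalent implementation of every model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed classification of the memory models. */
static bool has_release(int m)
{
    return m == __ATOMIC_RELEASE || m == __ATOMIC_ACQ_REL
           || m == __ATOMIC_SEQ_CST;
}

static bool has_acquire(int m)
{
    return m == __ATOMIC_CONSUME || m == __ATOMIC_ACQUIRE
           || m == __ATOMIC_ACQ_REL || m == __ATOMIC_SEQ_CST;
}

/* Fence issued before the CAS (release side). */
static void fence_before(int m)
{
    if (m == __ATOMIC_SEQ_CST)
        __atomic_thread_fence(__ATOMIC_SEQ_CST);
    else if (has_release(m))
        __atomic_thread_fence(__ATOMIC_RELEASE);
}

/* Fence issued after the CAS (acquire side). */
static void fence_after(int m)
{
    if (m == __ATOMIC_SEQ_CST)
        __atomic_thread_fence(__ATOMIC_SEQ_CST);
    else if (has_acquire(m))
        __atomic_thread_fence(__ATOMIC_ACQUIRE);
}

/* Hypothetical emulation of a CAS with memory models on top of a
   relaxed CAS.  'failure' is assumed to be a valid failure model
   (not RELEASE or ACQ_REL). */
bool cas_with_fences(int *ptr, int *old, int desired,
                     int success, int failure)
{
    assert(ptr != NULL && old != NULL);
    int seen = *old;                       /* copy in, outside the fences */
    fence_before(success);
    bool ok = __atomic_compare_exchange_n(ptr, &seen, desired,
                                          false /* strong */,
                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    fence_after(ok ? success : failure);
    if (!ok)
        *old = seen;                       /* copy out, outside the fences */
    return ok;
}
```

Because both fences sit next to the CAS itself, a target pattern could absorb them without also having to perform the two copies.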
So if the CAS can handle it all, why does it matter if the pattern
has a single "compressed" parameter for the 3 values, or whether it
simply explicitly lists all three?
*shrug* Just for less memory consumption. It *could* remain 3 values.

Does it really make a measurable memory difference? I can certainly jam them together...

Andrew

