This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: -mcx16 vs. not using CAS for atomic loads
- From: Torvald Riegel <triegel at redhat dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: GCC <gcc at gcc dot gnu dot org>, Bin Fan <bin dot x dot fan at oracle dot com>
- Date: Tue, 24 Jan 2017 10:08:52 +0100
- Subject: Re: -mcx16 vs. not using CAS for atomic loads
- References: <1484850210.5606.587.camel@redhat.com> <fcfba108-0ab1-c481-3a30-201870f864f0@redhat.com>
On Fri, 2017-01-20 at 09:55 -0800, Richard Henderson wrote:
> On 01/19/2017 10:23 AM, Torvald Riegel wrote:
> > I think I prefer Option 3b as the short-term solution. It does not
> > break programs (except the __atomic_always_lock_free assertion scenario,
> > but that's likely to not work anyway given that the atomics will be
> > lock-free but not "fast"). It makes programs aware that the atomics
> > will not be fast when they are not fast indeed (ie, when getting loads
> > through cmpxchg).
>
> I agree. Let's go through the library for the loads, giving us a hook to fix
> this in the future.
I'm working on a patch for this.
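For readers following along, here is a minimal sketch of why a load done through
compare-and-swap is lock-free but not "fast". It is not the patch itself. It uses
a 64-bit object as a stand-in for the 16-byte case so that it builds without
-mcx16 or libatomic; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Plain atomic load: only reads, so it also works on read-only memory. */
uint64_t load_plain(const uint64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Load emulated through compare-and-swap (illustrative stand-in for a
   cmpxchg16b-based 16-byte load).  The pointer cannot be const: the CAS
   is a read-modify-write, so it needs writable memory and exclusive
   ownership of the cache line even when the value does not change. */
uint64_t load_via_cas(uint64_t *p)
{
    assert(p);
    uint64_t expected = 0;
    /* If *p == 0 the CAS succeeds and stores 0 again (no visible change).
       Otherwise it fails and writes the current value into expected. */
    __atomic_compare_exchange_n(p, &expected, expected, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Compile-time query that a program might use to ask about 16-byte
   atomics; the result depends on the target, flags, and compiler version. */
int always_lock_free_16(void)
{
    return __atomic_always_lock_free(16, 0);
}
```

Calling load_via_cas on an object holding 42 returns 42 and leaves the object
unchanged, but unlike load_plain it cannot be used on a const object placed in
read-only memory.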
> > I'm worried that Option 4 would not be possible until some time in the
> > future when we have actually gotten confirmation from the HW vendors
> > about 16-byte atomic loads. The additional risk is that we may never
> > get such a confirmation (eg, because they do not want to constrain
> > future HW), or that this actually holds just for a few processors.
>
> Indeed, I don't think we'll get any proper confirmation from the hw vendors any
> time soon. Or possibly ever.
>
> The only light on the horizon that I can see is that HTM is now working in
> newly shipping Intel processors, and we could write a pure load path through
> libatomic that uses that. Over time the lack of guaranteed SSE atomicity
> becomes less relevant.
Unless HW transactions are guaranteed to succeed for scenarios that are
sufficient for the atomics, HTM won't help because we'd have to consider
the worst case, which would mean some non-HTM fallback.
Intel's current HTM does not make guarantees; IIRC, either Power or s390
has an HTM mode in which there are guarantees, provided that the user
follows a few rules.
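To illustrate the worst-case argument, here is a rough sketch of what a
best-effort transactional load path would look like. Everything in it is
hypothetical: try_txn_load_fn stands in for a real hardware transaction,
always_abort models the case that a best-effort HTM is allowed to produce, and
the spinlock is just one possible non-HTM fallback. A real implementation would
also need the transaction to observe the lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 16-byte value plus the lock used by the non-HTM fallback path. */
typedef struct { uint64_t lo, hi; } pair128;
typedef struct {
    pair128 value;
    bool locked;              /* simple test-and-set spinlock */
} guarded128;

/* Hypothetical hook standing in for a hardware transaction: returns true
   and fills *out if the transaction committed, false if it aborted. */
typedef bool (*try_txn_load_fn)(const guarded128 *g, pair128 *out);

/* Models best-effort HTM at its worst: every attempt aborts. */
bool always_abort(const guarded128 *g, pair128 *out)
{
    (void)g; (void)out;
    return false;
}

enum { MAX_TXN_ATTEMPTS = 3 };

/* Try the transactional path a bounded number of times, then fall back.
   Without a success guarantee, the fallback defines the worst case. */
pair128 load128(guarded128 *g, try_txn_load_fn try_txn, int *used_fallback)
{
    assert(g && try_txn && used_fallback);
    pair128 out;
    for (int i = 0; i < MAX_TXN_ATTEMPTS; i++) {
        if (try_txn(g, &out)) {
            *used_fallback = 0;
            return out;
        }
    }
    /* Non-HTM fallback: take the lock, copy the value, release. */
    while (__atomic_test_and_set(&g->locked, __ATOMIC_ACQUIRE))
        ;
    out = g->value;
    __atomic_clear(&g->locked, __ATOMIC_RELEASE);
    *used_fallback = 1;
    return out;
}
```

With always_abort plugged in, every call ends up on the locked path, so the
properties of the load (lock-free or not, usable on read-only memory or not)
are those of the fallback, not of the transaction.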