Atomic accesses on ARM microcontrollers
David Brown
david.brown@hesbynett.no
Sun Oct 11 12:16:07 GMT 2020
On 10/10/2020 22:05, Toby Douglass wrote:
> On 10/10/2020 21:43, David Brown wrote:
>> On 09/10/2020 23:35, Toby Douglass wrote:
>>> On 09/10/2020 20:28, David Brown wrote:
>
>>> I would like - but cannot - reply to the list, as their email server
>>> does not handle encrypted email.
>>
>> I've put the help list on the cc to my reply - I assume that's okay for
>> you.
>
> Yes.
>
>> (Your email to me was not encrypted, unless I am missing something.)
>
> I mean TLS for SMTP, as opposed to say PGP.
>
Ah, you have your own mail server that sends directly to the receiving
server? I always set up my mail servers to send via my ISP's server (a
"smarthost" in Debian setup terms). That makes this kind of thing an SEP.
>>>> I work primarily with microcontrollers, with 32-bit ARM Cortex-M
>>>> devices
>>>> being the most common these days. I've been trying out atomics in gcc,
>>>> and I find it badly lacking.
>>>
>>> The 4.1.2 atomics or the later, replacement API?
>>
>> I am not sure what you mean here, or what "4.1.2" refers to - it doesn't
>> match either the gcc manual or the C standards as far as I can see.
>
> GCC introduced its first API for atomics in version 4.1.2, these guys;
>
Jonathan Wakely explained the reference. I've read the manuals for a
/lot/ of gcc versions over the years, but I don't have all the details
in my head!
> https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
>
> Then in a later version, which I can't remember offhand, a second and
> much evolved version of the API was introduced.
>
Yes.
>> However, "atomic" also has a simpler, more fundamental and clearer
>> meaning with a wider applicability - it means an operation that cannot
>> be divided (or at least, cannot be /observed/ to be divided). This is
>> the meaning that is important to me here.
>
> Ah and you mentioned atomically writing larger objects, so we're past
> just caring about say word tearing.
>
Yes. Sizes up to 4 bytes can be accessed atomically on this processor
using "normal" operations, and 8 byte accesses are atomic if specific
instructions are used. (gcc generates these for non-atomic accesses.)
I am hoping to be able to put together a solution for using standard
C11/C++11 atomic types of any size and know that these actually work
correctly. It is not essential - I can make my own types, functions,
etc., and use them as needed. But it would be nice and convenient to be
able to use the standard types and functions.
From Jonathan's replies, it seems I can simply make my own libatomic
implementations and use them.
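
As a rough illustration of the point about standard types, here is a
minimal sketch that asks the toolchain whether 4-byte and 8-byte C11
atomics are lock-free on the current target. The variable and function
names are hypothetical. The assumption is that when an object is not
lock-free, gcc emits calls to out-of-line library routines (such as
`__atomic_load_8`), and those are the routines a custom libatomic
replacement would have to provide.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared objects, one word-sized and one double-word. */
_Atomic uint32_t counter32;
_Atomic uint64_t counter64;

/* True if the compiler can access the 4-byte object without a lock. */
bool counter32_is_lock_free(void)
{
    return atomic_is_lock_free(&counter32);
}

/* True if the compiler can access the 8-byte object without a lock.
 * On a target where this is false, loads and stores of counter64 go
 * through library calls rather than inline instructions. */
bool counter64_is_lock_free(void)
{
    return atomic_is_lock_free(&counter64);
}
```

On a bare-metal target the query itself may need a libatomic (or a
stand-in for it) at link time, so treat this as a diagnostic sketch
rather than production code.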
>> What it means is that if thread A stores a value in the
>> atomic variable ax, and thread B attempts to read the value in ax, then
>> B will read either the entire old value before the write, or the entire
>> new value after the write - it will never read an inconsistent partial
>> write.
>
> I could be wrong, but I think the only way you can do this with atomics
> is copy-on-write. Make a new copy of the data, and use an atomic to
> flip a pointer, so the readers move atomically from the old version to
> the new version.
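
The copy-on-write idea described above can be sketched roughly as
follows. This is only an illustration: `struct sample`, `buffers`,
`live`, `sample_publish` and `sample_read` are made-up names, and the
two-buffer scheme assumes a reader always finishes before the writer
can publish a second time (for example, readers are interrupt handlers
and the writer is the main loop). Without that assumption, more buffers
or some reclamation scheme would be needed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical multi-word object that must always be seen whole. */
struct sample {
    uint32_t timestamp;
    uint32_t values[4];
};

/* Two copies: one is live for readers, the other is scratch space. */
static struct sample buffers[2];
static _Atomic(struct sample *) live = &buffers[0];

/* Writer: fill the buffer that is not live, then publish it with a
 * single pointer store. The pointer is one word, so the switch from
 * old to new cannot be observed half-done. */
void sample_publish(const struct sample *src)
{
    struct sample *cur  = atomic_load_explicit(&live, memory_order_relaxed);
    struct sample *next = (cur == &buffers[0]) ? &buffers[1] : &buffers[0];

    *next = *src;  /* may take many instructions; readers still see cur */
    atomic_store_explicit(&live, next, memory_order_release);
}

/* Reader: one pointer load, then copy out through that pointer. */
struct sample sample_read(void)
{
    const struct sample *p = atomic_load_explicit(&live, memory_order_acquire);
    return *p;
}
```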
I've been thinking a bit more about this, inspired by your post here.
And I believe you are correct - neither ldrex/strex nor load/store
double register is sufficient for 64-bit atomic accesses on the 32-bit
ARM, even for plain reads and writes. That's annoying - I had thought
the double register read/writes were enough. But if the store double
register is interruptible with a restart (and I can't find official
documentation on the matter for the Cortex-M7), then an interrupted
store could lead to an inconsistent read by the interrupting code.
I guess I am back to the good old "disable interrupts" solution so
popular in the microcontroller world. That always works.
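
For reference, a minimal sketch of that "disable interrupts" approach
for 64-bit accesses. The helper names (`irq_save`, `irq_restore`,
`load_u64`, `store_u64`) are hypothetical. The ARM branch assumes an
M-profile core where PRIMASK masks the interrupts of interest; the
other branch is only a stand-in so the code can be compiled and tried
on a host machine.

```c
#include <stdint.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Read PRIMASK, then mask interrupts. The "memory" clobber also stops
 * the compiler moving memory accesses across this point. */
static inline uint32_t irq_save(void)
{
    uint32_t primask;
    __asm__ volatile ("mrs %0, primask\n\tcpsid i"
                      : "=r" (primask) : : "memory");
    return primask;
}

/* Put PRIMASK back to what it was, so nested use is safe. */
static inline void irq_restore(uint32_t primask)
{
    __asm__ volatile ("msr primask, %0" : : "r" (primask) : "memory");
}
#else
/* Host stand-in: only records the "masked" state, masks nothing. */
static uint32_t fake_primask;

static inline uint32_t irq_save(void)
{
    uint32_t old = fake_primask;
    fake_primask = 1;
    return old;
}

static inline void irq_restore(uint32_t primask)
{
    fake_primask = primask;
}
#endif

/* Read all 8 bytes with interrupts masked, so no handler can run
 * between the two halves of the access. */
uint64_t load_u64(const volatile uint64_t *p)
{
    uint32_t saved = irq_save();
    uint64_t v = *p;
    irq_restore(saved);
    return v;
}

/* Write all 8 bytes with interrupts masked. */
void store_u64(volatile uint64_t *p, uint64_t v)
{
    uint32_t saved = irq_save();
    *p = v;
    irq_restore(saved);
}
```

Saving and restoring the previous mask, rather than unconditionally
re-enabling, keeps the helpers usable from code that already runs with
interrupts disabled.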
>
>>>> These microcontrollers are all single core, so memory ordering does not
>>>> matter.
>>>
>>> I am not sure this is true. A single thread must make the world appear
>>> as if events occur in the order specified in the source code, but I bet
>>> you this is already not true for interrupts.
>>
>> It is true even for interrupts.
>
> [snip]
>
> Thank you for the insights. I've done hardly any bare-metal work, so I'm
> not familiar with the actual practicalities of interrupts and their
> effect in these matters.
>
>>> It may be for example they can be
>>> re-ordered with regard to each other, and this is not being prevented.
>>
>> Do you mean the kind of re-ordering the compiler does for code?
>
> I was thinking here of the processor.
>
>> That is
>> not in question here - at least, not to me. I know what kinds of
>> reorders are done, and how to prevent them if necessary. (On a single
>> core, "volatile" is all you need - though there are more efficient ways.)
>
> I'm not sure about that. I'd need to revisit the subject though to
> rebuild my knowledge, so I can't make any assertion here - only that I
> know I don't know one way or the other.
>
One thing we can all be sure about - this stuff is difficult, it needs a
/lot/ of thought, and the documentation is often poor on the critical
details.
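
To make the "more efficient ways" than volatile concrete, here is one
possible sketch using a compiler-only fence from C11. The names
(`payload`, `payload_ready`, `producer_fill`, `consumer_take`) are
hypothetical, and it assumes a single core where an interrupt handler
can be treated like a signal handler running in the same thread. The
standard does not promise that for interrupts, so it is an assumption
about the target and not a guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t    payload[4];     /* plain data, not volatile */
static atomic_bool payload_ready;  /* one-byte flag            */

/* Main context: write the data, then raise the flag. The signal fence
 * restricts compiler reordering only; it emits no barrier instruction. */
void producer_fill(uint32_t seed)
{
    for (int i = 0; i < 4; i++)
        payload[i] = seed + (uint32_t)i;

    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&payload_ready, true, memory_order_relaxed);
}

/* Interrupt context: take the data only if the flag is up. */
bool consumer_take(uint32_t out[4])
{
    if (!atomic_load_explicit(&payload_ready, memory_order_relaxed))
        return false;

    atomic_signal_fence(memory_order_acquire);
    for (int i = 0; i < 4; i++)
        out[i] = payload[i];

    atomic_store_explicit(&payload_ready, false, memory_order_relaxed);
    return true;
}
```

Compared with declaring the whole array volatile, the compiler stays
free to optimise the accesses to `payload` between the fences.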
>> And while the cpu and memory system can include write store buffers,
>> caches, etc., that can affect the order of data hitting the memory,
>> these are not an issue in a single core system. (They /are/ important
>> for multi-core systems.)
>
> Yes, I think so too, but to be clear we mean single physical and single
> logical core; no hyperthreading.
>
Yes, absolutely.
>>> Also, I still don't quite think there *are* atomic loads/stores as such
>>> - although having said that I'm now remembering the LOCK prefix on
>>> Intel, which might be usable with a load. That would then lock the
>>> cache line and load - but, ah yes, it doesn't *mean* anything to
>>> atomically load. The very next microsecond your value could be replaced
>>> by a new write.
>>
>> Replacing values is not an issue. The important part is the atomicity
>> of the action. When thread A reads variable ax, it doesn't matter if
>> thread B (or an interrupt, or whatever) has changed ax just before the
>> read, or just after the read - it matters that it cannot change it
>> /during/ the read. The key is /consistent/ values, not most up-to-date
>> values.
>
> Yes. I can see this from your earlier explanation regarding what you're
> looking for with atomic writes.
>
>> I had a look through the github sources, but could not find anything
>> relevant. But obviously that library has a lot more code and features
>> than I am looking for.
>
> I was only thinking of a single header file which contains the atomics
> for ARM32. However, it's not useful to you for what you're looking for
> with atomic writes.
>
Thank you anyway - and thank you for making me think a little more,
correcting a mistake I made!