See <https://lore.kernel.org/linux-toolchains/20210604182357.GA1688170@rowland.harvard.edu/T/#mb0e293105ae45974af7ed435847e2a6f72158a0a> .
Obviously "just for writes" is nonsense, but what's missing is the ability to add a (use "memory"), which maybe just means allowing "memory" to be specified as a use? That would prevent moving stores across such an asm(), but it would not cause "pointless reloading of globals".
Sure :-) But syntactically it is probably best put amongst the clobbers, since the parsing code there already knows how to handle various special cases of syntax (well, just "memory" and "cc", and the various ways of naming registers).
For the compiler-barrier use-cases, you can use the __atomic_signal_fence builtin instead of an empty inline-asm statement. E.g. if you want to make sure that all writes appearing after a barrier are actually emitted after all reads/writes appearing before the barrier, use __atomic_signal_fence(__ATOMIC_RELEASE).
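A minimal sketch of the two variants side by side, assuming GCC or Clang. The names data, ready, barrier(), publish_asm() and publish_fence() are made up for illustration and do not come from this report. Both forms only constrain the compiler; neither emits a CPU fence instruction.

```c
/* Hypothetical shared variables, for illustration only. */
static int data;
static int ready;

/* Classic compiler barrier: an empty asm with a "memory" clobber.
 * The compiler must assume the asm may read or write any memory, so it
 * cannot move loads or stores across it, and it may have to reload
 * globals it had cached in registers. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static void publish_asm(int v)
{
    data = v;
    barrier();      /* the store to data is emitted before the store to ready */
    ready = 1;
}

/* Builtin alternative suggested above: a release signal fence.
 * Intended effect: writes after the fence are emitted after the
 * reads/writes that appear before it. No fence instruction is generated. */
static void publish_fence(int v)
{
    data = v;
    __atomic_signal_fence(__ATOMIC_RELEASE);
    ready = 1;
}
```

Either function leaves data set to v and ready set to 1; the difference is only in how much freedom the compiler keeps around the barrier.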