[Bug target/86693] inefficient atomic_fetch_xor
nruslan_devel at yahoo dot com
Sun Jul 29 00:36:00 GMT 2018
--- Comment #3 from Ruslan Nikolaev <nruslan_devel at yahoo dot com> ---
(In reply to Jakub Jelinek from comment #1)
> The reason why this works for sub/add is that x86 has xadd instruction, so
> we expand it as xadd and later on during combine find out we are actually
> comparing the result of lock; xadd with something we can optimize better and
> do the optimization.
> For __atomic_fetch_xor (ptr, x, y) == x (or != x), or __atomic_xor_fetch
> (ptr, x, y) == 0 (or != 0), or __atomic_or_fetch (ptr, x, y) == 0 (or != 0),
> we'd need to handle this specially already at expansion time, so with extra
> special optabs, because there is no instruction that keeps the old or new
> value of xor or ior in a register, and once we emit a compare and exchange
> loop, it is very hard to optimize that to something different.
By the way, I do not know exactly how GCC handles this internally. Is it possible
to emit an artificial 'xxor' instruction that acts like xadd? Then, during
optimization, xxor could be replaced with a plain xor or with a cmpxchg loop,
depending on the circumstances.
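
For reference, here is a minimal sketch of the patterns under discussion, using
the GCC __atomic builtins. The function names are made up for illustration and
are not from the bug report. In each xor case only the zero/non-zero outcome
of the new value is needed, which a plain "lock; xor" already reports in ZF,
so in principle no cmpxchg loop should be required.

```c
#include <stdbool.h>
#include <stdint.h>

/* add case: per comment #1, this is expanded via xadd and the
   comparison is optimized later, during combine. */
bool add_new_is_zero(uint32_t *ptr, uint32_t x)
{
    return __atomic_add_fetch(ptr, x, __ATOMIC_SEQ_CST) == 0;
}

/* xor case, first form: old value == x is equivalent to
   new value == 0, because old ^ x == 0 exactly when old == x. */
bool xor_old_equals_x(uint32_t *ptr, uint32_t x)
{
    return __atomic_fetch_xor(ptr, x, __ATOMIC_SEQ_CST) == x;
}

/* xor case, second form: test the new value directly. */
bool xor_new_is_zero(uint32_t *ptr, uint32_t x)
{
    return __atomic_xor_fetch(ptr, x, __ATOMIC_SEQ_CST) == 0;
}
```

Comparing the generated assembly of these three functions at -O2 should show
whether the xor forms still fall back to a compare-and-exchange loop.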