This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 2/2] asm inline
On Sun, 2 Dec 2018, Segher Boessenkool wrote:
What is the point of !!count when we take the max with 1 on the very
next line? Is it in anticipation of a time when we may remove the MAX? (Sorry
if this was covered in previous iterations.)
By the way, not related to the patch, but I wonder why we cannot have a
cost of 0.
That exactly is the point :-) My code still works if you remove that MAX
expression, as hopefully we will some day. Right now GCC will of course
optimise it to "count = 1;", but writing it like that doesn't make the
I think this workaround is here because otherwise we get infinite recursion
in the inliner, but that of course should be fixed, not worked around.
Ideally. If anyone ever has time for it. :-)
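To make the !!count / MAX interaction above concrete, here is a minimal sketch. It is not the code from the patch: the function name asm_size_estimate, its parameters, and the local MAX macro are hypothetical stand-ins chosen for illustration.

```c
/* Hypothetical stand-in for the MAX macro discussed above. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: size estimate the inliner would see for one asm.
   count     - instruction count guessed from the asm string
   is_inline - nonzero if the statement is "asm inline"  */
static int asm_size_estimate(int count, int is_inline)
{
    if (is_inline)
        count = !!count;   /* 0 stays 0, any other value becomes 1 */
    return MAX(1, count);  /* floor of 1: a cost of 0 is never reported */
}
```

With the MAX in place, an asm inline always ends up with an estimate of 1. If the MAX were removed, the !!count line would still give 0 for an empty asm and 1 for anything else, which is the behaviour described above.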
Note that I may have 2 or 3 such asm per floating point operation, which
could be enough to skew inlining decisions. On the other hand, the
protected operations can never be optimized (that's the whole point of
the asm), which is a reason not to inline too much. I never really had a
problem, I was just curious.
My main use of inline asm is as an optimization barrier:
asm ("" : "+gx"(x) );
possibly marked volatile to prevent more optimizations. I certainly
expect it to generate exactly 0 instructions in most cases. Although if I
am not careful it could easily generate moves from x87 to sse/memory for
instance. I guess a minimal cost is safer and doesn't affect decisions too
This is only for inlining; GCC _does_ know such asms are cost 0, and uses
that for all other purposes.
("gx", btw? Is that a typo? Or, on what target is "x" useful here?)
(context: -frounding-math doesn't work, so I have to protect double values)
On x86_64, "x" is for SSE registers, and those are not included in "g".
When "gx" fails (ICE on old gcc, bad codegen with llvm) I use "mx" ("gx"
does not really bring much compared to "mx" for a double) or even just "x"
since "m" doesn't quite work as I would like. I have this rather sad code
(slightly edited) where no 2 platforms use the same letter:
# if defined __SSE2_MATH__
asm volatile ("" : "+gx"(x) );
# elif (defined __i386__ || defined __x86_64__)
asm volatile ("" : "+mt"(x) );
# elif (defined __VFP_FP__ && !defined __SOFTFP__) || defined __aarch64__
asm volatile ("" : "+gw"(x) );
# elif defined __powerpc__ || defined __POWERPC__
asm volatile ("" : "+gd"(x) );
# elif defined __sparc
asm volatile ("" : "+ge"(x) );
# elif defined __ia64
asm volatile ("" : "+gf"(x) );
# else
asm volatile ("" : "+g"(x) );
# endif
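For illustration, a barrier like the ones above might be wrapped and used as follows. This is only a sketch: the names opaque and protected_add are hypothetical, only the SSE2 case is spelled out (using just "x", as mentioned above), and the "+m" fallback is a conservative assumption rather than the per-target letters from the list.

```c
/* Hypothetical wrapper: returns x unchanged, but the compiler must
   assume the empty asm may have modified it.  */
static inline double opaque(double x)
{
#if defined __SSE2_MATH__
    __asm__ volatile ("" : "+x"(x));  /* keep x in an SSE register */
#else
    __asm__ volatile ("" : "+m"(x));  /* assumed fallback: go through memory */
#endif
    return x;
}

/* Hypothetical use: neither the operands nor the result of the addition
   are visible to the optimizer, so it cannot be folded or moved across
   a change of rounding mode.  */
static double protected_add(double a, double b)
{
    return opaque(opaque(a) + opaque(b));
}
```

The asm bodies are empty, so in the good case they emit no instructions at all; the cost is only in the optimizations they block.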