This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Function attribute((optimize(...))) ignored on inline functions?
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Matt Turner <mattst88 at gmail dot com>
- Cc: GCC Development <gcc at gcc dot gnu dot org>
- Date: Fri, 31 Jul 2015 11:25:59 +0200
- Subject: Re: Function attribute((optimize(...))) ignored on inline functions?
- References: <CAEdQ38EdfzV-9=9ZJmUsWpmhDkg4P2dQVGxyBY+11L30h++c1w at mail dot gmail dot com>
On Thu, Jul 30, 2015 at 6:40 PM, Matt Turner <mattst88@gmail.com> wrote:
> I'd like to tell gcc that it's okay to inline functions (such as
> rintf(), to get the SSE4.1 roundss instruction) at particular call
> sites without compiling the entire source file or calling function
> with different CFLAGS.
>
> I attempted this by making inline wrapper functions annotated with
> attribute((optimize(...))), but it appears that the annotation does
> not apply to inline functions? Take for example, ex.c:
>
> #include <math.h>
>
> static inline float __attribute__((optimize("-fno-trapping-math")))
> rintf_wrapper_inline(float x)
> {
>     return rintf(x);
> }
>
> float
> rintf_wrapper_inline_call(float x)
> {
>     return rintf_wrapper_inline(x);
> }
>
> float __attribute__((optimize("-fno-trapping-math")))
> rintf_wrapper(float x)
> {
>     return rintf(x);
> }
>
> % gcc -O2 -msse4.1 -c ex.c
> % objdump -d ex.o
>
> ex.o: file format elf64-x86-64
>
>
> Disassembly of section .text:
>
> 0000000000000000 <rintf_wrapper_inline_call>:
> 0: e9 00 00 00 00 jmpq 5 <rintf_wrapper_inline_call+0x5>
> 5: 66 66 2e 0f 1f 84 00 data32 nopw %cs:0x0(%rax,%rax,1)
> c: 00 00 00 00
>
> 0000000000000010 <rintf_wrapper>:
> 10: 66 0f 3a 0a c0 04 roundss $0x4,%xmm0,%xmm0
> 16: c3 retq
>
> whereas I expected that rintf_wrapper_inline_call would be the same as
> rintf_wrapper.
>
> I've read that per-function optimization is broken [1]. Is this still
> the case? Is there a way to accomplish what I want?

Not in this way. Once the wrapper is inlined into its caller, the
-fno-trapping-math setting is gone; the inlined rintf call is compiled
with the caller's options.

The only way is to use SSE intrinsics directly here, or to keep the
optimized variant from being inlined.

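
As a rough sketch of the intrinsics route, something like the following could stand in for the wrapper. The name rintf_sse41 is made up for illustration, and the target("sse4.1") attribute is an assumption that only matters when the file is built without -msse4.1; with -msse4.1 on the command line, as in the example above, it is redundant and the function can be inlined into any caller.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_round_ss */

/* Hypothetical helper (not part of GCC or libm): rounds x to an integral
   value using the rounding mode currently set in MXCSR. */
static inline float __attribute__((target("sse4.1")))
rintf_sse41(float x)
{
    /* Place x in the low lane of a vector register. */
    __m128 v = _mm_set_ss(x);

    /* _MM_FROUND_CUR_DIRECTION is 0x4, the same immediate as the
       "roundss $0x4" seen in the rintf_wrapper disassembly. */
    v = _mm_round_ss(v, v, _MM_FROUND_CUR_DIRECTION);

    /* Extract the low lane as a scalar float. */
    return _mm_cvtss_f32(v);
}
```

Because the instruction is requested explicitly, the result does not depend on -fno-trapping-math surviving inlining. The cost is that the code is x86-specific and needs an SSE4.1-capable CPU at run time.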
Richard.
> [1] https://gcc.gnu.org/ml/gcc/2012-07/msg00201.html