This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.
On Tue, 2 May 2017, Liu Hao wrote:
This can be observed from the following example: (For your reference: https://godbolt.org/g/toFOVc )

```c++
#include <emmintrin.h>

double my_fmax_1(double x, double y){
    return _mm_cvtsd_f64(_mm_max_sd(_mm_set_sd(x), _mm_set_sd(y)));
}

double my_fmax_2(double x, double y){
    double r;
    __asm__ (
        "maxsd %%xmm1, %%xmm0"
        : "=x"(r)
        : "0"(x), "x"(y)
    );
    return r;
}
```

After being compiled with `-O3`, this snippet results in the following assembly:

```assembly
my_fmax_1(double, double):
    movsd %xmm0, -24(%rsp)
    movsd %xmm1, -16(%rsp)
    movsd -24(%rsp), %xmm0
    movsd -16(%rsp), %xmm1
    maxsd %xmm1, %xmm0
    ret
my_fmax_2(double, double):
    maxsd %xmm1, %xmm0
    ret
```

The first function seems very inefficient. Are there any particular reasons why GCC doesn't optimize it well (like the second function)?
_mm_set_sd is not a no-op: it sets the upper half of the SSE register to 0, which recent versions of GCC do with movq but older versions do through the stack. To optimize that away, the compiler needs to know that the upper half of the register is ignored. (max itself does not ignore it; it is the _mm_cvtsd_f64 afterwards that drops anything that depended on it.) But the maxsd operation is largely opaque to the compiler for now (it is modeled in an unnaturally complicated way), so the compiler does not notice this. Clang does a better job there...

Feel free to file a bug report at https://gcc.gnu.org/bugzilla/ if you don't already see a similar one in the database.
-- Marc Glisse
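The lane behaviour described in the reply can be checked directly. The following is only a minimal sketch, assuming an x86 target with SSE2; the helper names `upper_lane`, `upper_after_set_sd` and `upper_after_max_sd` are made up for illustration and do not appear in the thread. It reads back the upper 64-bit lane to show that `_mm_set_sd` zeroes it and that `_mm_max_sd` carries it over from its first operand, which is why the compiler cannot drop the zeroing without knowing that the upper lane is discarded later.

```c++
#include <emmintrin.h>

// Hypothetical helper: return the upper 64-bit lane of a vector as a scalar.
// _mm_unpackhi_pd(v, v) yields {v[1], v[1]}, so the low lane now holds v[1].
static double upper_lane(__m128d v) {
    return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
}

// _mm_set_sd(x) builds {x, 0.0}: the upper lane is explicitly zeroed.
double upper_after_set_sd(double x) {
    return upper_lane(_mm_set_sd(x));
}

// _mm_max_sd(a, b) computes the maximum of the low lanes only and copies
// the upper lane from a, so the result still depends on a's upper lane.
double upper_after_max_sd(__m128d a, __m128d b) {
    return upper_lane(_mm_max_sd(a, b));
}
```

For example, with `a = _mm_set_pd(7.0, 1.0)` (upper lane 7.0, lower lane 1.0) and `b = _mm_set_pd(9.0, 2.0)`, `upper_after_max_sd(a, b)` returns the 7.0 taken from `a`, while the lower lane of the result is the maximum of 1.0 and 2.0.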