This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/81478] By default, GCC emits a function call for complex multiplication, should partially inline that
- From: "smcallis at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 19 Jul 2017 14:03:10 +0000
- Auto-submitted: auto-generated
- References: <bug-81478-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81478
--- Comment #4 from Sean McAllister <smcallis at gmail dot com> ---
Looking at the assembly for the __mulsc3 function:
<+0>:movaps %xmm0,%xmm10
<+4>:movaps %xmm2,%xmm11
<+8>:movaps %xmm0,%xmm5
<+11>:mulss %xmm3,%xmm10
<+16>:movaps %xmm1,%xmm6
<+19>:mulss %xmm1,%xmm11
<+24>:mulss %xmm2,%xmm5
<+28>:mulss %xmm3,%xmm6
<+32>:movaps %xmm10,%xmm4
<+36>:addss %xmm11,%xmm4
<+41>:movaps %xmm5,%xmm9
<+45>:subss %xmm6,%xmm9
<+50>:ucomiss %xmm4,%xmm4
<+53>:setp %al
<+56>:ucomiss %xmm9,%xmm9
<+60>:setp %dl
<+63>:and %dl,%al
<+65>:jne 0x7ffff7530a27 <__mulsc3+87>
<snip>
The isnan(a) && isnan(b) check isn't short-circuited: both ucomiss/setp pairs
always run before the single branch. It'd be possible to write something like
this:
<snip>
<+50>:ucomiss %xmm4,%xmm4
<+53>:setp %al
<+XX>:jnp good_cxmultiply   # PF clear: not NaN (ZF is always set here, so je would always jump)
<+XX>:ucomiss %xmm9,%xmm9
<+XX>:setp %dl
<+XX>:and %dl,%al
<+XX>:jne 0x7ffff7530a27 <__mulsc3+XX>
<+XX>good_cxmultiply:
<snip>
This cuts the overhead in the general case from six instructions to three, all
of them cheap. Someone smarter than me will have to decide whether that's a net
win. Emitting the code inline instead of calling __mulsc3 every time would also
help the register allocator by giving it options for shuffling things around.
(I do a lot of complex arithmetic, so I'm interested in this being fast =D).
It'd be cool if the vectorizer still had a shot at it, but I don't immediately
see an easy way to achieve that.
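
For illustration, here is roughly what the partially inlined multiply could
look like at the C level. This is only a sketch under assumptions: the name
cmul_partial_inline is hypothetical (it is not a GCC or libgcc function), the
NaN tests are written as self-comparisons to mirror the ucomiss pattern above,
and the plain `p * q` on the slow path stands in for the remaining library
call.

```c
#include <complex.h>
#include <math.h>

/* Hypothetical helper (not part of GCC or libgcc): the four products and
   the NaN test are done inline, and only the rare case where both result
   components are NaN falls back to the full multiplication. */
static inline float complex
cmul_partial_inline(float complex p, float complex q)
{
    float a = crealf(p), b = cimagf(p);
    float c = crealf(q), d = cimagf(q);

    float re = a * c - b * d;   /* real part of the product */
    float im = a * d + b * c;   /* imaginary part of the product */

    /* Short-circuit: x == x is false only when x is NaN, so in the common
       case the first comparison alone selects the fast path. */
    if (re == re)
        return CMPLXF(re, im);
    if (im == im)
        return CMPLXF(re, im);

    /* Slow path: both components are NaN. Defer to the ordinary complex
       multiply, which performs the C Annex G infinity recovery (on GCC
       with default flags this is where a __mulsc3 call would remain). */
    return p * q;
}
```

If the NaN/infinity recovery isn't needed at all, GCC's documented
-fcx-limited-range option (also enabled by -ffast-math) drops the check and
the library call entirely, at the cost of Annex G conformance.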