This is the mail archive of the mailing list for the GCC project.

Re: Complex multiplication in gcc

Ah OK, thank you, I wasn't aware of that particular mechanism.  If I
run the code and break on __mulsc3 it disassembles as I'd expect.

On Mon, Jul 17, 2017 at 12:32 PM, Gabriel Paubert wrote:
> On Mon, Jul 17, 2017 at 10:51:21AM -0600, Sean McAllister wrote:
>> When generating code for a simple inner loop (instantiated with
>> std::complex<float>)
>> template <typename cx>
>> void __attribute__((noinline)) benchcore(const cx* __restrict__ aa,
>>     const cx* __restrict__ bb, const cx* __restrict__ cc,
>>     cx* __restrict__ dd, cx uu, cx vv, size_t nn) {
>>     for (ssize_t ii=0; ii < nn; ii++) {
>>         dd[ii] = (
>>             aa[ii]*uu +
>>             bb[ii]*vv +
>>             cc[ii]
>>         );
>>     }
>> }
>> g++ generates the following assembly code (g++ 7.1.0) (compiled with:
>> g++ -I. -O3 -ggdb3 -o test)
> [snipped]
>> The interesting part is the two calls to __mulsc3, which the docs
>> indicate computes complex multiplication according to Annex G of the
>> C99 standard.  This leads me to two questions.
>> First, the disassembly of __mulsc3 doesn't seem to contain anything:
>> (gdb) disassemble __mulsc3
>> Dump of assembler code for function __mulsc3@plt:
>>    0x0000000000400aa0 <+0>: jmpq   *0x2035d2(%rip)        # 0x604078
>>    0x0000000000400aa6 <+6>: pushq  $0xc
>>    0x0000000000400aab <+11>: jmpq   0x4009d0
>> End of assembler dump.
>> What's the cause of this?
> That you are disassembling the PLT (note __mulsc3@plt), which redirects
> to the real function which is provided by libgcc (on my computer the
> exact location is /lib/x86_64-linux-gnu/
>> Second, since I don't think I'll convince anyone to generate
>> non-standard conforming code by default, could the default performance
>> of complex multiplication be enhanced significantly by performing the
>> isnan() checks required by Annex G and only calling the function to
>> fix the results if they fail?  That would move the function call
>> overhead out of the critical path at least.
>         Gabriel
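
The second question above proposes doing the multiplication inline and calling the library routine only when the Annex G NaN check fails. A minimal C++ sketch of that idea follows. The name mul_fast_path is hypothetical and is not something GCC or libgcc provides. The sketch assumes a default build (no -ffast-math or -fcx-limited-range), so that `a * b` on std::complex<float> still performs the full Annex G multiplication on the slow path.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper, for illustration only: not part of GCC or libgcc.
// Computes the textbook product inline and defers to the library
// multiplication only when both result parts are NaN, which is the
// condition Annex G uses to trigger its infinity-recovery logic.
inline std::complex<float> mul_fast_path(std::complex<float> a,
                                         std::complex<float> b) {
    const float re = a.real() * b.real() - a.imag() * b.imag();
    const float im = a.real() * b.imag() + a.imag() * b.real();

    // Common case: at least one part is not NaN, so no fix-up is needed.
    if (!(std::isnan(re) && std::isnan(im))) {
        return {re, im};
    }

    // Rare case: let the standard operator (which typically ends up in
    // __mulsc3 with g++) recompute the product with the Annex G rules.
    return a * b;
}
```

With finite inputs such as (1 + 2i) * (3 + 4i) the function never leaves the inline path. An input such as (inf + inf i) * (1 + 0i) makes both inline parts NaN and takes the fallback. Whether this is actually faster than the unconditional call would need to be measured on the loop in question, and if the compiler contracts the inline arithmetic into fused multiply-adds, the last bit of the result may differ from what the library routine returns.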