This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, x86_64]: Provide longlong.h definitions for 128bit operations



On May 16, 2007, at 10:57 PM, Uros Bizjak wrote:

> On 5/16/07, Chris Lattner <clattner@apple.com> wrote:
>
>>> This patch adds 128bit operations for x86_64 to longlong.h to speed
>>> up TImode and TFmode arithmetic.
>>
>> Why not implement these in terms of TImode operations themselves?
>> Inline asm should be avoided when it can be :)
>
> Unfortunately, longlong.h definitions require their operands to be in DImode (for x86_64 target). I guess it is not a coincidence that these definitions can be implemented with exactly one x86 insn (two for carry-propagating addition and subtraction). Asm with one insn is not harmful.

Just combine them together with logical ops. Something like this:


typedef unsigned DI __attribute__((mode(DI)));
typedef unsigned TI __attribute__((mode(TI)));

void test_add(DI AL, DI AH, DI BL, DI BH, DI *RL, DI *RH) {
  /* Glue the 64-bit halves into 128-bit values. */
  TI A = AL | ((TI)AH << 64);
  TI B = BL | ((TI)BH << 64);

  TI C = A + B;
  /* Split the 128-bit result back into low and high halves. */
  *RL = (DI)C;
  *RH = (DI)(C >> 64);
}

void test_sub(DI AL, DI AH, DI BL, DI BH, DI *RL, DI *RH) {
  TI A = AL | ((TI)AH << 64);
  TI B = BL | ((TI)BH << 64);

  TI C = A - B;
  *RL = (DI)C;
  *RH = (DI)(C >> 64);
}


Should produce something like this:


_test_add:
        addq %rdx, %rdi
        adcq %rcx, %rsi
        movq %rdi, (%r8)
        movq %rsi, (%r9)
        ret


        .align 4
        .globl _test_sub
_test_sub:
        subq %rdx, %rdi
        sbbq %rcx, %rsi
        movq %rdi, (%r8)
        movq %rsi, (%r9)
        ret

        .subsections_via_symbols
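
To check that the carry and the borrow really cross the 64-bit boundary, a small self-check along these lines may help. It is only a sketch: the names add128, sub128 and check_carry_and_borrow are made up for illustration, and it assumes GCC (or a compatible compiler) on a 64-bit target where TImode is available.

```c
#include <assert.h>
#include <stdint.h>

/* Mode-based typedefs as in the message (GCC extension; TImode needs a 64-bit target). */
typedef unsigned DI __attribute__((mode(DI)));
typedef unsigned TI __attribute__((mode(TI)));

/* Hypothetical helper: 128-bit add on split operands, same shape as test_add. */
static void add128(DI al, DI ah, DI bl, DI bh, DI *rl, DI *rh) {
  TI a = al | ((TI)ah << 64);
  TI b = bl | ((TI)bh << 64);
  TI c = a + b;
  *rl = (DI)c;
  *rh = (DI)(c >> 64);
}

/* Hypothetical helper: 128-bit subtract on split operands, same shape as test_sub. */
static void sub128(DI al, DI ah, DI bl, DI bh, DI *rl, DI *rh) {
  TI a = al | ((TI)ah << 64);
  TI b = bl | ((TI)bh << 64);
  TI c = a - b;
  *rl = (DI)c;
  *rh = (DI)(c >> 64);
}

/* Hypothetical self-check: aborts via assert if carry or borrow is lost. */
static void check_carry_and_borrow(void) {
  DI lo, hi;

  /* (2^64 - 1) + 1 = 2^64: the low word wraps to 0, the carry lands in the high word. */
  add128(UINT64_MAX, 0, 1, 0, &lo, &hi);
  assert(lo == 0 && hi == 1);

  /* 2^64 - 1: the low word has to borrow from the high word. */
  sub128(0, 1, 1, 0, &lo, &hi);
  assert(lo == UINT64_MAX && hi == 0);
}
```

Calling check_carry_and_borrow() from main is enough to exercise both cases. Because the helpers are ordinary C, the compiler is free to inline them and fold calls with constant arguments.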


Inline asms are bad for a number of reasons: as Andrew mentioned, they are scheduling barriers. The compiler also can't reason about them, can't constant fold or simplify them, can't know the sizes of the instructions precisely, etc.


If your compiler isn't producing decent code for simple idioms like this, it seems profitable to fix the compiler, then write portable[1] C code to express these things. The compiler already knows how to efficiently implement 128-bit arithmetic, so why re-implement the smarts of the compiler in a header?
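
As a rough sketch of what that could look like for a widening multiply, here is a macro that takes its arguments in the same order as longlong.h's umul_ppmm (high word, low word, then the two factors) but is written purely with TImode arithmetic. The names ti_umul_ppmm and mul64x64 are hypothetical and this is not the longlong.h definition. It assumes a GCC-compatible compiler on a 64-bit target.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned DI __attribute__((mode(DI)));
typedef unsigned TI __attribute__((mode(TI)));

/* Hypothetical macro: 64x64 -> 128-bit multiply with umul_ppmm-style
   argument order (w1 = high word, w0 = low word). */
#define ti_umul_ppmm(w1, w0, u, v)                \
  do {                                            \
    TI ti_p_ = (TI)(DI)(u) * (DI)(v);             \
    (w1) = (DI)(ti_p_ >> 64);                     \
    (w0) = (DI)ti_p_;                             \
  } while (0)

/* Hypothetical wrapper so the macro can be exercised as a function. */
static void mul64x64(DI u, DI v, DI *hi, DI *lo) {
  DI h, l;
  ti_umul_ppmm(h, l, u, v);
  *hi = h;
  *lo = l;
}

/* Hypothetical self-check using values whose products are easy to verify by hand. */
static void check_mul64x64(void) {
  DI hi, lo;

  /* 2^32 * 2^32 = 2^64: high word 1, low word 0. */
  mul64x64((DI)1 << 32, (DI)1 << 32, &hi, &lo);
  assert(hi == 1 && lo == 0);

  /* (2^64 - 1)^2 = (2^64 - 2) * 2^64 + 1. */
  mul64x64(UINT64_MAX, UINT64_MAX, &hi, &lo);
  assert(hi == UINT64_MAX - 1 && lo == 1);
}
```

Whether this compiles down to a single multiply instruction depends on the compiler and optimization level, so the generated assembly is worth inspecting before relying on it.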

-Chris

[1] portable across 64-bit GCC targets, not across other compilers of course. However, gcc-style inline asm isn't portable to other compilers either.

