Poor man's JIT compiler
Robert Bernecky
bernecky@snakeisland.com
Wed Sep 2 22:32:00 GMT 2009
Hi, Dean.
My initial attempt at compiler options was just -O0.
That resulted in the jmp insertion problem, so I conjectured
that there might be some alignment requirements/desires that
would result in jmp instructions being added to make each
labeled fragment start on an "appropriate" boundary.
Clearly, the no-align options did not help.
So, I just tried out your suggestion:
#define OP(nm, cod) \
FS##nm: cod \
asm("nop" : : ); \
FE##nm:
This has the effect of inserting a NOP at the end of each code fragment.
And, it DOES appear to work (although I just quickly eyeballed
the asm code, so I might be missing something). I'll give
it a more careful workover tomorrow. (That was WITH the current
-fno-align-labels and -fno-align-jumps options still active.)
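
A minimal sketch of how such a macro might be combined with GCC's
labels-as-values extension (&&label) to record where each fragment starts
and how long it is. The fragment names, struct frag, FRAG and
run_fragments are hypothetical, and the NOP is written as
__asm__ __volatile__ here, which is an assumption rather than the macro as
posted. The recorded lengths only mean something if the compiler keeps the
fragments contiguous and in source order, which is the open question in
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* The OP macro, with the NOP written as __asm__ __volatile__ (assumption). */
#define OP(nm, cod) \
    FS##nm: cod \
    __asm__ __volatile__("nop"); \
    FE##nm:

/* Hypothetical table entry: where a fragment starts and how long it is. */
struct frag {
    const char *start;
    ptrdiff_t len;
};

/* Builds one entry from the FS/FE labels via GCC's &&label extension. */
#define FRAG(nm) \
    { (const char *)&&FS##nm, (const char *)&&FE##nm - (const char *)&&FS##nm }

/* Fills tab[0..3], then runs the four fragments in source order. */
long run_fragments(long x, long y, struct frag tab[4])
{
    long reg1, reg2, regz, z;
    struct frag t[4] = { FRAG(Load1), FRAG(Load2), FRAG(Add), FRAG(Store) };
    int k;

    assert(tab != NULL);
    for (k = 0; k < 4; k++)
        tab[k] = t[k];

    OP(Load1, reg1 = x;)
    OP(Load2, reg2 = y;)
    OP(Add, regz = reg1 + reg2;)
    OP(Store, z = regz;)
    return z;
}
```

Calling run_fragments(2, 3, tab) executes the fragments in place and fills
tab; whether tab[k].len matches the emitted code still has to be checked
against the assembler output.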
Now, what was it that led you to propose that inserting a NOP
would have the desired effect?
Many thanks for your reply!
Robert
Dean Anderson wrote:
> I suspect it does this because of instruction alignment and pipelining
> issues. Why are you trying to turn off alignment?
>
> You might try adding a nop after each one.
>
> --Dean
>
> On Tue, 1 Sep 2009, Robert Bernecky wrote:
>
>> I'm trying to get gcc version 4.3.2 to emit X86-64 code
>> fragments that I can catenate to perform my own JIT
>> compilation, but the compiler is being recalcitrant.
>>
>> (I was using a jump table, but its performance was underwhelming.)
>>
>> Roughly, what I've done is to create a set of code fragments,
>> with labels so that I can determine their address (via &&label)
>> and length. E.g.,
>>
>> topLoad1: reg1 = x[i];
>> botLoad1:
>>
>> topLoad2: reg2 = y[i];
>> botLoad2:
>>
>> topAdd: regz = reg1 + reg2;
>> botAdd:
>>
>> topStore: z[i] = regz;
>> botStore:
>>
>> Then, I have a table of fragment addresses (topLoad1, topLoad2, etc.)
>> and lengths (botLoad1-topLoad1, botLoad2-topLoad2), and a
>> (statically unknown) list of fragments to be assembled to build
>> working code, e.g.:
>>
>> (Load2, Load1, Add, Store, Loop)
>>
>> I assemble the fragments into a code buffer and jump to it,
>> or so the story goes. Unfortunately, what I'm seeing in the
>> generated code fragments is not fun:
>>
>> 1. GCC sometimes, but NOT always, inserts jumps to the next
>> fragment. E.g.:
>>
>> ----------------------------------------------
>>
>> .L46:
>> .loc 2 34 0
>> movq -264(%rbp), %rax
>> movq %rax, -40(%rbp)
>> .L47:
>> .L7:
>> .loc 2 40 0
>> movl %r8d, %eax
>> jmp .L48
>> .L6:
>> .L48:
>> .loc 2 43 0
>> movl %r11d, %ecx
>> .L49:
>> .L50:
>> ----------------------------------------------
>>
>> Note the jmp .L48. If GCC always inserted a jump, I could
>> remove it, or if it never inserted the jump, I'd be even
>> happier, but it only does it now and then. I tried adding
>> my own jumps to force this:
>>
>> topLoad2: reg2 = y[i];
>> goto botLoad2;
>> botLoad2:
>>
>> but GCC removed them. And inserted others.
>>
>> Today, I'm using these compiler options:
>>
>> gcc -O0 -ggdb -mtune=opteron -fno-align-labels -fno-align-jumps
>>
>> So, I welcome suggestions on how to solve or work around these
>> problems. Or even a completely different approach.
>>
>> Thanks,
>> Robert
>
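
The assembly step described in the quoted message (a table of fragment
addresses and lengths, plus a run-time list of fragments to concatenate)
can be sketched as a plain copy loop. This is only an illustration: struct
frag, assemble and all parameter names are hypothetical. Jumping to the
result would also need the buffer to live in executable memory (for
example, obtained from mmap with PROT_EXEC) and fragments that still work
after being moved; neither is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: fragment start address and byte length. */
struct frag {
    const unsigned char *start;
    size_t len;
};

/*
 * Copies the fragments selected by order[0..n-1] back to back into buf.
 * Returns the number of bytes written, or 0 if an index is out of range
 * or the result would not fit in cap bytes.
 */
size_t assemble(unsigned char *buf, size_t cap,
                const struct frag *table, size_t table_len,
                const size_t *order, size_t n)
{
    size_t used = 0, i;

    assert(buf != NULL && table != NULL && order != NULL);
    for (i = 0; i < n; i++) {
        const struct frag *f;

        if (order[i] >= table_len)
            return 0;               /* unknown fragment */
        f = &table[order[i]];
        if (f->len > cap - used)
            return 0;               /* would overflow the code buffer */
        memcpy(buf + used, f->start, f->len);
        used += f->len;
    }
    return used;
}
```

With the fragment list (Load2, Load1, Add, Store, Loop), order would hold
the table indices of those five entries in that sequence.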