Question regarding the values of labels

Matthew Plant rookie.mp@gmail.com
Tue Nov 1 23:52:00 GMT 2011


Hello there people who are much smarter than me,

A little background on my problem:
I'm trying to implement a simple JIT compiler, and I decided the best
way to do this would be to use GCC's unary && operator (label addresses).
Instead of having to write target assembly for each platform my
program is going to be on (along with writing a runtime assembler,
which I do not want to do), I figured I'd take sections of C code
wrapped in labels like so:

block_begin:
 {
   <instruction code>
 }
block_end:;

Then I can simply copy the bytes between block_begin and block_end
into an allocated char array, repeat for as many blocks as I want,
finally cast the array to a function pointer and voilà, I have a
JIT compiler. Of course control flow is incredibly messy (or an offset
has to be added to all the jump addresses), but it's still decently
simple.
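
To make the idea concrete, here is a minimal sketch of the copy step
only, assuming GCC or Clang with the GNU C label-address extension. The
names copy_block, sink, dst and cap are hypothetical, and nothing here
executes the copied bytes. It also assumes the code segment is readable
and that the two labels end up in source order, which the compiler does
not promise; the function returns 0 when they do not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the machine code found between two labels
   into dst.  Returns the number of bytes copied, or 0 if the label
   addresses are not usable (out of order, equal, or larger than cap). */
size_t copy_block (unsigned char *dst, size_t cap)
{
  volatile int sink = 0;
  uintptr_t begin = (uintptr_t) &&block_begin;
  uintptr_t end = (uintptr_t) &&block_end;
  size_t n;

 block_begin:
  {
    sink += 1;   /* stands in for <instruction code> */
  }
 block_end:;

  /* The compiler is free to place the labels anywhere, so check
     before trusting the distance between them. */
  if (end <= begin)
    return 0;
  n = (size_t) (end - begin);
  if (n > cap)
    return 0;
  memcpy (dst, (const void *) begin, n);
  return n;
}
```

A caller would pass a buffer and its capacity, for example
copy_block (buf, sizeof buf), and treat a return value of 0 as "the
label addresses were not usable".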

Unfortunately, as I expected it would be, it wasn't actually that
simple. Perhaps there is some inherent flaw in my thinking, but for
now, on to the problem that occurs:

For some reason, the location of labels in memory changes depending on
the control flow of the program. I have this test program which just
prints the addresses of the labels in the program:

#include <stdio.h>
#include <stdlib.h>
int main ()
{
  int i;
 lbl1:
  i = 32;
 lbl2:
  i += 4;
 lbl3:
  i += 5;
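  /* Cast to unsigned long so the arguments match %lu.  */
  printf ("%lu, %lu, %lu, %lu\n", (unsigned long) main,
          (unsigned long) &&lbl1, (unsigned long) &&lbl2,
          (unsigned long) &&lbl3);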
  return 0;
}

A sample run prints out something like 4195524, 4195532, 4195539,
4195543. That's all good and dandy. The labels are all a reasonable
distance from each other and there are no repeat values.
But say we decide to alter the control flow a bit:

#include <stdio.h>
#include <stdlib.h>
int main ()
{
  int i;
  goto lbl3;
 lbl1:
  i = 32;
  goto done;
 lbl2:
  i += 4;
 lbl3:
  i += 5;
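  /* Cast to unsigned long so the arguments match %lu.  */
  printf ("%lu, %lu, %lu, %lu\n", (unsigned long) main,
          (unsigned long) &&lbl1, (unsigned long) &&lbl2,
          (unsigned long) &&lbl3);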
 done:;
  return 0;
}

A resulting sample output is 4195524, 4195533, 4195532, 4195533. What
the heck?! The first and third labels have the same address, and the
second label comes before both of them! This was compiled without any
optimizations, and the values did not change when I added
__attribute__((__noinline__,__noclone__)) or stored the addresses in
static variables.
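
For comparison, here is a minimal sketch of the use GCC documents for
&&, namely as a target for "goto *". This is offered as an assumption
about what matters, not a confirmed diagnosis: in the second program
nothing ever jumps to lbl1 or lbl2, whereas here every label is a
possible jump target, so the compiler has to treat each block as
reachable. The names run_ops, dispatch and the OP_* opcodes are
hypothetical and not part of my test program.

```c
/* Hypothetical opcodes, one per labelled block.  */
enum { OP_SET32, OP_ADD4, OP_ADD5, OP_HALT };

/* Runs a sequence of opcodes, jumping between blocks through a table
   of label addresses (GNU C labels as values).  */
int run_ops (const int *ops)
{
  static void *dispatch[] = { &&lbl1, &&lbl2, &&lbl3, &&done };
  int i = 0;

  goto *dispatch[*ops++];
 lbl1:
  i = 32;
  goto *dispatch[*ops++];
 lbl2:
  i += 4;
  goto *dispatch[*ops++];
 lbl3:
  i += 5;
  goto *dispatch[*ops++];
 done:
  return i;
}
```

Calling run_ops on the sequence OP_SET32, OP_ADD4, OP_ADD5, OP_HALT
returns 41, the same arithmetic as the straight-line test program.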

So my final question is this: what determines the addresses of labels?
Can this problem be fixed with at least semi-readable code?

Any help would be greatly appreciated.

Sincerely,
Matt


