This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: x86 code generation question


>>>>> "Vadim" == Vadim Lobanov <vadim@cs.washington.edu> writes:

> I've been looking at creating a rather simple macro, call it sel(x, y, s), 
> that simply returns x when s == 0, or y when s == 1. The easy and 
> straightforward way to write this macro is, of course:
>   #define sel(x, y, s) ((s) ? (y) : (x))
> But then again, if I am not mistaken, this kind of construction will cause 
> the processor to jump, when selecting based on s. So, after a bit of 
> thought, we can write the same macro differently, using only straight-line 
> code:
>   #define sel(x, y, s) ((x) + (((y) - (x)) & (-(s))))
> Ah, that should be better.

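That isn't necessarily straight-line code.  Before -(s) can be
computed, the comparison (x > 11) has to be turned into the integer 0
or 1.  The x86 does have SETcc instructions for that, but without
optimization the compiler may still materialize the result with a
conditional jump, so writing the expression arithmetically doesn't by
itself remove the branch.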

> But this is where I am not exactly sure what is going on. In some code, I 
> have a line that says:
>   y = sel(3, 7, (x > 11));
> When I compile the simple macro with "gcc -Wall -S test.c", I get assembly 
> that uses jumps, exactly as expected:
>        cmpl $11, -4(%ebp)
>        jle .L2
>        movl $7, -12(%ebp)
>        jmp .L3
>   .L2: movl $3, -12(%ebp)
>   .L3: ... rest of code
> When I compile the complex version of the sel macro, I get the very same 
> code.
> Additionally, when I compile both of these macros with
> "gcc -Wall -O -S test.c", they again generate the same code, which is now 
> straight-lined.
> 
> Thus, my question: Why would the second macro cause gcc to use the exact 
> same code as the first, for both optimization levels? It seems that the 
> more complex expression of the sel macro should preclude gcc from using 
> jumps, given that it was already written straight-line. I know I'm missing 
> something important in my understanding, and there is a reason for this, 
> so please let me know. :)

The optimizer can see that s is the 0/1 result of a comparison, and
that feeding it into the arithmetic expression you wrote is equivalent
to the first version, so both forms reduce to the same code.
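
As a minimal sketch of that equivalence, the two forms can be compared
directly.  It assumes s is exactly 0 or 1, a two's-complement int, and
that (y - x) does not overflow; the function names are hypothetical and
not from the original post.

```c
#include <assert.h>

/* Branching form: the conditional-expression macro from the question. */
#define SEL_BRANCH(x, y, s) ((s) ? (y) : (x))

/* Arithmetic form: -(s) is 0 when s == 0 and all ones when s == 1,
   so the mask either discards or keeps (y - x). */
#define SEL_MASK(x, y, s) ((x) + (((y) - (x)) & (-(s))))

int sel_branch(int x, int y, int s) { return SEL_BRANCH(x, y, s); }
int sel_mask(int x, int y, int s)   { return SEL_MASK(x, y, s); }

/* The line from the question, y = sel(3, 7, (x > 11)), in both forms. */
int pick_branch(int x) { return SEL_BRANCH(3, 7, x > 11); }
int pick_mask(int x)   { return SEL_MASK(3, 7, x > 11); }

/* Hypothetical self-check: both forms agree for s in {0, 1}. */
void check_sel_equivalence(void)
{
    assert(sel_branch(3, 7, 0) == sel_mask(3, 7, 0));
    assert(sel_branch(3, 7, 1) == sel_mask(3, 7, 1));
    assert(pick_branch(11) == pick_mask(11));
    assert(pick_branch(12) == pick_mask(12));
}
```

Because the two forms compute the same value for every valid s, the
compiler is free to pick whichever instruction sequence it judges best
for either spelling.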

Given a machine with conditional move instructions, either version
should translate into straight-line code with conditional moves.  ARM
and Alpha have them.  So does the x86 from the Pentium Pro (i686)
onward, in the form of CMOVcc, though older x86 processors do not.

The moral of the story: the optimizer is pretty good.  Trying to fake
it into generating "more optimal" code by tricks such as you tried
probably won't work.

If the machine DOES have conditional moves but the optimizer doesn't
seem to know that, assembly language is a last resort... but before
you do that, make sure you told the compiler everything it needs to
know, for example which flavor processor it should compile for.
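
One way to check what the compiler does for a given processor, as a
sketch: put the selection into small functions and compare the assembly
for different targets.  The file name and function names below are
hypothetical; -O2, -S, -m32 and -march= are standard GCC options, but
which instructions actually appear depends on the GCC version and the
target, and -m32 assumes 32-bit support is installed on the host.

```c
/* Hypothetical file name: sel_test.c
 *
 * Compare the output for two processor flavors, for example:
 *   gcc -Wall -O2 -m32 -march=i386 -S sel_test.c -o sel_i386.s
 *   gcc -Wall -O2 -m32 -march=i686 -S sel_test.c -o sel_i686.s
 * then read the two .s files side by side.
 */

/* Conditional-expression form of sel(3, 7, (x > 11)). */
int pick_branch(int x)
{
    return (x > 11) ? 7 : 3;
}

/* Arithmetic form of the same selection. */
int pick_mask(int x)
{
    return 3 + ((7 - 3) & -(x > 11));
}
```

If both functions come out the same in each .s file, the hand-written
arithmetic form has bought nothing for that target.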

	 paul

