A puzzle: different optimization for compound-expressions

Bruno Loff bruno.loff@gmail.com
Sun Nov 1 12:31:00 GMT 2015


Ok, I got it, thank you Marc.

On 31 October 2015 at 21:21, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Sat, 31 Oct 2015, Bruno Loff wrote:
>
>> I am always impressed by the power of the GCC optimizer. Today I found
>> a somewhat surprising abnormality when using compound-expressions.
>> Look at the two definitions for the function f(a) = a*a + a:
>>
>>
>> int64_t f1( int64_t a ) {
>>    return a * a + a;
>> }
>>
>> int64_t f2( int64_t a ) {
>>    return ({
>>        int64_t b;
>>        b = a * a;
>>        ({
>>            int64_t c;
>>            c = b + a;
>>            c;
>>        });
>>    });
>> }
>>
>> I expected that GCC would either make a mess with the second
>> definition, or would smartly produce the same code for both
>> definitions. I was wrong. Here is the (simplified) x86-64 output
>> with -O3:
>>
>> f1:
>>       leaq    1(%rdi), %rax
>>       imulq   %rdi, %rax
>>       ret
>>
>> f2:
>>       movq    %rdi, %rax
>>       imulq   %rdi, %rax
>>       addq    %rdi, %rax
>>       ret
>>
>>
>>
>> The code for f2 is what I expected, but if I was a little smarter (and
>> knew more asm) I might have instead expected f1. The code for f1
>> basically does
>>
>> b := a + 1
>> b := b * a
>>
>> Whereas the code for f2 does:
>>
>> b := a
>> b := b * a
>> b := b + a
>>
>> The code for f1 is clearly better, saving one instruction. The two
>> definitions are, of course, completely equivalent.
>
>
> It isn't that obvious to me which version is better, but I agree that both
> should generate the same code.
>
>> So why is GCC failing to optimize the compound expressions all the
>> way? My guess would be that it has to do with the order in which some
>> optimization passes are happening. Anyone?
>
>
> A number of optimizations happen, for historical reasons, during parsing,
> when the front-end calls functions from fold-const.c on expressions. We are
> currently moving many such optimizations to a later stage (using match.pd);
> if this transformation is moved, it will also apply to f2.
>
> -fdump-tree-all can give you a lot of information about the various stages
> of optimization.
>
> --
> Marc Glisse
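
For anyone who wants to reproduce the comparison, the sketch below collects
the two definitions from the thread into one translation unit. The file name
sketch.c, the hand-factored variant f3 and the helper check_equivalent are
assumptions added for illustration; they are not part of the original
messages. f3 spells out the a * (a + 1) form that the leaq/imulq sequence for
f1 corresponds to. The ({ ... }) construct is a GNU C extension (the GCC
manual calls it a statement expression), so this is not strictly portable C.

```c
/* sketch.c: hypothetical file name, used only for this illustration.
 *
 * Compare the generated assembly with:
 *     gcc -O3 -S sketch.c
 * and look at the intermediate optimization stages with:
 *     gcc -O3 -c -fdump-tree-all sketch.c
 */
#include <assert.h>
#include <stdint.h>

/* Direct form, as in the original post. */
int64_t f1(int64_t a) {
    return a * a + a;
}

/* Nested statement-expression form (GNU C extension), as in the original post. */
int64_t f2(int64_t a) {
    return ({
        int64_t b;
        b = a * a;
        ({
            int64_t c;
            c = b + a;
            c;
        });
    });
}

/* Hand-factored form (an assumption for comparison): (a + 1) * a. */
int64_t f3(int64_t a) {
    return (a + 1) * a;
}

/* Check that all three agree on a small range where nothing overflows. */
void check_equivalent(void) {
    for (int64_t a = -1000; a <= 1000; ++a) {
        assert(f1(a) == f2(a));
        assert(f1(a) == f3(a));
    }
}
```

The exact instructions emitted will depend on the GCC version and target, so
the listings quoted above should be read as what one particular build
produced, not as guaranteed output. Calling check_equivalent() from a small
main() only confirms that the three functions compute the same values on the
tested range; it says nothing about which instruction sequence is faster.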


