Finding the optimization that is making the change
Will Hawkins
whh8b@virginia.edu
Fri Aug 11 21:27:00 GMT 2017
On Fri, Aug 11, 2017 at 4:36 PM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On 11 August 2017 at 19:44, Will Hawkins <whh8b@virginia.edu> wrote:
>> On Fri, Aug 11, 2017 at 1:15 PM, Will Hawkins <whh8b@virginia.edu> wrote:
>>> On Fri, Aug 11, 2017 at 1:14 PM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>>>> On 11 August 2017 at 18:09, Will Hawkins <whh8b@virginia.edu> wrote:
>>>>> Hello everyone!
>>>>>
>>>>> First, thank you all for your participation in the gcc community -- I
>>>>> firmly believe that one of the great strengths of free software is the
>>>>> community of people that participate in its development, maintenance
>>>>> and support. So, thank you!
>>>>>
>>>>> I have a simple C program and I am attempting to determine which of
>>>>> the optimizations at O1 cause a particular transformation. In order to
>>>>> isolate the optimizations enabled at O1 vs O0, I followed an idea set
>>>>> out in the gcc man page and ran the following command:
>>>>>
>>>>> $ diff <(gcc -Q -O1 --help=optimizers) <(gcc -Q --help=optimizers) |
>>>>> grep enabled | awk '{print $2;}' > optimizations
>>>>>
>>>>> Then I compiled with the following command:
>>>>>
>>>>> gcc -o scfi.poptim `cat optimizations | tr '\n' ' '` scfi.c
>>>>>
>>>>> I compared scfi.poptim with scfi.optim that came from running this command:
>>>>>
>>>>> gcc -o scfi.optim -O1 scfi.c
>>>>>
>>>>> I expected that scfi.optim and scfi.poptim would be (largely)
>>>>> identical. That is not the case, however. It does not look like the
>>>>> scfi.poptim program has been optimized at all.
>>>>
>>>> Because you didn't specify any -O optimization option, which means
>>>> there is no optimization done at all. See
>>>> https://gcc.gnu.org/wiki/FAQ#optimization-options
>>>>
>>>> Options to enable/disable individual optimizations have no effect if
>>>> the optimizers aren't run at all.
>>>>
>>>>> I was wondering if anyone could shed some light on why this is not the
>>>>> case. I ask only because the gcc man page seems to imply that this is
>>>>> the "right" way to isolate the different optimizations performed at
>>>>> different levels.
>>>>
>>>> No, you need to use -O1 -fno-xxx -fno-yyy -fno-zzz
>>>>
>>>> i.e. turn on optimization, then disable individual passes. You can't
>>>> start from nothing and enable individual ones, that gives you nothing.
>>>
>>> Wow! That makes perfect sense. Thank you so much! I will give it a try
>>> and let you know what I find. Thanks for the quick response!
>>
>> I tried your suggestion and I am still getting very odd behavior. I
>> have essentially done the opposite of what I was doing before.
>>
>> gcc -o scfi.poptim -O1 `cat optimizations | sed -e 's/^-f/-fno-/' | tr
>> '\n' ' '` scfi.c
>>
>> which yields the following invocation:
>>
>> gcc -o scfi.poptim -O1 -fno-combine-stack-adjustments
>> -fno-compare-elim -fno-cprop-registers -fno-defer-pop
>> -fno-forward-propagate -fno-guess-branch-probability
>> -fno-if-conversion -fno-if-conversion2
>> -fno-inline-functions-called-once -fno-ipa-profile -fno-ipa-pure-const
>> -fno-ipa-reference -fno-merge-constants -fno-shrink-wrap
>> -fno-split-wide-types -fno-tree-bit-ccp -fno-tree-ccp -fno-tree-ch
>> -fno-tree-copy-prop -fno-tree-copyrename -fno-tree-dce
>> -fno-tree-dominator-opts -fno-tree-dse -fno-tree-fre -fno-tree-sink
>> -fno-tree-slsr -fno-tree-sra -fno-tree-ter scfi.c
>>
>> I would have expected that to build the (largely) same binary as
>>
>> gcc -o scfi -O0 scfi.c
>>
>> and yet it does not. The former is still optimized and the latter is
>> (obviously) not.
>
> As it says at https://gcc.gnu.org/wiki/FAQ#optimization-options "the
> -Ox flags enable many optimizations that are not controlled by any
> individual -f* option. "
>
> You can't reproduce the effects of -O1 by adding flags to unoptimized
> code, and you can't recreate unoptimized code by disabling individual
> optimizations. Unoptimized code is still completely unoptimized, and
> optimized code is not completely unoptimized.

First of all, thank you for continuing to offer your feedback!

>
> This probably won't stop you isolating which optimization causes the
> effect you're interested in, because it's probably one that is
> controlled by a -f flag. Instead of trying to compare apples and
> oranges (unoptimized and optimized) compare -O1 -fno-xxx -fno-yyy
> -fno-zzz and -O1, and then add/remove those -fno-* flags until you
> find the one that causes the effect you're interested in.

Interestingly enough, I think that it will. Here's why I say that.
I have the following C source code:

int calling_cd(int c_or_d) {
  void (*cd)(void) = testing_c;
  switch (c_or_d) {
    case 1:
      cd = testing_c;
      break;
    case 2:
      cd = testing_d;
      break;
  }
  cd();
  return 1;
}

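For anyone wanting to reproduce the comparison, a minimal self-contained sketch of such a file follows. The bodies of testing_c and testing_d are not given in this message, so the versions below (each just records which one ran in a hypothetical last_called variable) are assumptions made purely for illustration; calling_cd itself is unchanged.

```c
/* Hypothetical marker so the effect of each call can be observed.
   It is not part of the original source. */
static int last_called = 0;

/* Placeholder targets: the real bodies are not shown in the message,
   so these are assumptions. */
void testing_c(void) { last_called = 'c'; }
void testing_d(void) { last_called = 'd'; }

/* Same function as in the message: pick a target through a local
   function pointer, then call through it. */
int calling_cd(int c_or_d) {
  void (*cd)(void) = testing_c;
  switch (c_or_d) {
    case 1:
      cd = testing_c;
      break;
    case 2:
      cd = testing_d;
      break;
  }
  cd();
  return 1;
}
```

The sketch has no main, so it is meant to be compiled to an object file (for example with gcc -c at the optimization level of interest) and inspected with objdump -d. The addresses will differ from the listings in this message.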
When I compile at O0, I get the following:

  4005f1:  push   %rbp
  4005f2:  mov    %rsp,%rbp
  4005f5:  sub    $0x20,%rsp
  4005f9:  mov    %edi,-0x14(%rbp)
  4005fc:  movq   $0x40054d,-0x8(%rbp)
  400604:  mov    -0x14(%rbp),%eax
  400607:  cmp    $0x1,%eax
  40060a:  je     400613 <calling_cd+0x22>
  40060c:  cmp    $0x2,%eax
  40060f:  je     40061d <calling_cd+0x2c>
  400611:  jmp    400626 <calling_cd+0x35>
  400613:  movq   $0x40054d,-0x8(%rbp)
  40061b:  jmp    400626 <calling_cd+0x35>
  40061d:  movq   $0x40055d,-0x8(%rbp)
  400625:  nop
  400626:  mov    -0x8(%rbp),%rax
  40062a:  callq  *%rax
  40062c:  mov    $0x1,%eax
  400631:  leaveq
  400632:  retq

And, when I compile with O1 and *every* optimization disabled (as I
included in my previous email), I get the following:

  400643:  sub    $0x8,%rsp
  400647:  mov    $0x40059a,%eax
  40064c:  cmp    $0x1,%edi
  40064f:  je     400658 <calling_cd+0x15>
  400651:  cmp    $0x2,%edi
  400654:  je     400662 <calling_cd+0x1f>
  400656:  jmp    400667 <calling_cd+0x24>
  400658:  mov    $0x40059a,%eax
  40065d:  nopl   (%rax)
  400660:  jmp    400667 <calling_cd+0x24>
  400662:  mov    $0x4005b7,%eax
  400667:  callq  *%rax
  400669:  mov    $0x1,%eax
  40066e:  add    $0x8,%rsp
  400672:  retq

The difference that I am interested in figuring out is what
"optimization" causes the local variable cd to be kept in a register
(%eax) throughout function execution rather than written to the stack
(-0x8(%rbp)) on every assignment.

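One source-level way to observe the same register-versus-stack contrast is sketched below. It does not identify which pass is responsible. Qualifying the pointer object itself as volatile obliges the compiler to perform every read and write of cd as a real memory access, so in practice GCC keeps it in its stack slot even when optimizing. The name calling_cd_volatile, the last_called marker, and the bodies of testing_c and testing_d are assumptions for illustration and do not appear in the original source.

```c
/* Hypothetical marker and placeholder targets, as assumptions: the
   real testing_c/testing_d bodies are not shown in the message. */
static int last_called = 0;

void testing_c(void) { last_called = 'c'; }
void testing_d(void) { last_called = 'd'; }

/* Variant of calling_cd (hypothetical name). The volatile qualifier
   applies to the pointer object cd itself, not to the functions it
   points at, so each assignment to cd is a store the compiler must
   emit and the final call must reload cd from memory. */
int calling_cd_volatile(int c_or_d) {
  void (*volatile cd)(void) = testing_c;
  switch (c_or_d) {
    case 1:
      cd = testing_c;
      break;
    case 2:
      cd = testing_d;
      break;
  }
  cd();
  return 1;
}
```

Compiling this variant and the unqualified calling_cd at the same optimization level, then comparing the two disassemblies, separates "cd lives in memory" from the other differences between the two listings above.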
As you mentioned, I can definitely walk through the remaining
optimizations to get to the code that is generated at "full" O1 but
that's not really the behavior that I am trying to decipher.

I know that this is probably not a "reasonable" question -- I am not
trying to get some behavior for production, I am more interested in
understanding the compiler/optimizer in a way that I can dig into the
code, if I needed to.

Thanks again for taking the time to correspond. I hope that this is
not a waste of your time.

Have a great afternoon!
Will

>
> If you tell us more about what you're trying to achieve (rather than
> how you're trying to do it) maybe someone can save you some time.