This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: Re: Confusing optimization
On Sunday, 09.05.2010 at 21:49 -0700, Ian Lance Taylor wrote:
> Some possibilities are:
>
> 1) Measurement error. Surprisingly often people are not measuring
> what they think they are measuring, and you didn't provide any
> details about how you got your timings.
The function is extern "C" and I measure the time outside of it:
long start=GetTime(); //own function using gettimeofday (returns ms)
A();
long time=GetTime()-start;
//Output time..
So it should be correct..
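A single run with millisecond resolution leaves some room for noise when the difference is only 125ms vs. 130ms. A minimal sketch of a repeated measurement, assuming a compiler with C++11 `<chrono>` (`std::chrono::steady_clock`) instead of my gettimeofday-based GetTime(). The names `A_standin`, `time_runs` and `median_ms` are hypothetical, and the loop body is only a placeholder for the real A():

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Hypothetical stand-in for the function under test; the real A() is not shown here.
extern "C" void A_standin() {
    volatile unsigned long long sink = 0;
    for (unsigned long long i = 0; i < 0xFFFFFFULL; ++i) sink = sink + i;
}

// Call fn `runs` times and return every wall-clock duration in milliseconds.
template <typename Fn>
std::vector<double> time_runs(Fn fn, int runs) {
    std::vector<double> ms;
    for (int r = 0; r < runs; ++r) {
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        fn();
        std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// Median of the samples (upper median for even counts); 0.0 for an empty set.
double median_ms(std::vector<double> ms) {
    if (ms.empty()) return 0.0;
    std::sort(ms.begin(), ms.end());
    return ms[ms.size() / 2];
}
```

Comparing `median_ms(time_runs(A_standin, 10))` between the two builds, together with the smallest and largest sample, would show whether 5ms is outside the run-to-run spread.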
> 2) Instruction cache effects, if a() and b() call other functions.
> When both are linked together, those other functions will be at
> different addresses, and whether they are contiguous may change,
> all affecting the instruction cache.
> 3) Exact alignment of loop starts may shift when both are linked
> together, affecting the processor's branch optimizers if it has
> any. Similarly, the exact alignment of labels may shift. You can
> control these using gcc options like -falign-functions,
> -falign-jumps, -falign-labels, -falign-loops.
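Whether a function start moved between the two builds can also be checked from inside the program. A rough sketch, assuming GCC (or a compatible compiler) and its `aligned` function attribute, which pins one function to a fixed boundary independently of the -falign-* options. `a_default`, `a_aligned64` and `offset_within` are hypothetical names, not my real a() and b():

```cpp
#include <cstdint>

// Hypothetical stand-in with whatever alignment the compiler and linker choose.
extern "C" __attribute__((noinline)) unsigned long a_default(unsigned long n) {
    unsigned long sum = 0;
    for (unsigned long i = 0; i < n; ++i) sum += i;
    return sum;
}

// Same body, but the first instruction is forced onto a 64-byte boundary.
extern "C" __attribute__((noinline, aligned(64))) unsigned long a_aligned64(unsigned long n) {
    unsigned long sum = 0;
    for (unsigned long i = 0; i < n; ++i) sum += i;
    return sum;
}

// Offset of a function's entry address within a block of `boundary` bytes.
template <typename Fn>
std::uintptr_t offset_within(Fn* fn, std::uintptr_t boundary) {
    return reinterpret_cast<std::uintptr_t>(fn) % boundary;
}
```

Printing `offset_within(&a_default, 64)` in the build with and without b() would show whether linking b() shifted the start of a(). The attribute only fixes the function entry, so loop starts inside the function can still move.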
Well, I have now checked the asm..
.. Ubuntu 10.04 with g++ 4.4.3
In each pair of lines below:
the first line is function "a" alone (125ms),
the second line is function "a" together with "b" (130ms).
...
.text:00000000004023BA jmp short loc_402425
.text:00000000004023BA jmp short loc_40241F
...
.text:0000000000402425 cmp [rsp+298h+var_118], 3
.text:000000000040241F cmp [rsp+298h+var_118], 3
.text:000000000040242D jz short loc_40243E
.text:0000000000402427 jz short loc_402438
...
LOOP START
...
.text:000000000040243E cmp [rsp+298h+var_110], 0FFFFFEh
.text:0000000000402438 cmp [rsp+298h+var_110], 0FFFFFEh
.text:000000000040244A setle al
.text:0000000000402444 setle al
.text:000000000040244D test al, al
.text:0000000000402447 test al, al
.text:000000000040244F jnz loc_4023BC
.text:0000000000402449 jnz loc_4023BC
...
.text:00000000004023C3 cmp [rsp+298h+var_98], 3
.text:00000000004023BC cmp [rsp+298h+var_58], 3
.text:00000000004023CA jz short loc_4023D9
.text:00000000004023C4 jz short loc_4023D3
...
.text:00000000004023D9 mov rax, [rsp+298h+var_110]
.text:00000000004023D3 mov rax, [rsp+298h+var_110]
.text:00000000004023E1 add [rsp+298h+var_90], rax
.text:00000000004023DB add [rsp+298h+var_50], rax
.text:00000000004023E9 cmp [rsp+298h+var_D8], 3
.text:00000000004023E3 cmp [rsp+298h+var_98], 3
.text:00000000004023F1 jz short loc_4023FF
.text:00000000004023EB jz short loc_4023F9
...
.text:00000000004023FF inc [rsp+298h+var_D0]
.text:00000000004023F9 inc [rsp+298h+var_90]
.text:0000000000402407 cmp [rsp+298h+var_118], 3
.text:0000000000402401 cmp [rsp+298h+var_118], 3
.text:000000000040240F jz short loc_40241D
.text:0000000000402409 jz short loc_402417
...
.text:000000000040241D inc [rsp+298h+var_110]
.text:0000000000402417 inc [rsp+298h+var_110]
.text:0000000000402425 cmp [rsp+298h+var_118], 3
.text:000000000040241F cmp [rsp+298h+var_118], 3
.text:000000000040242D jz short loc_40243E
.text:0000000000402427 jz short loc_402438
...
LOOP END
...
C++ code:
for(i=0L;i<0xFFFFFFL;i++)
{
Temp+=i;
Test++;
}
=, ++, += and < are overloaded operators..
So, well.. it must be the cache or the alignment..
Maybe I should flip my ifs.. the jz is taken every time.. hmmm
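Instead of reordering the ifs by hand, GCC can be told which way a branch usually goes with `__builtin_expect`. A minimal sketch under loud assumptions: `Value`, `kIntTag`, `UNLIKELY` and `run_loop` are hypothetical, and the tag value 3 is only a guess taken from the "cmp ..., 3" lines in the listing, not my real class:

```cpp
// GCC/Clang builtin: hint that the condition is expected to be false.
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

enum { kIntTag = 3 };  // assumed meaning of the constant 3 in the listing

// Hypothetical tagged value type with the operators used in the loop.
struct Value {
    int  tag;
    long num;

    Value(long n = 0) : tag(kIntTag), num(n) {}

    Value& operator+=(const Value& rhs) {
        if (UNLIKELY(tag != kIntTag || rhs.tag != kIntTag)) return *this;  // rare path, left empty
        num += rhs.num;  // common path
        return *this;
    }
    Value& operator++() {
        if (UNLIKELY(tag != kIntTag)) return *this;  // rare path, left empty
        ++num;
        return *this;
    }
    Value operator++(int) {  // postfix form, as used in the loop
        Value old = *this;
        ++*this;
        return old;
    }
    bool operator<(long limit) const { return num < limit; }
};

// Same shape as the loop in this mail; returns the accumulated sum.
long run_loop(long limit) {
    Value i, Temp, Test;
    for (i = 0L; i < limit; i++) {
        Temp += i;
        Test++;
    }
    return Temp.num;
}
```

The hint only influences how the compiler lays out the branches, so whether the common case ends up on the fall-through path depends on the optimization level and has to be confirmed in the asm again.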
I hope I answered this mail in the correct way..
This is only my 3rd time using a mailing list.. :P
Luca.