Why is the performance of a 32-bit program worse than a 64-bit program running on the same 64-bit system? They are compiled from the same source. Which gcc option can fix it?

David Brown david@westcontrol.com
Tue Mar 25 10:12:00 GMT 2014


On 25/03/14 04:31, Xinrong Fu wrote:
> Hi guys:
>    What does the number of stalled cycles in the CPU pipeline frontend
> means? Why is the stalled frontend cycles of 32bit program more than
> 64bit program's stalled cycles when they running on same 64bit system?
> Is there any gcc options to fix it?
> 

Are you asking why the same program runs faster when compiled as 64-bit
rather than 32-bit?  There are /many/ reasons why 64-bit x86 code can be
faster than 32-bit x86 code - without having any idea about your code,
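we can only make general points.  In comparison to 32-bit x86, the
64-bit mode has access to more registers (16 general-purpose registers
rather than 8), has wider registers (which speeds data movement and
64-bit arithmetic), less complicated instruction decoding and
instruction prefixes, more efficient floating point (SSE2 is always
available, so the compiler need not fall back on the x87 stack), and
much more efficient calling conventions (arguments are passed in
registers where possible, rather than on the stack).  It has the
disadvantage that pointers take up twice as much data cache and memory
bandwidth, as they are twice the size.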

As for gcc options to "fix" it, there is no problem to fix - it is
normal that 64-bit code is a bit more efficient than 32-bit code from
the same program, but details vary according to the code in question.

One thing I notice from your post is that you are compiling without
enabling optimisation, which cripples the performance of the generated
code.
Enabling "-O2" will probably make your code several times faster (again,
without information on the program, I can only make general statements).
 Different optimisation settings like "-Os", "-O3", and individual
optimisation flags may or may not make the code faster, but "-O2" is a
good start.




