Choosing the right -march target architecture

Mon Aug 31 11:30:00 GMT 2015


I'm compiling a program which will be distributed in binary form
(the source code is not open at this time). I am trying to pick the
"right" optimization flags.

The program is written in C++, but most of the execution time is spent
in a C library. (I'm using gcc 4.8.4 on the amd64-linux-gnu platform.)

I will test -O2, -O3, and -Os.

I think -fomit-frame-pointer also helps sometimes?
(It might already be enabled for -O{2,3,s} on amd64?)

I also wanted to specify -march because I think it may allow gcc to
use SSE2. (Although SSE2 may be enabled by default on amd64?)

I've tried -march=core2 but I'm not sure older AMD chips support the same
extensions (SSSE3 for example).

So maybe -march=core2 -mno-ssse3 ?

But after reading the documentation more closely, it appears there is
an option addressing my use-case: -mtune=generic (and in fact, it looks
like Ubuntu's gcc was compiled with --with-tune=generic so this should
be the default, IIUC).

> Produce code optimized for the most common IA32/AMD64/EM64T
> processors. If you know the CPU on which your code will run, then you
> should use the corresponding -mtune or -march option instead of
> -mtune=generic. But, if you do not know exactly what CPU users of
> your application will have, then you should use this option.
> As new processors are deployed in the marketplace, the behavior of
> this option will change. Therefore, if you upgrade to a newer version
> of GCC, code generation controlled by this option will change to
> reflect the processors that are most common at the time that version
> of GCC is released.

Is it documented somewhere how -mtune=generic has changed over releases
4.7, 4.8, 4.9, 5.0? (I'd like to get a feel for which CPUs are targeted.)

