optimization based on CPU type or flags?
Hei Chan
structurechart@yahoo.com
Sat Apr 6 01:50:00 GMT 2013
Hi,
Thanks for your reply. Very useful.
I haven't tried your perl script yet but it seems promising :)
I found gcc/config/i386/i386.c in GCC trunk, but I don't see anything for corei7 in it. Or does corei7 live in a different folder?
I am also curious how those cost models are generated. I wonder whether there is a way to generate one specific to my machine.
Thanks in advance.
Cheers,
Hei
________________________________
From: Ryan Hill <dirtyepic@gentoo.org>
To: gcc-help@gcc.gnu.org
Sent: Wednesday, April 3, 2013 10:32 PM
Subject: Re: optimization based on CPU type or flags?
On Wed, 3 Apr 2013 20:49:40 -0700 (PDT)
Hei Chan <structurechart@yahoo.com> wrote:
> Hi,
>
> I noticed that I can specify -march=native and -mtune=native.
>
> I have a few questions:
> - On some boxes, I noticed that if I specify -march=native and -mtune=native,
> g++ might "convert" -march=native into -march=corei7-avx along with a list of
> flags such as -mcx16 -msahf -mavx -msse4.2 --param l1-cache-size=32 --param
> l1-cache-line-size=64 --param l2-cache-size=10240 (there are a lot more flags,
> but I had to take them out because the GCC mailing server marked my mail as
> spam when I listed them all), and -mtune=native into -mtune=generic. How come
> gcc recognizes the CPU type as corei7-avx for -march but can't recognize it
> for -mtune?
It turns out that for some CPUs -mtune=generic can produce better code than
-mtune=<cpu>, mostly for CPUs that don't have a specifically tuned cost model,
IIUC.
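
One way to check what -march=native actually switched on is to look at the
compiler's predefined macros from inside a program. The sketch below is only an
illustration: the function names are made up for this example, and it checks
just three of the many ISA macros (__SSE4_1__, __SSE4_2__, __AVX__) that GCC
defines when the corresponding -m flags are in effect. It says nothing about
which -mtune model was chosen.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Lists the ISA extensions this translation unit was compiled with, judged
// only by the compiler's predefined macros. The list is deliberately short;
// it is a sketch, not a complete inventory.
std::vector<std::string> enabled_isa_extensions() {
    std::vector<std::string> out;
#ifdef __SSE4_1__
    out.push_back("sse4.1");
#endif
#ifdef __SSE4_2__
    out.push_back("sse4.2");
#endif
#ifdef __AVX__
    out.push_back("avx");
#endif
    return out;
}

// Convenience lookup (hypothetical helper): true if `name` is in the list.
bool has_extension(const std::string& name) {
    const std::vector<std::string> exts = enabled_isa_extensions();
    return std::find(exts.begin(), exts.end(), name) != exts.end();
}
```

Building the same file once with -march=native and once without, and printing
the returned list from main(), shows the difference. The full set of macros can
also be dumped without writing any code via
g++ -march=native -dM -E -x c++ /dev/null.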
> - Let's say g++ figures out that my CPU supports sse4.1. How does it choose
> between sse4.1 instructions and another set of instructions (assuming both
> achieve the same task)? Does g++ have a database from which it computes the
> number of CPU cycles for all the possible combinations before it decides?
Yes, there are cost models that are used to decide which instructions to use.
See gcc/config/i386/i386.c for example.
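
To make the idea concrete, here is a toy version of a cost-model decision. It
is not GCC's code: the struct, the field names and every number are
hypothetical, chosen only to show how a per-CPU table of relative costs can
flip the choice between equivalent instruction sequences (here, three ways to
compute x*9).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-CPU table of relative instruction costs.
// Field names and values are illustrative, not taken from GCC.
struct CostTable {
    int add;
    int lea;
    int shift;
    int mult;
};

enum class Op { Add, Lea, Shift, Mult };

// One candidate instruction sequence for the same computation.
struct Candidate {
    std::string name;
    std::vector<Op> ops;
};

int op_cost(const CostTable& t, Op op) {
    switch (op) {
        case Op::Add:   return t.add;
        case Op::Lea:   return t.lea;
        case Op::Shift: return t.shift;
        case Op::Mult:  return t.mult;
    }
    return 0;  // unreachable for valid Op values
}

// Total cost of a sequence is the sum of its per-instruction costs.
int sequence_cost(const CostTable& t, const Candidate& c) {
    int total = 0;
    for (Op op : c.ops) total += op_cost(t, op);
    return total;
}

// Returns the index of the cheapest candidate; ties go to the earlier one.
// The candidate list must not be empty.
std::size_t pick_cheapest(const CostTable& t, const std::vector<Candidate>& cs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < cs.size(); ++i) {
        if (sequence_cost(t, cs[i]) < sequence_cost(t, cs[best])) best = i;
    }
    return best;
}
```

With the candidates {"imul", {Mult}}, {"lea", {Lea}} and
{"shift+add", {Shift, Add}}, a table where mult is expensive picks the lea
form, while a table where lea is expensive picks shift+add. The real tables
are larger and feed many more decisions than this one.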
> - Also, how do the l1-cache-size and l2-cache-size parameters affect the
> optimization? Do they only affect inlining decisions, or is there more?
Not sure on that one.
> - Is there any tool that will tell me which CPU-specific instructions are
> used (other than going through the object files manually with objdump one by
> one)?
https://github.com/dirtyepic/scripts/blob/master/analyze-x86 is a clumsy
attempt. It probably misses a lot, though, and I haven't updated it in a couple
of releases. If there's another tool like this out there I'd definitely be
interested.
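
The general approach can be sketched in a few lines: read objdump -d output,
pull out each mnemonic, and look it up in a table that maps mnemonics to ISA
extensions. The code below is a minimal illustration and not a port of the
script above. The function names are invented, the table holds only a handful
of entries, and it assumes the usual tab-separated objdump layout
(address, raw bytes, then mnemonic and operands).

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Tiny, deliberately incomplete mnemonic -> extension table (illustrative).
const std::map<std::string, std::string>& extension_table() {
    static const std::map<std::string, std::string> table = {
        {"pshufb", "ssse3"},
        {"pmulld", "sse4.1"},
        {"ptest", "sse4.1"},
        {"pcmpgtq", "sse4.2"},
        {"vaddps", "avx"},
        {"vmulpd", "avx"},
    };
    return table;
}

// Extracts the mnemonic from one disassembly line, assuming the layout
// "address:<TAB>bytes<TAB>mnemonic operands". Returns "" for other lines
// (section headers, labels, byte-only continuation lines).
std::string mnemonic_of(const std::string& line) {
    const std::size_t first = line.find('\t');
    if (first == std::string::npos) return "";
    const std::size_t second = line.find('\t', first + 1);
    if (second == std::string::npos) return "";
    std::istringstream rest(line.substr(second + 1));
    std::string mnemonic;
    rest >> mnemonic;
    return mnemonic;
}

// Counts, per extension, how many instructions in the stream belong to it.
std::map<std::string, int> count_extensions(std::istream& in) {
    std::map<std::string, int> counts;
    std::string line;
    while (std::getline(in, line)) {
        const auto it = extension_table().find(mnemonic_of(line));
        if (it != extension_table().end()) ++counts[it->second];
    }
    return counts;
}
```

Calling count_extensions(std::cin) from main() and piping in
objdump -d ./a.out would give a per-extension tally. Instruction prefixes and
mnemonics that objdump prints with operand-size suffixes are not handled here.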
--
gcc-porting
toolchain, wxwidgets by design, by neglect
@ gentoo.org for a fact or just for effect