This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: MMX built-ins performance oddities


- vector version is about 3% faster than above instead of 10% slower - wow!
So why is GCC 4.0 producing worse code when using Intel-style intrinsics, and why isn't the union version using builtins as fast as the vector version?

I can answer why unions are slower: they are spilled to memory on every assignment. GCC 4.0 knows how to replace a struct with separate scalar variables (one per member), but it cannot do the same for unions. GCC 3.4 could do neither.
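
To make the comparison concrete, here is a minimal sketch of the two styles, assuming GCC's generic vector extension rather than the MMX builtins themselves. The names v4hi_union, add_union, add_vector and store_lanes are made up for illustration and are not taken from the original test case.

```c
#include <string.h>

/* GCC vector extension: four 16-bit lanes in 8 bytes, the width of an
   MMX register.  (Illustrative type, not from the original test case.) */
typedef short v4hi __attribute__((vector_size(8)));

/* Union wrapper: the style described above as being spilled to memory
   on every assignment.  Hypothetical name. */
typedef union {
    v4hi  v;
    short s[4];
} v4hi_union;

/* Union version: operands and result travel inside the union. */
static v4hi_union add_union(v4hi_union a, v4hi_union b)
{
    v4hi_union r;
    r.v = a.v + b.v;   /* element-wise add on the vector member */
    return r;
}

/* Vector version: operands and result are plain vector values. */
static v4hi add_vector(v4hi a, v4hi b)
{
    return a + b;      /* element-wise add */
}

/* Copy the four lanes out to an ordinary array for inspection. */
static void store_lanes(v4hi v, short out[4])
{
    memcpy(out, &v, sizeof v);
}
```

Both functions compute the same result; the difference, if any, is in the generated code. Compiling with something like "gcc -O2 -S" and comparing the assembly for add_union and add_vector is one way to see whether the union operands go through the stack on a given compiler version.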


As for why vectors are faster: a lot of the vector support was rewritten in GCC 4.0, so that may be the reason.

I do not know exactly why builtins are still slower, but you may want to create a PR and add me to the CC list (bonzini@gcc.gnu.org).

Paolo