[Bug target/14552] compiled trivial vector intrinsic code is inefficient

Thu Mar 20 17:18:00 GMT 2008

------- Comment #35 from michaelni at gmx dot at  2008-03-20 17:18 -------
Subject: Re:  compiled trivial vector intrinsic code is
        inefficient

On Thu, Mar 20, 2008 at 09:49:22AM -0000, ubizjak at gmail dot com wrote:
> 
> 
> ------- Comment #34 from ubizjak at gmail dot com  2008-03-20 09:49 -------
> (In reply to comment #33)
> 
> > Anyway, I am glad ffmpeg compiles fine under icc.
> 
> Me too. Now you will troll in their support lists.

No, truth be told, I don't plan to switch to icc yet. Somehow I do prefer to use
free tools. Of course, if the gap becomes too big, I as well as most others
will switch to icc ...
Also, ffmpeg uses asm() almost entirely instead of intrinsics, so this alone is
not as much of a problem for ffmpeg as it is for others who followed the
recommendation that "intrinsics are better than asm".

About trolling: well, I made no attempt to reply politely and diplomatically, no.
But "solving" a "problem" in some use case by dropping support for that use
case is kind of extreme.

The way I see it is that
* It's nontrivial to place emms optimally and automatically
* there needs to be an emms between MMX code and FPU code

The solution to this would be any one of
A. let the programmer place emms, as has been done in the past
B. don't support MMX at all
C. don't support the x87 FPU at all
D. place emms after every bunch of MMX instructions
E. solve a quite nontrivial problem and place emms optimally

The solution which has apparently been selected is B. Why was that chosen
instead of, let's say, A.?

If I write SIMD code, then I know that I need an emms on x86. It's
trivial for the programmer to place it optimally.
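
To illustrate option A, here is a minimal sketch of programmer-placed emms,
assuming a compiler that provides <mmintrin.h> and defines __MMX__ (a scalar
fallback is used otherwise). The function names add_sat_s16 and mean_s16 are
hypothetical and are not taken from ffmpeg or gcc. The single _mm_empty()
call, which emits emms, sits after the whole MMX loop, so the later
floating-point code is safe to run on the x87 FPU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper: dst[i] = saturate(a[i] + b[i]) for n 16-bit samples. */
static void add_sat_s16(int16_t *dst, const int16_t *a, const int16_t *b, int n)
{
    int i = 0;
#if defined(__MMX__)
    /* MMX part: four 16-bit lanes per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m64 va, vb, vr;
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        vr = _mm_adds_pi16(va, vb);      /* saturating signed add */
        memcpy(dst + i, &vr, sizeof vr);
    }
    /* One emms, placed by the programmer after all MMX work is done. */
    _mm_empty();
#endif
    /* Scalar tail (and full fallback when MMX is unavailable). */
    for (; i < n; i++) {
        int32_t s = (int32_t)a[i] + (int32_t)b[i];
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        dst[i] = (int16_t)s;
    }
}

/* Hypothetical floating-point consumer that may run on the x87 FPU. */
static double mean_s16(const int16_t *v, int n)
{
    double sum = 0.0;
    assert(n > 0);
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum / n;
}
```

Calling add_sat_s16() and then mean_s16() on the result is the mmx-then-fpu
pattern from the list above. Option D would instead put the emms inside the
loop, paying its cost on every iteration.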

[...]

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552