[Bug tree-optimization/66002] paq8p benchmark 50% slower than clang on sandybridge
trippels at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon May 4 14:11:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66002
--- Comment #6 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> prephitmp_61 = _53 <= 65535 ? pretmp_60 : -32768;
>
> is
>
> unsigned int <= 65535 ? short int : short int;
>
> pushing the condition to a separate stmt might get us to support this
> "narrowing" conversion.
>
> Of course ifcvt does a pretty poor job on this as well...
>
> We do vectorize
>
> for (int i=0; i<n; ++i) {
>   int wt=w[i]+((t[i]*err*2>>16)+1>>1);
>   if (wt<-32768) wt=-32768;
>   // if (wt>32767) wt=32767;
>   w[i]=wt;
> }
>
> as if (wt<-32768) wt=-32768; becomes a MAX_EXPR. Also if I change it to
>
> for (int i=0; i<n; ++i) {
>   int wt=w[i]+((t[i]*err*2>>16)+1>>1);
>   if (wt<-32768) wt=-32768;
>   else if (wt>32767) wt=32767;
>   w[i]=wt;
> }
>
> we vectorize it as MIN/MAX_EXPRs.
>
> Maybe you can perform this source change manually and see what it does
> to performance.
With the "else" added, gcc beats clang:
./paq8p -4 file1.in 24.81s user 0.10s system 100% cpu 24.902 total
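
For reference, the source change under test can be sketched as below. This is a minimal stand-alone illustration, not the actual paq8p source: the function names train_two_ifs and train_else_if are hypothetical, and the parameter types (short arrays, int err) are an assumption based on the "short int" operands in the quoted GIMPLE. Both variants compute the same result, since a value cannot be below -32768 and above 32767 at once; only the shape of the control flow differs.

```c
/* Hypothetical stand-ins for the loop discussed above. The types of
   t, w and err are assumptions, not taken from the paq8p source.
   Note: >> on a negative int is implementation-defined in C; GCC and
   clang both perform an arithmetic shift. */

/* Two independent tests, as implied by the commented-out line in the
   quoted code. */
void train_two_ifs(const short *t, short *w, int n, int err)
{
    for (int i = 0; i < n; ++i) {
        int wt = w[i] + ((((t[i] * err * 2) >> 16) + 1) >> 1);
        if (wt < -32768) wt = -32768;
        if (wt > 32767) wt = 32767;
        w[i] = (short)wt;
    }
}

/* Same clamp with the "else" added: the form reported to vectorize
   as MIN/MAX_EXPRs. */
void train_else_if(const short *t, short *w, int n, int err)
{
    for (int i = 0; i < n; ++i) {
        int wt = w[i] + ((((t[i] * err * 2) >> 16) + 1) >> 1);
        if (wt < -32768) wt = -32768;
        else if (wt > 32767) wt = 32767;
        w[i] = (short)wt;
    }
}
```

Compiling such a file with something like "gcc -O3 -fopt-info-vec" should report which of the two loops gets vectorized; the outcome depends on the GCC version and target.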