[Bug tree-optimization/66002] paq8p benchmark 50% slower than clang on sandybridge

trippels at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon May 4 14:11:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66002

--- Comment #6 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
>   prephitmp_61 = _53 <= 65535 ? pretmp_60 : -32768;
> 
> is
> 
>      unsigned int <= 65535 ? short int : short int;
> 
> pushing the condition to a separate stmt might get us to support this
> "narrowing" conversion.
> 
> Of course ifcvt does a pretty poor job on this as well...
> 
> We do vectorize
> 
>     for (int i=0; i<n; ++i) {
>         int wt=w[i]+((t[i]*err*2>>16)+1>>1);
>         if (wt<-32768) wt=-32768;
> //      if (wt>32767) wt=32767;
>         w[i]=wt;
>     }
> 
> as if (wt<-32768) wt=-32768; becomes a MAX_EXPR.  Also if I change it to
> 
>     for (int i=0; i<n; ++i) {
>         int wt=w[i]+((t[i]*err*2>>16)+1>>1);
>         if (wt<-32768) wt=-32768;
>         else if (wt>32767) wt=32767;
>         w[i]=wt;
>     }
> 
> we vectorize it as MIN/MAX_EXPRs.
> 
> Maybe you can perform this source change manually and see what it does
> to performance.

With the "else" added, gcc beats clang:
./paq8p -4 file1.in  24.81s user 0.10s system 100% cpu 24.902 total
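
For reference, a minimal sketch of the source change being discussed. The
function names train_two_ifs, train_else_if and check_equivalence are
hypothetical, the short* array types are an assumption (the quoted loops do
not show the declarations), and the loop bodies are the quoted ones with
explicit parentheses. The sketch only shows that the two forms store the same
values, so the "else" is a safe manual change; it does not measure anything.

```c
#include <assert.h>

/* Two independent ifs: the form reported above as not being vectorized. */
static void train_two_ifs(const short *t, short *w, int n, int err) {
    for (int i = 0; i < n; ++i) {
        /* Same arithmetic as the quoted loop, parenthesized explicitly.
           Right-shifting a negative int is implementation-defined in C;
           gcc and clang use an arithmetic shift. */
        int wt = w[i] + ((((t[i] * err * 2) >> 16) + 1) >> 1);
        if (wt < -32768) wt = -32768;
        if (wt > 32767) wt = 32767;
        w[i] = (short)wt;
    }
}

/* Same loop with the "else" added: the form comment #4 reports as
   being vectorized with MIN/MAX_EXPRs. */
static void train_else_if(const short *t, short *w, int n, int err) {
    for (int i = 0; i < n; ++i) {
        int wt = w[i] + ((((t[i] * err * 2) >> 16) + 1) >> 1);
        if (wt < -32768) wt = -32768;
        else if (wt > 32767) wt = 32767;
        w[i] = (short)wt;
    }
}

/* Both variants must store identical results, because the two conditions
   are mutually exclusive: a value clamped to -32768 can never exceed 32767. */
static void check_equivalence(void) {
    const short t[4] = {32767, -32768, 1000, 0};
    short a[4] = {32767, -32768, -5, 7};
    short b[4] = {32767, -32768, -5, 7};

    train_two_ifs(t, a, 4, 4000);
    train_else_if(t, b, 4, 4000);

    for (int i = 0; i < 4; ++i)
        assert(a[i] == b[i]);
    assert(a[0] == 32767);  /* 32767 + 2000 saturates at the upper bound */
}
```

Building both variants with something like "gcc -O3 -fopt-info-vec" should
show which of the two loops the vectorizer accepts on a given gcc version.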


