[Bug tree-optimization/55016] New: request for specific builtins for rcp and rsqrt

vincenzo.innocente at cern dot ch gcc-bugzilla@gcc.gnu.org
Mon Oct 22 06:44:00 GMT 2012


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55016

             Bug #: 55016
           Summary: request for specific builtins for rcp and rsqrt
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vincenzo.innocente@cern.ch


There are cases where the use of approximate rcp and rsqrt suffices.

I wonder if it would be possible to introduce specific "generic" builtins for
"rcp" and "rsqrt" that produce the proper instruction depending on the target
architecture (SSE, AVX, etc.) and possibly generate vector instructions in a loop.

At the moment anything like this is target-specific, inefficient, and does not
vectorize! For example:

#include <x86intrin.h>
float v0[1024];
float v1[1024];
inline
float rsqrtf( float x ) {
  return _mm_cvtss_f32( _mm_rsqrt_ss( _mm_set_ss( x ) ) );
}
void v() {
  for(int i=0; i!=1024; ++i)
    v0[i] = rsqrtf(v1[i]);
}


