[Bug tree-optimization/55016] New: request for specific builtins for rcp and rsqrt
vincenzo.innocente at cern dot ch
gcc-bugzilla@gcc.gnu.org
Mon Oct 22 06:44:00 GMT 2012
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55016
Bug #: 55016
Summary: request for specific builtins for rcp and rsqrt
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: vincenzo.innocente@cern.ch
There are cases where an approximate rcp or rsqrt suffices.
I wonder if it would be possible to introduce specific "generic" builtins for
"rcp" and "rsqrt" that produce the proper instruction depending on the target
architecture (SSE, AVX, etc.) and possibly generate vector instructions when
used in a loop. At the moment anything like this is target-specific,
inefficient, and does not vectorize!
#include <x86intrin.h>

float v0[1024];
float v1[1024];

inline
float rsqrtf( float x ) {
  return _mm_cvtss_f32( _mm_rsqrt_ss( _mm_set_ss( x ) ) );
}

void v() {
  for(int i=0; i!=1024; ++i)
    v0[i] = rsqrtf(v1[i]);
}
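
For comparison, here is a rough sketch of the hand-vectorized form that the loop above would ideally become on an SSE target. It is only an illustration of the desired result, not something GCC provides: the name rsqrt_array is hypothetical, and the fallback to an exact 1/sqrt on targets without SSE is an assumption added so that the sketch stays portable.

```cpp
#include <cmath>
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper, not part of GCC: writes an approximate 1/sqrt(in[i])
// to out[i] for n floats.
void rsqrt_array(float* out, const float* in, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE__)
    // rsqrtps computes four approximate reciprocal square roots at once.
    // Unaligned loads/stores are used so the arrays need no special alignment.
    for (; i + 4 <= n; i += 4) {
        _mm_storeu_ps(out + i, _mm_rsqrt_ps(_mm_loadu_ps(in + i)));
    }
#endif
    // Scalar tail; also the whole loop on targets without SSE (assumption:
    // an exact 1/sqrt is an acceptable stand-in there).
    for (; i < n; ++i) {
        out[i] = 1.0f / std::sqrt(in[i]);
    }
}
```

With the arrays from the test case this would be called as rsqrt_array(v0, v1, 1024). Having to write this by hand for each target is exactly what a generic builtin would avoid.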