This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Hello!
The attached RFC patch implements a reciprocal pass that converts sqrt to rsqrt. The pass converts several forms of sqrt:
a / sqrt (b / c) => a * rsqrt (b / c)
sqrt (a / b) => rsqrt ( b / a)
(Could this be done in the sqrt builtin expansion instead? Of course, we may not see the a / b argument there.)
sqrt (a) => a * rsqrt (a)
All of the transformations need to be guarded with flag_unsafe_math_optimizations.
There are actually two passes: one is part of the recip pass and searches for the first form; the second is the rsqrt pass, which searches for the other two forms. Two passes are necessary to prevent the sqrt() in the first form from being converted to a * rsqrt(a).
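The ordering argument can be sketched with a toy expression tree. This is only an illustration under stated assumptions: the node kinds, the fixed-size pool and the functions recip_pass, rsqrt_pass and count are all hypothetical and have nothing to do with the real GIMPLE representation or the code in tree-ssa-math-opts.c.

```c
#include <stddef.h>

/* Toy expression tree (hypothetical; not GCC's GIMPLE). */
enum kind { VAR, DIV, MUL, SQRT, RSQRT };

struct expr {
    enum kind kind;
    struct expr *lhs;  /* operand of SQRT/RSQRT, left operand of DIV/MUL */
    struct expr *rhs;  /* right operand of DIV/MUL, otherwise NULL */
};

/* Fixed pool, no bounds check: large enough for this sketch only. */
static struct expr pool[32];
static int used;

static struct expr *mk(enum kind k, struct expr *l, struct expr *r)
{
    struct expr *e = &pool[used++];
    e->kind = k;
    e->lhs = l;
    e->rhs = r;
    return e;
}

/* First pass: a / sqrt(b) => a * rsqrt(b), checked at the root only. */
static struct expr *recip_pass(struct expr *e)
{
    if (e->kind == DIV && e->rhs->kind == SQRT)
        return mk(MUL, e->lhs, mk(RSQRT, e->rhs->lhs, NULL));
    return e;
}

/* Second pass: every remaining sqrt(a) => a * rsqrt(a). */
static struct expr *rsqrt_pass(struct expr *e)
{
    if (e->kind == VAR)
        return e;
    if (e->kind == SQRT)
        return mk(MUL, e->lhs, mk(RSQRT, e->lhs, NULL));
    e->lhs = rsqrt_pass(e->lhs);
    if (e->rhs)
        e->rhs = rsqrt_pass(e->rhs);
    return e;
}

/* Count the nodes of a given kind in the tree. */
static int count(const struct expr *e, enum kind k)
{
    if (!e)
        return 0;
    return (e->kind == k) + count(e->lhs, k) + count(e->rhs, k);
}
```

Running recip_pass before rsqrt_pass on a / sqrt(b) gives a * rsqrt(b) with no division left. In the opposite order the sqrt is rewritten first, giving a / (b * rsqrt(b)), and the division survives because the first pattern no longer matches.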
The patch introduces a new target-dependent hook that returns the reciprocal function of the function being processed by the recip/rsqrt pass.
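The idea behind such a hook can be sketched roughly as follows. The enum, the function name toy_builtin_reciprocal and its have_recip parameter are hypothetical and chosen for this example only; the real hook in the patch works on GCC's own function declarations and target flags, which are not reproduced here.

```c
/* Hypothetical function codes; GCC's real builtin enumerations differ. */
enum fn_code { FN_NONE, FN_SQRTF, FN_RSQRTF, FN_SINF };

/* Return the code of a function computing 1 / fn(x), or FN_NONE when the
   target offers no such function or the feature is switched off. */
static enum fn_code toy_builtin_reciprocal(enum fn_code fn, int have_recip)
{
    if (!have_recip)
        return FN_NONE;

    switch (fn) {
    case FN_SQRTF:
        return FN_RSQRTF;   /* 1 / sqrtf(x) is rsqrtf(x) */
    default:
        return FN_NONE;     /* no reciprocal form known */
    }
}
```

A pass would then only rewrite a / func(b) when the returned code is not FN_NONE, which keeps all target knowledge on the target side of the hook.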
The testcase:
--cut here--
float sqrtf (float);

float t1(float a, float b) { return a/sqrtf(b); }
float t2(float x, float a, float b) { return sqrtf(a/b); }
float t3(float a) { return sqrtf(a); }
float t4(float a, float b) { return a/b; }
--cut here--
is compiled (-O2 -ffast-math -msse2 -mfpmath=sse -mrecip) into:
t4:
        subl    $4, %esp
        rcpss   12(%esp), %xmm0
        mulss   8(%esp), %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t3:
        subl    $4, %esp
        movss   8(%esp), %xmm0
        rsqrtss %xmm0, %xmm1
        mulss   %xmm1, %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t2:
        subl    $4, %esp
        rcpss   12(%esp), %xmm0
        mulss   16(%esp), %xmm0
        rsqrtss %xmm0, %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t1:
        subl    $4, %esp
        rsqrtss 12(%esp), %xmm0
        mulss   8(%esp), %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret
This is just a simple rsqrtss/rcpss expansion; the expansion with Newton-Raphson (NR) refinement can be (and will be) added to the RTL expanders. This pass can also convert vectorized forms by changing the ix86_builtin_reciprocal() function.
For some reason, patched gcc ICEs on
float a[16];
float b[16];

void test(void)
{
  int i;

  for (i = 0; i < 16; i++)
    b[i] = 1.0 / sqrtf (a[i]);
}
with:
vsqrt.c: In function 'test':
vsqrt.c:5: error: expected an SSA_NAME object
vsqrt.c:5: error: in statement
# a = VDEF <a>
# b = VDEF <b> { a b }
D.1976_6 = __builtin_ia32_rsqrtf (D.1975_5);
vsqrt.c:5: internal compiler error: verify_ssa failed
I don't know what causes this ICE; perhaps somebody experienced in trees can help here...
def_builtin (OPTION_MASK_ISA_SSE, "__builtin_ia32_rsqrtf", ftype, IX86_BUILTIN_RSQRTF);
2007-06-12 Uros Bizjak <ubizjak@gmail.com>
	* targhooks.c (default_builtin_function): Rename from
	default_builtin_vectorized_conversion.
	* targhooks.h (default_builtin_function): Update prototype.
	* tree-pass.h (pass_convert_to_rsqrt): Declare.
	* passes.c (init_optimization_passes): Add pass_convert_to_rsqrt.
	* tree-ssa-math-opts.c (execute_cse_reciprocals): Scan for a/func(b)
	and convert it to reciprocal a*rfunc(b).
	(execute_convert_to_rsqrt): New function.
	(gate_convert_to_rsqrt): New function.
	(pass_convert_to_rsqrt): New pass definition.
	* target.h (struct gcc_target): Add builtin_reciprocal.
	* target-def.h (TARGET_BUILTIN_RECIPROCAL): New define.
	(TARGET_VECTORIZE_BUILTIN_CONVERSION): Use default_builtin_function.
	(TARGET_INITIALIZER): Initialize builtin_reciprocal with
	TARGET_BUILTIN_RECIPROCAL.
[target stuff]
	* config/i386/i386.md (divsf3): Expand using rcpsf2_sse for
	TARGET_SSE_MATH and TARGET_RECIP.
	(rcpsf2_sse): New insn pattern.
	(rsqrtsf2_sse): Ditto.
	* config/i386/i386.opt (mrecip): New option.
	* config/i386/i386.c (IX86_BUILTIN_RSQRTF): New constant.
	(ix86_init_mmx_sse_builtins): __builtin_ia32_rsqrtf: New builtin
	definition.
	(ix86_expand_builtin): Expand IX86_BUILTIN_RSQRTF using
	ix86_expand_unop1_builtin.
	(ix86_vectorize_builtin_conversion): Rename from
	ix86_builtin_conversion.
	(TARGET_VECTORIZE_BUILTIN_CONVERSION): Use renamed function.
	(ix86_builtin_reciprocal): New function.
	(TARGET_BUILTIN_RECIPROCAL): Use it.
FWIW, the patch survives bootstrap and regression testing on i686-pc-linux-gnu.
Uros.