This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC PATCH]: Reciprocal sqrt (rsqrt) conversion pass


On 6/13/07, Uros Bizjak <ubizjak@gmail.com> wrote:
Hello!

The attached RFC patch implements a reciprocal pass that converts sqrt
to rsqrt. The pass converts several forms of sqrt:

a / sqrt(b / c) => a * rsqrt (c / b)

a * rsqrt (b / c) I suppose.


sqrt (a / b) => rsqrt ( b / a)

Could this be done in the sqrt builtin expansion instead? (Of course, we may not see the a / b argument there.)

sqrt (a) => a * rsqrt (a)

Likewise.


All of the transformations need to be guarded with
flag_unsafe_math_optimizations.

There are actually two passes: one is part of the recip pass and
searches for the first form, and the second is the rsqrt pass, which
searches for the other two forms. Two passes are necessary to prevent
the sqrt() in the first form from being converted to a * rsqrt(a).

The patch introduces a new target-dependent hook that returns the
reciprocal function of a function processed by the recip/rsqrt pass.
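
As a rough picture of what such a hook does, here is a toy model in plain C. It is not GCC code: the enum fn_code, the reciprocal_hook type and all function names below are hypothetical, and the "no reciprocal known" default is an assumption made for the sketch.

```c
/* Hypothetical function codes; GCC's real builtin codes differ.  */
enum fn_code { FN_NONE, FN_SQRTF, FN_RSQRTF };

/* Hypothetical hook type: return the code of the function computing
   1 / fn (x), or FN_NONE when the target has no such function.  */
typedef enum fn_code (*reciprocal_hook) (enum fn_code fn);

/* Assumed default: no reciprocal forms are known.  */
static enum fn_code default_reciprocal (enum fn_code fn)
{
  (void) fn;
  return FN_NONE;
}

/* A toy target that has a reciprocal square root.  */
static enum fn_code toy_target_reciprocal (enum fn_code fn)
{
  return fn == FN_SQRTF ? FN_RSQRTF : FN_NONE;
}

/* a / fn (b) may become a * rfn (b) only when the hook
   names a reciprocal function for fn.  */
static int can_convert_division (reciprocal_hook hook, enum fn_code fn)
{
  return hook (fn) != FN_NONE;
}
```

The point of routing this through a hook is that the generic pass only needs to ask "is there a reciprocal for this function?" and can leave the answer to each target.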

The testcase:

--cut here--
float sqrtf (float);

float t1(float a, float b)
{
  return a/sqrtf(b);
}

float t2(float x, float a, float b)
{
  return sqrtf(a/b);
}

float t3(float a)
{
  return sqrtf(a);
}

float t4(float a, float b)
{
  return a/b;
}
--cut here--

is compiled (-O2 -ffast-math -msse2 -mfpmath=sse -mrecip) into:

t4:
        subl    $4, %esp
        rcpss   12(%esp), %xmm0
        mulss   8(%esp), %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t3:
        subl    $4, %esp
        movss   8(%esp), %xmm0
        rsqrtss %xmm0, %xmm1
        mulss   %xmm1, %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t2:
        subl    $4, %esp
        rcpss   12(%esp), %xmm0
        mulss   16(%esp), %xmm0
        rsqrtss %xmm0, %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

t1:
        subl    $4, %esp
        rsqrtss 12(%esp), %xmm0
        mulss   8(%esp), %xmm0
        movss   %xmm0, (%esp)
        flds    (%esp)
        addl    $4, %esp
        ret

This is just a simple rsqrtss/rcpss expansion; the expansion with
Newton-Raphson (NR) enhancement can be (and will be) added to the RTL
expanders. Also, this pass can convert vectorized forms by changing
the ix86_builtin_reciprocal() function.

For some reason, patched gcc ICEs on

float a[16];
float b[16];

void test(void)
{
 int i;

 for (i = 0; i < 16; i++)
   b[i] = 1.0 / sqrtf (a[i]);
}

with:

vsqrt.c: In function 'test':
vsqrt.c:5: error: expected an SSA_NAME object
vsqrt.c:5: error: in statement
# a = VDEF <a>
# b = VDEF <b> { a b }
D.1976_6 = __builtin_ia32_rsqrtf (D.1975_5);
vsqrt.c:5: internal compiler error: verify_ssa failed

I don't know what causes this ICE; perhaps somebody experienced in
trees can help here...

I bet if you change


def_builtin (OPTION_MASK_ISA_SSE, "__builtin_ia32_rsqrtf", ftype,
IX86_BUILTIN_RSQRTF);

to use def_builtin_const it will work.

Richard.

2007-06-12 Uros Bizjak <ubizjak@gmail.com>

        * targhooks.c (default_builtin_function): Rename from
        default_builtin_vectorized_conversion.
        * targhooks.h (default_builtin_function): Update prototype.
        * tree-pass.h (pass_convert_to_rsqrt): Declare.
        * passes.c (init_optimization_passes): Add pass_convert_to_rsqrt.
        * tree-ssa-math-opts.c (execute_cse_reciprocals): Scan for a/func(b)
        and convert it to reciprocal a*rfunc(b).
        (execute_convert_to_rsqrt): New function.
        (gate_convert_to_rsqrt): New function.
        (pass_convert_to_rsqrt): New pass definition.
        * target.h (struct gcc_target): Add builtin_reciprocal.
        * target-def.h (TARGET_BUILTIN_RECIPROCAL): New define.
        (TARGET_VECTORIZE_BUILTIN_CONVERSION): Use default_builtin_function.
        (TARGET_INITIALIZER): Initialize builtin_reciprocal with
        TARGET_BUILTIN_RECIPROCAL.

[target stuff]

        * config/i386/i386.md (divsf3): Expand using rcpsf2_sse for
        TARGET_SSE_MATH and TARGET_RECIP.
        (rcpsf2_sse): New insn pattern.
        (rsqrtsf2_sse): Ditto.
        * config/i386/i386.opt (mrecip): New option.
        * config/i386/i386.c (IX86_BUILTIN_RSQRTF): New constant.
        (ix86_init_mmx_sse_builtins): __builtin_ia32_rsqrtf: New
        builtin definition.
        (ix86_expand_builtin): Expand IX86_BUILTIN_RSQRTF using
        ix86_expand_unop1_builtin.
        (ix86_vectorize_builtin_conversion): Rename from
        ix86_builtin_conversion.
        (TARGET_VECTORIZE_BUILTIN_CONVERSION): Use renamed function.
        (ix86_builtin_reciprocal): New function.
        (TARGET_BUILTIN_RECIPROCAL): Use it.

FWIW, patch survives bootstrap and regression test on i686-pc-linux-gnu.

Uros.



