This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
This patch implements reciprocal sqrt pass. At the a/func(b) -> a*rfunc(b) was put in existing recip pass, where new rsqrt pass processes only sqrt(a/b)->rsqrt(b/a) optimization (as discussed, due to interference with a/b RTL reciprocal expansion, where we can't perform this optimization anymore). sqrt(a)-> a*rsqrt(a) is performed during RTL expansion time due to CSE of a*rsqrtss(a) during NR step.
Regarding implementation with target-independant builtin: Please note that this conversion also applies to _vectorized_ square roots, and these are always defined as target dependant builtins. target-independant builtin would require another codepath, and we loose the ability to check in target-dependant way _if_ we want this reciprocal to be emitted at all (please look into ix86_builtin_reciprocal function, why we skip convresions, i.e. when SSE insn are not available...).
Patch was bootstrapped on x86_64-pc-linux-gnu and i686-pc-linux-gnu and regtested for all default languages.
The patch is complete, comes with documentation and tests, and brings in substantial speedups in -ffast-math float processing. The patch needs a middle-end approval. I hope that all open issues were resolved, so - OK for mainline?
PR middle-end/31723 * hooks.c (hook_tree_tree_bool_null): New hook. * hooks.h (hook_tree_tree_bool_null): Add prototype. * tree-pass.h (pass_convert_to_rsqrt): Declare. * passes.c (init_optimization_passes): Add pass_convert_to_rsqrt. * tree-ssa-math-opts.c (execute_cse_reciprocals): Scan for a/func(b) and convert it to reciprocal a*rfunc(b). (execute_convert_to_rsqrt): New function. (gate_convert_to_rsqrt): New function. (pass_convert_to_rsqrt): New pass definition. * target.h (struct gcc_target): Add builtin_reciprocal. * target-def.h (TARGET_BUILTIN_RECIPROCAL): New define. (TARGET_INITIALIZER): Initialize builtin_reciprocal with TARGET_BUILTIN_RECIPROCAL. * doc/tm.texi (TARGET_BUILTIN_RECIPROCAL): Document.
* config/i386/i386.h (TARGET_RECIP): New define. * config/i386/i386.md (divsf3): Expand by calling ix86_emit_swdivsf for TARGET_SSE_MATH and TARGET_RECIP when flag_unsafe_math_optimizations is set and not optimizing for size. (*rcpsf2_sse): New insn pattern. (*rsqrtsf2_sse): Ditto. (rsqrtsf2): New expander. Expand by calling ix86_emit_swsqrtsf for TARGET_SSE_MATH and TARGET_RECIP when flag_unsafe_math_optimizations is set and not optimizing for size. (sqrt<mode>2): Expand SFmode operands by calling ix86_emit_swsqrtsf for TARGET_SSE_MATH and TARGET_RECIP when flag_unsafe_math_optimizations is set and not optimizing for size. * config/i386/sse.md (divv4sf): Expand by calling ix86_emit_swdivsf for TARGET_SSE_MATH and TARGET_RECIP when flag_unsafe_math_optimizations is set and not optimizing for size. (*sse_rsqrtv4sf2): Do not export. (sqrtv4sf2): Ditto. (sse_rsqrtv4sf2): New expander. Expand by calling ix86_emit_swsqrtsf for TARGET_SSE_MATH and TARGET_RECIP when flag_unsafe_math_optimizations is set and not optimizing for size. (sqrtv4sf2): Ditto. * config/i386/i386.opt (mrecip): New option. * config/i386/i386-protos.h (ix86_emit_swdivsf): Declare. (ix86_emit_swsqrtsf): Ditto. * config/i386/i386.c (IX86_BUILTIN_RSQRTF): New constant. (ix86_init_mmx_sse_builtins): __builtin_ia32_rsqrtf: New builtin definition. (ix86_expand_builtin): Expand IX86_BUILTIN_RSQRTF using ix86_expand_unop1_builtin. (ix86_emit_swdivsf): New function. (ix86_emit_swsqrtsf): Ditto. (ix86_builtin_reciprocal): New function. (TARGET_BUILTIN_RECIPROCAL): Use it. (ix86_vectorize_builtin_conversion): Rename from ix86_builtin_conversion. (TARGET_VECTORIZE_BUILTIN_CONVERSION): Use renamed function. * doc/invoke.texi (Machine Dependent Options): Add -mrecip to "i386 and x86_64 Options" section. (Intel 386 and AMD x86_64 Options): Document -mrecip.
PR middle-end/31723 * gcc.target/i386/recip-divf.c: New test. * gcc.target/i386/recip-sqrtf.c: Ditto. * gcc.target/i386/recip-vec-divf.c: Ditto. * gcc.target/i386/recip-vec-sqrtf.c: Ditto. * gcc.target/i386/sse-recip.c: Ditto.
Attachment:
gcc-recip-4.diff.txt
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |