This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH, middle-end, i386]: reciprocal rsqrt pass + full recip x86 backend support


Hello!

This patch implements a reciprocal sqrt pass. The a/func(b) ->
a*rfunc(b) conversion was put in the existing recip pass, whereas the
new rsqrt pass processes only the sqrt(a/b) -> rsqrt(b/a) optimization
(as discussed, due to interference with the a/b RTL reciprocal
expansion, after which we can't perform this optimization anymore).
The sqrt(a) -> a*rsqrt(a) conversion is performed at RTL expansion
time because of CSE of a*rsqrtss(a) in the Newton-Raphson (NR) step.
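
For readers who want the arithmetic behind these rewrites, here is a
minimal sketch in plain C. It is not the code the patch emits:
approx_rsqrt is a hypothetical software stand-in (the well-known integer
bit trick) for a hardware estimate such as rsqrtss, and it is much less
accurate than the real instruction. The refinement is the standard
Newton-Raphson step y1 = 0.5*y0*(3 - x*y0*y0); all function names below
are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware rsqrt estimate (assumption:
   the classic bit trick, good to only a few percent). */
static float approx_rsqrt(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for y ~= 1/sqrt(x). */
static float refined_rsqrt(float x)
{
    float y0 = approx_rsqrt(x);
    return 0.5f * y0 * (3.0f - x * y0 * y0);
}

/* sqrt(a) rewritten as a * rsqrt(a). */
static float sw_sqrt(float a)
{
    return a * refined_rsqrt(a);
}

/* sqrt(a/b) rewritten as rsqrt(b/a). */
static float sqrt_of_quotient(float a, float b)
{
    return refined_rsqrt(b / a);
}

/* a/sqrt(b) rewritten as a * rsqrt(b), i.e. a/func(b) -> a*rfunc(b). */
static float div_by_sqrt(float a, float b)
{
    return a * refined_rsqrt(b);
}
```

With this crude estimate, one step brings sw_sqrt(4.0f) to within about
0.2% of 2; a better starting estimate gives a correspondingly better
result after the same single step. None of these rewrites is exact,
which is why they only make sense under unsafe math optimizations.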

Regarding implementation with a target-independent builtin: please note
that this conversion also applies to _vectorized_ square roots, and
these are always defined as target-dependent builtins. A
target-independent builtin would require another codepath, and we
would lose the ability to check in a target-dependent way _if_ we want
this reciprocal to be emitted at all (please look at the
ix86_builtin_reciprocal function to see why we skip conversions, e.g.
when SSE insns are not available...).
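
To illustrate the design point, here is a heavily simplified model in C
of a target hook that decides whether a reciprocal is available. This is
a sketch only: GCC's real hook works on tree nodes, and every type,
enumerator and function name below is hypothetical. The default hook
always declines, in the spirit of a hook that returns null, while the
x86-like hook declines when its (assumed) target flags say no.

```c
#include <stdbool.h>

/* Hypothetical function codes standing in for builtin decls. */
enum fn_code { FN_NONE, FN_SQRTF, FN_VEC_SQRTF, FN_RSQRTF, FN_VEC_RSQRTF };

/* Hypothetical target state consulted by the hook. */
struct target_state {
    bool have_sse_math;  /* SSE instructions usable for float math */
    bool recip_enabled;  /* reciprocal sequences requested */
};

typedef enum fn_code (*reciprocal_hook)(enum fn_code,
                                        const struct target_state *);

/* Default hook: no reciprocal offered, so nothing gets converted. */
static enum fn_code default_reciprocal(enum fn_code fn,
                                       const struct target_state *t)
{
    (void)fn;
    (void)t;
    return FN_NONE;
}

/* x86-like hook: scalar and vector sqrt both map to a reciprocal,
   but only when the target state allows it. */
static enum fn_code x86_like_reciprocal(enum fn_code fn,
                                        const struct target_state *t)
{
    if (!t->have_sse_math || !t->recip_enabled)
        return FN_NONE;
    switch (fn) {
    case FN_SQRTF:     return FN_RSQRTF;
    case FN_VEC_SQRTF: return FN_VEC_RSQRTF;
    default:           return FN_NONE;
    }
}

/* Pass side: convert a/fn(b) to a*rfn(b) only if the hook agrees. */
static bool try_convert(reciprocal_hook hook, enum fn_code fn,
                        const struct target_state *t, enum fn_code *out)
{
    enum fn_code r = hook(fn, t);
    if (r == FN_NONE)
        return false;
    *out = r;
    return true;
}
```

Because the decision lives in the hook, the generic pass needs no
knowledge of which instructions exist, and the scalar and vectorized
cases go through the same codepath.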

The patch was bootstrapped on x86_64-pc-linux-gnu and i686-pc-linux-gnu
and regtested for all default languages.

The patch is complete, comes with documentation and tests, and brings
substantial speedups in -ffast-math float processing. The patch
needs middle-end approval. I hope that all open issues have been
resolved, so - OK for mainline?

2007-06-14 Uros Bizjak <ubizjak@gmail.com>

	PR middle-end/31723
	* hooks.c (hook_tree_tree_bool_null): New hook.
	* hooks.h (hook_tree_tree_bool_null): Add prototype.
	* tree-pass.h (pass_convert_to_rsqrt): Declare.
	* passes.c (init_optimization_passes): Add pass_convert_to_rsqrt.
	* tree-ssa-math-opts.c (execute_cse_reciprocals): Scan for a/func(b)
	and convert it to reciprocal a*rfunc(b).
	(execute_convert_to_rsqrt): New function.
	(gate_convert_to_rsqrt): New function.
	(pass_convert_to_rsqrt): New pass definition.
	* target.h (struct gcc_target): Add builtin_reciprocal.
	* target-def.h (TARGET_BUILTIN_RECIPROCAL): New define.
	(TARGET_INITIALIZER): Initialize builtin_reciprocal with
	TARGET_BUILTIN_RECIPROCAL.
	* doc/tm.texi (TARGET_BUILTIN_RECIPROCAL): Document.

	* config/i386/i386.h (TARGET_RECIP): New define.
	* config/i386/i386.md (divsf3): Expand by calling ix86_emit_swdivsf
	for TARGET_SSE_MATH and TARGET_RECIP when
	flag_unsafe_math_optimizations is set and not optimizing for size.
	(*rcpsf2_sse): New insn pattern.
	(*rsqrtsf2_sse): Ditto.
	(rsqrtsf2): New expander.  Expand by calling ix86_emit_swsqrtsf
	for TARGET_SSE_MATH and TARGET_RECIP when
	flag_unsafe_math_optimizations is set and not optimizing for size.
	(sqrt<mode>2): Expand SFmode operands by calling ix86_emit_swsqrtsf
	for TARGET_SSE_MATH and TARGET_RECIP when
	flag_unsafe_math_optimizations is set and not optimizing for size.
	* config/i386/sse.md (divv4sf): Expand by calling ix86_emit_swdivsf
	for TARGET_SSE_MATH and TARGET_RECIP when
	flag_unsafe_math_optimizations is set and not optimizing for size.
	(*sse_rsqrtv4sf2): Do not export.
	(sqrtv4sf2): Ditto.
	(sse_rsqrtv4sf2): New expander.  Expand by calling ix86_emit_swsqrtsf
	for TARGET_SSE_MATH and TARGET_RECIP when
	flag_unsafe_math_optimizations is set and not optimizing for size.
	(sqrtv4sf2): Ditto.
	* config/i386/i386.opt (mrecip): New option.
	* config/i386/i386-protos.h (ix86_emit_swdivsf): Declare.
	(ix86_emit_swsqrtsf): Ditto.
	* config/i386/i386.c (IX86_BUILTIN_RSQRTF): New constant.
	(ix86_init_mmx_sse_builtins): __builtin_ia32_rsqrtf: New
	builtin definition.
	(ix86_expand_builtin): Expand IX86_BUILTIN_RSQRTF using
	ix86_expand_unop1_builtin.
	(ix86_emit_swdivsf): New function.
	(ix86_emit_swsqrtsf): Ditto.
	(ix86_builtin_reciprocal): New function.
	(TARGET_BUILTIN_RECIPROCAL): Use it.
	(ix86_vectorize_builtin_conversion): Rename from
	ix86_builtin_conversion.
	(TARGET_VECTORIZE_BUILTIN_CONVERSION): Use renamed function.
	* doc/invoke.texi (Machine Dependent Options): Add -mrecip to
	"i386 and x86_64 Options" section.
	(Intel 386 and AMD x86_64 Options): Document -mrecip.

testsuite/ChangeLog:

2007-06-14 Uros Bizjak <ubizjak@gmail.com>

	PR middle-end/31723
	* gcc.target/i386/recip-divf.c: New test.
	* gcc.target/i386/recip-sqrtf.c: Ditto.
	* gcc.target/i386/recip-vec-divf.c: Ditto.
	* gcc.target/i386/recip-vec-sqrtf.c: Ditto.
	* gcc.target/i386/sse-recip.c: Ditto.

Uros.

Attachment: gcc-recip-4.diff.txt
Description: Text document

