This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 1/4] Add the -ftree-loop-if-convert flag.
On Wed, 7 Jul 2010, Sebastian Pop wrote:
> * common.opt (ftree-loop-if-convert): New flag.
> * doc/invoke.texi (ftree-loop-if-convert): Documented.
> * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
> when flag_tree_loop_if_convert is set.
> ---
> gcc/common.opt | 4 ++++
> gcc/doc/invoke.texi | 14 ++++++++++----
> gcc/tree-if-conv.c | 6 +++++-
> 3 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6ca787a..111d7b7 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -653,6 +653,10 @@ fif-conversion2
> Common Report Var(flag_if_conversion2) Optimization
> Perform conversion of conditional jumps to conditional execution
>
> +ftree-loop-if-convert
> +Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
> +Convert conditional jumps in innermost loops to branchless equivalents
> +
> ; -finhibit-size-directive inhibits output of .size for ELF.
> ; This is used only for compiling crtstuff.c,
> ; and it may be extended to other effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d70f130..0847e01 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
> -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
> -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
> -fforward-propagate -ffunction-sections @gol
> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
> -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
> -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
> -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
> -fira-loop-pressure -fno-ira-share-save-slots @gol
> -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
> -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> +-floop-block -floop-interchange -floop-strip-mine @gol
> -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
> -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
> -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> @@ -382,8 +382,8 @@ Objective-C and Objective-C++ Dialects}.
> -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
> -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
> -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce @gol
> --ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-im @gol
> +-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> +-ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
> -ftree-phiprop -ftree-loop-distribution @gol
> -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
> -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
> Compare the results of several data dependence analyzers. This option
> is used for debugging the data dependence analyzers.
>
> +@item -ftree-loop-if-convert
> +Attempt to transform conditional jumps in the innermost loops to
> +branch-less equivalents. The intent is to remove control-flow from
> +the innermost loops in order to improve the ability of the
> +auto-vectorization pass to handle these loops.
> +
Please state that this is enabled by default if vectorization is enabled.
> @item -ftree-loop-distribution
> Perform loop distribution. This flag can improve cache performance on
> big loop bodies and allow further loop optimizations, like
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index 8d5d226..873cd89 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -1242,7 +1242,11 @@ main_tree_if_conversion (void)
> static bool
> gate_tree_if_conversion (void)
> {
> - return flag_tree_vectorize != 0;
> + if (flag_tree_vectorize
> + && flag_tree_loop_if_convert < 0)
> + flag_tree_loop_if_convert = 1;
Err, no. This should be
return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
|| flag_tree_loop_if_convert == 1);
The gate function should not set flag_tree_loop_if_convert.
But on second thought, please follow what -ftree-cselim does: use
Init(2) (ISTR -1 is now problematic for some reason), and in
process_options (), if flag_tree_loop_if_convert is equal to
AUTODETECT_VALUE (2), set it to the setting of flag_tree_vectorize.
The gate function can then simply return flag_tree_loop_if_convert.
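For illustration, the scheme described above can be modelled as a small standalone C sketch. It is not GCC source: the two flag variables are plain stand-ins for the ones generated from common.opt, the value of AUTODETECT_VALUE is taken from the "(2)" mentioned above, and resolve_if_convert_default and pass_runs are hypothetical names introduced only for this example.

```c
#include <stdbool.h>

/* Sentinel meaning "neither -ftree-loop-if-convert nor its -fno- form
   was given".  The value 2 follows the Init(2) suggestion in this mail.  */
#define AUTODETECT_VALUE 2

/* Stand-ins for the option variables (assumption: plain ints here; in
   GCC they come from the .opt machinery).  */
static int flag_tree_vectorize;
static int flag_tree_loop_if_convert = AUTODETECT_VALUE;

/* Hypothetical helper modelling the step proposed for process_options ():
   if the user did not choose, inherit the vectorizer setting.  */
static void
resolve_if_convert_default (void)
{
  if (flag_tree_loop_if_convert == AUTODETECT_VALUE)
    flag_tree_loop_if_convert = flag_tree_vectorize;
}

/* With the default resolved up front, the gate only reads the flag and
   never writes it.  */
static bool
gate_tree_if_conversion (void)
{
  return flag_tree_loop_if_convert != 0;
}

/* Hypothetical driver: set both flags as the command line would, resolve
   the default, then ask the gate whether the pass would run.  */
static bool
pass_runs (int vectorize, int if_convert)
{
  flag_tree_vectorize = vectorize;
  flag_tree_loop_if_convert = if_convert;
  resolve_if_convert_default ();
  return gate_tree_if_conversion ();
}
```

Under these assumptions, pass_runs (1, AUTODETECT_VALUE) is true and pass_runs (0, AUTODETECT_VALUE) is false, while an explicit setting wins in both directions: pass_runs (0, 1) is true and pass_runs (1, 0) is false.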
Ok with that change.
Thanks,
Richard.