This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH 1/4] Add the -ftree-loop-if-convert flag.


On Wed, 7 Jul 2010, Sebastian Pop wrote:

> 	* common.opt (ftree-loop-if-convert): New flag.
> 	* doc/invoke.texi (ftree-loop-if-convert): Documented.
> 	* tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
> 	when flag_tree_loop_if_convert is set.
> ---
>  gcc/common.opt      |    4 ++++
>  gcc/doc/invoke.texi |   14 ++++++++++----
>  gcc/tree-if-conv.c  |    6 +++++-
>  3 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6ca787a..111d7b7 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -653,6 +653,10 @@ fif-conversion2
>  Common Report Var(flag_if_conversion2) Optimization
>  Perform conversion of conditional jumps to conditional execution
>  
> +ftree-loop-if-convert
> +Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
> +Convert conditional jumps in innermost loops to branchless equivalents
> +
>  ; -finhibit-size-directive inhibits output of .size for ELF.
>  ; This is used only for compiling crtstuff.c,
>  ; and it may be extended to other effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d70f130..0847e01 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
>  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffunction-sections @gol
> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>  -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fira-loop-pressure -fno-ira-share-save-slots @gol
>  -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> +-floop-block -floop-interchange -floop-strip-mine @gol
>  -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
>  -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> @@ -382,8 +382,8 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
>  -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
>  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
> --ftree-copyrename -ftree-dce @gol
> --ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-im @gol
> +-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> +-ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
>  -ftree-phiprop -ftree-loop-distribution @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-pta -ftree-reassoc @gol
> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>  
> +@item -ftree-loop-if-convert
> +Attempt to transform conditional jumps in the innermost loops to
> +branch-less equivalents.  The intent is to remove control-flow from
> +the innermost loops in order to improve the ability of the
> +auto-vectorization pass to handle these loops.
> +

Please state that this is enabled by default if vectorization is enabled.

>  @item -ftree-loop-distribution
>  Perform loop distribution.  This flag can improve cache performance on
>  big loop bodies and allow further loop optimizations, like
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index 8d5d226..873cd89 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -1242,7 +1242,11 @@ main_tree_if_conversion (void)
>  static bool
>  gate_tree_if_conversion (void)
>  {
> -  return flag_tree_vectorize != 0;
> +  if (flag_tree_vectorize
> +      && flag_tree_loop_if_convert < 0)
> +    flag_tree_loop_if_convert = 1;

Err, no.  The gate function should not modify flag_tree_loop_if_convert.
It should just be

  return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
          || flag_tree_loop_if_convert == 1);

But on second thought, please follow what -ftree-cselim does: use
Init(2) (ISTR -1 is now problematic for some reason), and in
process_options (), if flag_tree_loop_if_convert is equal to
AUTODETECT_VALUE (2), set it to the value of flag_tree_vectorize.

The gate function can then simply return flag_tree_loop_if_convert.
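
To make the suggestion concrete, here is a minimal standalone sketch of
the tri-state scheme.  It is a model, not the actual GCC code: in GCC the
flag variables are generated from common.opt and the resolution step
would live in process_options ().  The helper name
resolve_if_convert_default is hypothetical, and the assert only
documents the assumption that resolution runs before the gate.

```c
#include <assert.h>
#include <stdbool.h>

/* Sentinel meaning "the user gave neither -ftree-loop-if-convert nor
   -fno-tree-loop-if-convert".  The value 2 is taken from the review.  */
#define AUTODETECT_VALUE 2

/* Stand-ins for the option variables (assumed here for illustration).  */
static int flag_tree_vectorize;
static int flag_tree_loop_if_convert = AUTODETECT_VALUE;  /* Init(2) */

/* Hypothetical helper modelling the step suggested for
   process_options (): an unset flag inherits the vectorizer setting,
   an explicit 0 or 1 from the command line is left alone.  */
static void
resolve_if_convert_default (void)
{
  if (flag_tree_loop_if_convert == AUTODETECT_VALUE)
    flag_tree_loop_if_convert = flag_tree_vectorize;
}

/* The gate no longer writes to any flag; it only reads the resolved
   value.  */
static bool
gate_tree_if_conversion (void)
{
  assert (flag_tree_loop_if_convert != AUTODETECT_VALUE);
  return flag_tree_loop_if_convert != 0;
}
```

With this arrangement an explicit -fno-tree-loop-if-convert wins even
when vectorization is on, and an explicit -ftree-loop-if-convert enables
the pass even when vectorization is off.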

Ok with that change.

Thanks,
Richard.

