[PATCH V3] rs6000: Refine small loop unroll in loop_unroll_adjust hook
Jiufu Guo
guojiufu@linux.ibm.com
Mon Nov 4 09:53:00 GMT 2019
"Kewen.Lin" <linkw@linux.ibm.com> writes:
> Hi Jeff,
>
> Thanks for the patch, I learned a lot from it. Some nits embedded.
Thanks for your comments!
>
> on 2019/11/4 下午2:31, Jiufu Guo wrote:
>> Hi,
>>
>> In this patch, loop unroll adjust hook is introduced for powerpc. We can do
>> target related hueristic adjustment in this hook. In this patch, small loops
>> is unrolled 2 times for O2 and O3 by default. With this patch, we can see
>> some improvement for spec2017. This patch enhanced a little for [Patch V2] to
>> enable small loops unroll for O3 by default like O2.
>>
>> Bootstrapped and regtested on powerpc64le. Is this ok for trunk?
>>
>> Jiufu
>> BR.
>>
>> gcc/
>> 2019-11-04 Jiufu Guo <guojiufu@linux.ibm.com>
>>
>> PR tree-optimization/88760
>> * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
>> code which changes PARAM_MAX_UNROLL_TIMES and PARAM_MAX_UNROLLED_INSNS.
>> (TARGET_LOOP_UNROLL_ADJUST): Add loop unroll adjust hook.
>> (rs6000_loop_unroll_adjust): New hook for loop unroll adjust.
>> Unrolling small loop 2 times for -O2 and -O3.
>> (rs6000_function_specific_save): Save unroll_small_loops flag.
>> (rs6000_function_specific_restore): Restore unroll_small_loops flag.
>> * gcc/config/rs6000/rs6000.opt (unroll_small_loops): New internal flag.
>>
>>
>> gcc.testsuite/
>> 2019-11-04 Jiufu Guo <guojiufu@linux.ibm.com>
>>
>> PR tree-optimization/88760
>> * gcc.dg/pr59643.c: Update back to r277550.
>>
>> ---
>> gcc/config/rs6000/rs6000.c | 38 ++++++++++++++++++++++++++++----------
>> gcc/config/rs6000/rs6000.opt | 7 +++++++
>> gcc/testsuite/gcc.dg/pr59643.c | 3 ---
>> 3 files changed, 35 insertions(+), 13 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 9ed5151..5e1a75d 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -1428,6 +1428,9 @@ static const struct attribute_spec rs6000_attribute_table[] =
>> #undef TARGET_VECTORIZE_DESTROY_COST_DATA
>> #define TARGET_VECTORIZE_DESTROY_COST_DATA rs6000_destroy_cost_data
>>
>> +#undef TARGET_LOOP_UNROLL_ADJUST
>> +#define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust
>> +
>> #undef TARGET_INIT_BUILTINS
>> #define TARGET_INIT_BUILTINS rs6000_init_builtins
>> #undef TARGET_BUILTIN_DECL
>> @@ -4540,25 +4543,20 @@ rs6000_option_override_internal (bool global_init_p)
>> global_options.x_param_values,
>> global_options_set.x_param_values);
>>
>> - /* unroll very small loops 2 time if no -funroll-loops. */
>> + /* If funroll-loops is not enabled explicitly, then enable small loops
>> + unrolling for -O2, and do not turn fweb or frename-registers on. */
>
> "for -O2" -> "for -O2 and up"? since I noticed it checks "optimize
> >=2" later.
Thanks, yes, this would be more clear.
>
>> if (!global_options_set.x_flag_unroll_loops
>> && !global_options_set.x_flag_unroll_all_loops)
>> {
>> - maybe_set_param_value (PARAM_MAX_UNROLL_TIMES, 2,
>> - global_options.x_param_values,
>> - global_options_set.x_param_values);
>> -
>> - maybe_set_param_value (PARAM_MAX_UNROLLED_INSNS, 20,
>> - global_options.x_param_values,
>> - global_options_set.x_param_values);
>> + unroll_small_loops = optimize >= 2 ? 1 : 0;
>
> Maybe simpler with "unroll_small_loops = flag_unroll_loops"?
Both would be ok, I think. ;)
>
>>
>> - /* If fweb or frename-registers are not specificed in command-line,
>> - do not turn them on implicitly. */
>> if (!global_options_set.x_flag_web)
>> global_options.x_flag_web = 0;
>> if (!global_options_set.x_flag_rename_registers)
>> global_options.x_flag_rename_registers = 0;
>> }
>> + else
>> + unroll_small_loops = 0;
>
> Could we initialize this in rs6000.opt as zero?
We could also set initial value as 0 in rs6000.opt.
BR
Jiufu.
>
>
> BR,
> Kewen
More information about the Gcc-patches
mailing list