This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][AArch64] PR84114: Avoid reassociating FMA


On Mon, Feb 26, 2018 at 11:25 PM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
> On Thu, Feb 22, 2018 at 11:38:03AM +0000, Wilco Dijkstra wrote:
>> As discussed in the PR, the reassociation phase runs before FMAs are formed
>> and so can significantly reduce FMA opportunities.  Although reassociation
>> could be switched off, it helps in many cases, so a better alternative is to
>> only avoid reassociation of floating point additions.  This fixes the testcase
>> and gives 1% speedup on SPECFP2017, fixing the performance regression.
>>
>> OK for commit?
>
> This is OK as a fairly safe fix for stage 4. We should fix reassociation
> properly in GCC 9.

It happens that on some targets doing two FMAs in parallel and one
non-FMA operation merging them is faster than chaining three FMAs...
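
To make the two shapes concrete, here is a small sketch in C. It is not
taken from the PR: the function names are made up for illustration, and
whether the compiler actually forms FMAs from these expressions depends
on the target and on flags such as -ffp-contract.

```c
#include <assert.h>

/* Hypothetical example, not from the PR.  Sum of four products written
   as a single left-to-right chain:
     ((a*b + c*d) + e*f) + g*h
   If the compiler contracts mul+add into FMA, this becomes one multiply
   followed by three FMAs, each depending on the previous result.  */
double
dot4_chained (double a, double b, double c, double d,
              double e, double f, double g, double h)
{
  double t = a * b;
  t = t + c * d;   /* candidate FMA 1, waits for t */
  t = t + e * f;   /* candidate FMA 2, waits for FMA 1 */
  t = t + g * h;   /* candidate FMA 3, waits for FMA 2 */
  return t;
}

/* The same sum split into two independent halves:
     (a*b + c*d) + (e*f + g*h)
   Here the two candidate FMAs do not depend on each other, and a plain
   addition (not an FMA) merges them at the end.  */
double
dot4_parallel (double a, double b, double c, double d,
               double e, double f, double g, double h)
{
  double lo = a * b + c * d;   /* candidate FMA, independent of hi */
  double hi = e * f + g * h;   /* candidate FMA, independent of lo */
  return lo + hi;              /* non-FMA merge */
}
```

The two functions can round differently in general, since floating-point
addition is not associative; for small integer inputs they agree exactly.
Which shape is faster depends on the FMA latency and the number of FP
pipes on the target.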

But yes, somewhere I suggested that FMA detection should/could be
integrated with reassociation.

Richard.

> Thanks,
> James
>
>>
>> ChangeLog:
>> 2018-02-23  Wilco Dijkstra  <wdijkstr@arm.com>
>>
>>       PR tree-optimization/84114
>>       * config/aarch64/aarch64.c (aarch64_reassociation_width):
>>       Avoid reassociation of FLOAT_MODE addition.
>> --
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index b3d5fde171920e5759046a4bd61cfcf9eb78d7dd..5f9541cf700aaf18c1f1ac73054614e2932781e4 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -1109,15 +1109,16 @@ aarch64_min_divisions_for_recip_mul (machine_mode mode)
>>    return aarch64_tune_params.min_div_recip_mul_df;
>>  }
>>
>> +/* Return the reassociation width of treeop OPC with mode MODE.  */
>>  static int
>> -aarch64_reassociation_width (unsigned opc ATTRIBUTE_UNUSED,
>> -                          machine_mode mode)
>> +aarch64_reassociation_width (unsigned opc, machine_mode mode)
>>  {
>>    if (VECTOR_MODE_P (mode))
>>      return aarch64_tune_params.vec_reassoc_width;
>>    if (INTEGRAL_MODE_P (mode))
>>      return aarch64_tune_params.int_reassoc_width;
>> -  if (FLOAT_MODE_P (mode))
>> +  /* Avoid reassociating floating point addition so we emit more FMAs.  */
>> +  if (FLOAT_MODE_P (mode) && opc != PLUS_EXPR)
>>      return aarch64_tune_params.fp_reassoc_width;
>>    return 1;
>>  }
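
For readers without a GCC tree at hand, the decision logic of the patched
hook can be modelled outside the compiler roughly as follows. This is only
a sketch: the enums, the struct and the function name below are
hypothetical stand-ins for machine_mode, the tree code and
aarch64_tune_params, and no width values from any real tuning table are
used.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC internals (illustration only).  */
enum mode_kind { KIND_VECTOR, KIND_INT, KIND_FLOAT, KIND_OTHER };
enum op_kind { OP_PLUS, OP_MULT };

struct tune_widths
{
  int vec_reassoc_width;
  int int_reassoc_width;
  int fp_reassoc_width;
};

/* Mirrors the control flow of the patched hook: vector and integer
   modes keep their tuned width; scalar floating-point addition falls
   through to width 1, so it is not reassociated and the add can still
   be fused with a multiply into an FMA later.  */
int
model_reassociation_width (enum op_kind opc, enum mode_kind mode,
                           const struct tune_widths *tune)
{
  if (mode == KIND_VECTOR)
    return tune->vec_reassoc_width;
  if (mode == KIND_INT)
    return tune->int_reassoc_width;
  if (mode == KIND_FLOAT && opc != OP_PLUS)
    return tune->fp_reassoc_width;
  return 1;
}
```

Note that, as in the diff, the vector check comes first, so vector
floating-point additions still use vec_reassoc_width; only scalar FP
addition is forced to a width of 1.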

