[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3
Richard Sandiford
richard.sandiford@arm.com
Tue Oct 13 08:07:32 GMT 2020
xiezhiheng <xiezhiheng@huawei.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> Sent: Thursday, August 27, 2020 4:08 PM
>> To: xiezhiheng <xiezhiheng@huawei.com>
>> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> emitted at -O3
>>
>> xiezhiheng <xiezhiheng@huawei.com> writes:
>> > I made two separate patches for these two groups for review purposes.
>> >
>> > Note: Patch for min/max intrinsics should be applied before the patch for
>> rounding intrinsics
>> >
>> > Bootstrapped and tested on aarch64 Linux platform.
>>
>> Thanks, LGTM. Pushed to master.
>>
>> Richard
>
> I made the patch for multiply and multiply accumulator intrinsics.
>
> Note that bfmmlaq intrinsic is special because this instruction ignores the FPCR and does not update the FPSR exception status.
> https://developer.arm.com/docs/ddi0596/h/simd-and-floating-point-instructions-alphabetic-order/bfmmla-bfloat16-floating-point-matrix-multiply-accumulate-into-2x2-matrix
> So I set it to the AUTO_FP flag.
>
> Bootstrapped and tested on aarch64 Linux platform.
Thanks, LGTM. Pushed to trunk.
Richard
More information about the Gcc-patches
mailing list