[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3
Richard Sandiford
richard.sandiford@arm.com
Mon Aug 3 13:55:26 GMT 2020
xiezhiheng <xiezhiheng@huawei.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> Sent: Friday, July 31, 2020 5:03 PM
>> To: xiezhiheng <xiezhiheng@huawei.com>
>> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> emitted at -O3
>>
>> xiezhiheng <xiezhiheng@huawei.com> writes:
>> >> -----Original Message-----
>> >> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> >> Sent: Friday, July 17, 2020 5:04 PM
>> >> To: xiezhiheng <xiezhiheng@huawei.com>
>> >> Cc: Richard Biener <richard.guenther@gmail.com>;
>> gcc-patches@gcc.gnu.org
>> >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> >> emitted at -O3
>> >>
>> >
>> > Cut...
>> >
>> >>
>> >> Thanks, pushed to master.
>> >>
>> >> Richard
>> >
>> > And I have finished the second part.
>> >
>> > In function aarch64_general_add_builtin, I add an argument ATTRS to
>> > pass attributes for each built-in function.
>> >
>> > And some new functions are added:
>> > aarch64_call_properties: return flags for each built-in function based
>> > on command-line options. When the built-in function handles
>> > floating-points, add FLAG_FP flag.
>> >
>> > aarch64_modifies_global_state_p: True if the function would modify
>> > global states.
>> >
>> > aarch64_reads_global_state_p: True if the function would read
>> > global states.
>> >
>> > aarch64_could_trap_p: True if the function would raise a signal.
>> >
>> > aarch64_add_attribute: Add attributes in ATTRS.
>> >
>> > aarch64_get_attributes: return attributes for each built-in functons
>> > based on flags and command-line options.
>> >
>> > In function aarch64_init_simd_builtins, attributes are get by flags
>> > and pass them to function aarch64_general_add_builtin.
>> >
>> >
>> > Bootstrap is tested OK on aarch64 Linux platform, but regression
>> > FAIL one test case ---- pr93423.f90.
>> > However, I found that this test case would fail randomly in trunk.
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93423
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96041
>> > Some PRs have tracked it. After my patch, this test case would
>> > always fail. I guess the syntax errors in fortran crash some structures
>> > result in illegal memory access but I can't find what exactly it is.
>> > But I think my patch should have no influence on it.
>>
>> Yeah, I agree. And FWIW, I didn't see this in my testing.
>>
>> I've pushed the patch with one trivial change: to remove the “and”
>> before “CODE” in:
>>
>> > /* Wrapper around add_builtin_function. NAME is the name of the
>> built-in
>> > function, TYPE is the function type, and CODE is the function subcode
>> > - (relative to AARCH64_BUILTIN_GENERAL). */
>> > + (relative to AARCH64_BUILTIN_GENERAL), and ATTRS is the function
>> > + attributes. */
>>
>> BTW, one thing to be careful of in future is that not all FP intrinsics
>> raise FP exceptions. So while:
>>
>> > + switch (d->mode)
>> > + {
>> > + /* Floating-point. */
>> > + case E_BFmode:
>> > + case E_V4BFmode:
>> > + case E_V8BFmode:
>> > + case E_HFmode:
>> > + case E_V4HFmode:
>> > + case E_V8HFmode:
>> > + case E_SFmode:
>> > + case E_V2SFmode:
>> > + case E_V4SFmode:
>> > + case E_DFmode:
>> > + case E_V1DFmode:
>> > + case E_V2DFmode:
>> > + flags |= FLAG_FP;
>> > + break;
>> > +
>> > + default:
>> > + break;
>> > + }
>>
>> is a good, conservatively-correct default, we might need an additional
>> flag to suppress it for certain intrinsics.
>>
>
> I agree.
>
>> I've just realised that the code above could have used FLOAT_MODE_P,
>> but I didn't think of that before pushing the patch :-)
>>
>
> Sorry, I should have used it. And I prepare a patch to use FLOAT_MODE_P
> macro and add a flag FLAG_SUPPRESS_FP_EXCEPTIONS to suppress
> FLAG_RAISE_FP_EXCEPTIONS for certain intrinsics in future.
The same thing is true for reading FPCR as well, so I think the flag
should suppress the FLOAT_MODE_P check, instead of fixing up the flags
afterwards.
I'm struggling to think of a good name though. How about adding
FLAG_AUTO_FP and making the FLOAT_MODE_P check dependent on FLAG_AUTO_FP
being set?
We could leave FLAG_AUTO_FP out of FLAG_ALL, since FLAG_ALL already
includes FLAG_FP. Including it in FLAG_ALL wouldn't do no any harm
though.
Thanks,
Richard
More information about the Gcc-patches
mailing list