[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

Richard Sandiford richard.sandiford@arm.com
Mon Aug 3 13:55:26 GMT 2020


xiezhiheng <xiezhiheng@huawei.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> Sent: Friday, July 31, 2020 5:03 PM
>> To: xiezhiheng <xiezhiheng@huawei.com>
>> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> emitted at -O3
>> 
>> xiezhiheng <xiezhiheng@huawei.com> writes:
>> >> -----Original Message-----
>> >> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> >> Sent: Friday, July 17, 2020 5:04 PM
>> >> To: xiezhiheng <xiezhiheng@huawei.com>
>> >> Cc: Richard Biener <richard.guenther@gmail.com>;
>> gcc-patches@gcc.gnu.org
>> >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
>> >> emitted at -O3
>> >>
>> >
>> > Cut...
>> >
>> >>
>> >> Thanks, pushed to master.
>> >>
>> >> Richard
>> >
>> > And I have finished the second part.
>> >
>> > In function aarch64_general_add_builtin, I add an argument ATTRS to
>> > pass attributes for each built-in function.
>> >
>> > And some new functions are added:
>> > aarch64_call_properties: return flags for each built-in function based
>> > on command-line options.  When the built-in function handles
>> > floating-points, add FLAG_FP flag.
>> >
>> > aarch64_modifies_global_state_p: True if the function would modify
>> > global states.
>> >
>> > aarch64_reads_global_state_p: True if the function would read
>> > global states.
>> >
>> > aarch64_could_trap_p: True if the function would raise a signal.
>> >
>> > aarch64_add_attribute: Add attributes in ATTRS.
>> >
>> > aarch64_get_attributes: return attributes for each built-in functons
>> > based on flags and command-line options.
>> >
>> > In function aarch64_init_simd_builtins, attributes are get by flags
>> > and pass them to function aarch64_general_add_builtin.
>> >
>> >
>> > Bootstrap is tested OK on aarch64 Linux platform, but regression
>> > FAIL one test case ---- pr93423.f90.
>> > However, I found that this test case would fail randomly in trunk.
>> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93423
>> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96041
>> > Some PRs have tracked it.  After my patch, this test case would
>> > always fail.  I guess the syntax errors in fortran crash some structures
>> > result in illegal memory access but I can't find what exactly it is.
>> > But I think my patch should have no influence on it.
>> 
>> Yeah, I agree.  And FWIW, I didn't see this in my testing.
>> 
>> I've pushed the patch with one trivial change: to remove the “and”
>> before “CODE” in:
>> 
>> >  /* Wrapper around add_builtin_function.  NAME is the name of the
>> built-in
>> >     function, TYPE is the function type, and CODE is the function subcode
>> > -   (relative to AARCH64_BUILTIN_GENERAL).  */
>> > +   (relative to AARCH64_BUILTIN_GENERAL), and ATTRS is the function
>> > +   attributes.  */
>> 
>> BTW, one thing to be careful of in future is that not all FP intrinsics
>> raise FP exceptions.  So while:
>> 
>> > +  switch (d->mode)
>> > +    {
>> > +    /* Floating-point.  */
>> > +    case E_BFmode:
>> > +    case E_V4BFmode:
>> > +    case E_V8BFmode:
>> > +    case E_HFmode:
>> > +    case E_V4HFmode:
>> > +    case E_V8HFmode:
>> > +    case E_SFmode:
>> > +    case E_V2SFmode:
>> > +    case E_V4SFmode:
>> > +    case E_DFmode:
>> > +    case E_V1DFmode:
>> > +    case E_V2DFmode:
>> > +      flags |= FLAG_FP;
>> > +      break;
>> > +
>> > +    default:
>> > +      break;
>> > +    }
>> 
>> is a good, conservatively-correct default, we might need an additional
>> flag to suppress it for certain intrinsics.
>> 
>
> I agree.
>
>> I've just realised that the code above could have used FLOAT_MODE_P,
>> but I didn't think of that before pushing the patch :-)
>> 
>
> Sorry, I should have used it.  And I prepare a patch to use FLOAT_MODE_P
> macro and add a flag FLAG_SUPPRESS_FP_EXCEPTIONS to suppress
> FLAG_RAISE_FP_EXCEPTIONS for certain intrinsics in future.

The same thing is true for reading FPCR as well, so I think the flag
should suppress the FLOAT_MODE_P check, instead of fixing up the flags
afterwards.

I'm struggling to think of a good name though.  How about adding
FLAG_AUTO_FP and making the FLOAT_MODE_P check dependent on FLAG_AUTO_FP
being set?

We could leave FLAG_AUTO_FP out of FLAG_ALL, since FLAG_ALL already
includes FLAG_FP.  Including it in FLAG_ALL wouldn't do no any harm
though.

Thanks,
Richard


More information about the Gcc-patches mailing list