[Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c
Fri Apr 26 15:02:00 GMT 2019
On 4/26/19 8:52 AM, Richard Earnshaw (lists) wrote:
> On 26/04/2019 15:13, Jeff Law wrote:
>> On 4/16/19 10:29 AM, Steve Ellcey wrote:
>>> Re-ping. I know there are discussions about bigger changes to fix the
>>> various failures listed in PR rtl-optimization/87763 but this patch
>>> at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c).
>>> Steve Ellcey
>> So here's my take on fixing the lsl_asr_sbfix.c regression.
>> As I mentioned earlier the problem is the aarch64 is using the old way
>> of describing bitfield extractions (extv, extzv). Specifically it
>> defined a single expander that operated on the natural word mode (DImode).
>> This forces the input operand to be used in DImode as well. So we get
>> those annoying subregs in the expressions that combine generates. But
>> there's a better way :-)
>> The new way to handle this stuff is to define expanders for supported
>> modes (SI and DI for aarch64). Interestingly enough the aarch64 port
>> already does this for bitfield insertions via insv.
>> So I made the obvious changes to the extv/extzv expander. That fixes
>> the lsl_asr_sbfiz regression. But doesn't bootstrap. The availability
>> of the new expander makes extract_bit_field_using_extv want to extract a
>> field from an HFmode object and shove it into an SImode. That runs
>> afoul of checks in validate_subreg (as it should). We shouldn't be
>> using subregs to change the size of a floating point object, so that
>> needs to be filtered out in extract_bit_field_using_extv.
>> That fixes the bootstrap issue. But regression testing trips over
>> ashtidisi.c. That can be easily fixed by allowing zero/sign extractions
>> which change size. ie:
>> (set (reg:DI) (sign_extract:DI (reg:SI) ...)))
>> Which seems like a reasonable thing to handle.
>> So here's what I've got. I've bootstrapped and regression tested on
>> aarch64. It's also bootstrapped on aarch64_be for good measure.
>> OK (from aarch64 maintainers) for the gcc-9 branch and trunk? Or we
>> could just address on the trunk for gcc-10. I don't have strong
>> preferences either way.
>> ps. Time to return insv regressions and address Segher's comments :-)
>> * aarch64.md (extv, extzv expander): Generalize so that it works
>> for both SImode and DImode.
>> (extv_3264, extzv_3264): New pattern to handle extractions with
>> size change between the input and output operand.
>> * expmed.c (extract_bitfield_using_extv): Do not allow changing
>> sizes of floating point objects using SUBREGs.
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index 5a1894063a1..13e2bca05a1 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -5390,16 +5390,16 @@
>> ;; Bitfields
>> ;; -------------------------------------------------------------------
>> -(define_expand "<optab>"
>> - [(set (match_operand:DI 0 "register_operand" "=r")
>> - (ANY_EXTRACT:DI (match_operand:DI 1 "register_operand")
>> +(define_expand "<optab><mode>"
>> + [(set (match_operand:GPI 0 "register_operand" "=r")
>> + (ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand")
>> (match_operand 2
>> - "aarch64_simd_shift_imm_offset_di")
>> - (match_operand 3 "aarch64_simd_shift_imm_di")))]
>> + "aarch64_simd_shift_imm_offset_<mode>")
>> + (match_operand 3 "aarch64_simd_shift_imm_<mode>")))]
>> if (!IN_RANGE (INTVAL (operands) + INTVAL (operands),
>> - 1, GET_MODE_BITSIZE (DImode) - 1))
>> + 1, GET_MODE_BITSIZE (<MODE>mode) - 1))
>> @@ -5418,6 +5418,21 @@
>> [(set_attr "type" "bfx")]
>> +;; Similar to the previous pattern, but 32->64 extraction
>> +(define_insn "*<optab>_3264"
>> + [(set (match_operand:DI 0 "register_operand" "=r")
>> + (ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r")
> So is that valid RTL (DI extract of SI value)? Why wouldn't that just
> use a paradoxical subreg to widen the register being extracted?
It wasn't clear to me -- I couldn't find anything which indicated it
wasn't valid. Given that the RTL code explicitly sign/zero extends the
value we should have the freedom to have the destination be wider.
No idea why combine didn't wrap it in a subreg when it didn't match the
first time around. I suspect it's never been worth doing until we added
the multi-mode extraction expander support and the targets where we're
using multi-mode extraction expanders didn't have a testcase where it'd
More information about the Gcc-patches