[Aarch64][PATCH] Improve Logical And Immediate Expressions
Michael Collison
Michael.Collison@arm.com
Wed Nov 23 16:30:00 GMT 2016
Hi Christophe,
This is not a regression per se; the patch causes the test case to generate one less instruction overall, but one additional 'and'. Trunk before the patch (-O2):
foo:
and w0, w0, 255
lsl w1, w0, 20
orr w0, w1, w0, lsl 8
mov w1, 65024
movk w1, 0xfe0, lsl 16
and w0, w0, w1
ret
The new trunk aarch64 compiler with the patch generates fewer instructions but one additional 'and'
foo:
and w0, w0, 255
lsl w1, w0, 20
orr w0, w1, w0, lsl 8
and x0, x0, 268434944
and x0, x0, -2031617
ret
-----Original Message-----
From: Christophe Lyon [mailto:christophe.lyon@linaro.org]
Sent: Wednesday, November 23, 2016 6:42 AM
To: James Greenhalgh
Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd
Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
Hi Michael,
On 21 November 2016 at 10:52, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
>> James,
>>
>> I incorporated all your suggestions, and successfully bootstrapped
>> and re-ran the testsuite.
>>
>> Okay for trunk?
>>
>> 2016-11-18 Michael Collison <michael.collison@arm.com>
>>
>> * config/aarch64/aarch64-protos.h
>> (aarch64_and_split_imm1, aarch64_and_split_imm2)
>> (aarch64_and_bitmask_imm): New prototypes
>> * config/aarch64/aarch64.c (aarch64_and_split_imm1):
>> New overloaded function to create bit mask covering the
>> lowest to highest bits set.
>> (aarch64_and_split_imm2): New overloaded functions to create bit
>> mask of zeros between first and last bit set.
>> (aarch64_and_bitmask_imm): New function to determine if a integer
>> is a valid two instruction "and" operation.
>> * config/aarch64/aarch64.md:(and<mode>3): New define_insn and _split
>> allowing wider range of constants with "and" operations.
>> * (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
>> "and" operator from matching restricted constant range used for
>> ior and xor operators.
>> * config/aarch64/constraints.md (UsO constraint): New SImode constraint
>> for constants in "and" operantions.
>> (UsP constraint): New DImode constraint for constants in "and" operations.
>> * config/aarch64/iterators.md (lconst2): New mode iterator.
>> (LOGICAL2): New code iterator.
>> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
>> predicate
>> (aarch64_logical_and_operand): New predicate allowing extended constants
>> for "and" operations.
>> * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>>
>
>> diff --git a/gcc/config/aarch64/aarch64-protos.h
>> b/gcc/config/aarch64/aarch64-protos.h
>> index 3cdd69b..7ef8cdf 100644
>> --- a/gcc/config/aarch64/aarch64-protos.h
>> +++ b/gcc/config/aarch64/aarch64-protos.h
>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>> HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned,
>> unsigned); int aarch64_get_condition_code (rtx); bool
>> aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT
>> +val_in); unsigned HOST_WIDE_INT aarch64_and_split_imm2
>> +(HOST_WIDE_INT val_in); bool aarch64_and_bitmask_imm (unsigned
>> +HOST_WIDE_INT val_in, machine_mode mode);
>> int aarch64_branch_cost (bool, bool); enum aarch64_symbol_type
>> aarch64_classify_symbolic_expression (rtx); bool
>> aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT); diff --git
>> a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index
>> 3e663eb..8e33c40 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>> return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>> }
>>
>> +/* Create mask of ones, covering the lowest to highest bits set in
>> +VAL_IN. */
>> +
>> +unsigned HOST_WIDE_INT
>> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in) {
>> + int lowest_bit_set = ctz_hwi (val_in);
>> + int highest_bit_set = floor_log2 (val_in);
>> + gcc_assert (val_in != 0);
>
> The comment above the function should make this precondition clear. Or
> you could pick a well defined behaviour for when val_in == 0 (return
> 0?), if that fits the rest of the algorithm.
>
> Otherwise, this patch looks OK to me.
>
> Thanks,
> James
>
This patch (r242739) causes a regression on aarch64:
gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails.
Christophe.
More information about the Gcc-patches
mailing list