[GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks
Sudakshina Das
sudi.das@arm.com
Mon Jul 16 10:55:00 GMT 2018
Hi Sam
On 13/07/18 17:09, Sam Tebbs wrote:
> Hi all,
>
> This patch adds an optimisation that exploits the AArch64 BFXIL instruction
> when or-ing the result of two bitwise and operations with non-overlapping
> bitmasks (e.g. (a & 0xFFFF0000) | (b & 0x0000FFFF)).
>
> Example:
>
> unsigned long long combine(unsigned long long a, unsigned long long b) {
>   return (a & 0xffffffff00000000ll) | (b & 0x00000000ffffffffll);
> }
>
> void read2(unsigned long long a, unsigned long long b, unsigned long long *c,
>   unsigned long long *d) {
>   *c = combine(a, b); *d = combine(b, a);
> }
>
> When compiled with -O2, read2 would result in:
>
> read2:
>  and  x5, x1, #0xffffffff
>  and  x4, x0, #0xffffffff00000000
>  orr  x4, x4, x5
>  and  x1, x1, #0xffffffff00000000
>  and  x0, x0, #0xffffffff
>  str  x4, [x2]
>  orr  x0, x0, x1
>  str  x0, [x3]
>   ret
>
> But with this patch results in:
>
> read2:
>   mov  x4, x1
>   bfxil x4, x0, 0, 32
>   str  x4, [x2]
>   bfxil x0, x1, 0, 32
>   str  x0, [x3]
>   ret
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and aarch64-none-elf with no regressions.
>
I am not a maintainer, but I have a question about this patch. I may be
missing something or reading it wrong, so feel free to point that out:
+(define_insn "*aarch64_bfxil"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+    (ior:DI (and:DI (match_operand:DI 1 "register_operand" "r")
+          (match_operand 3 "const_int_operand"))
+        (and:DI (match_operand:DI 2 "register_operand" "0")
+          (match_operand 4 "const_int_operand"))))]
+  "INTVAL (operands[3]) == ~INTVAL (operands[4])
+    && aarch64_is_left_consecutive (INTVAL (operands[3]))"
+  {
+    HOST_WIDE_INT op4 = INTVAL (operands[4]);
+    operands[3] = GEN_INT (64 - ceil_log2 (op4));
+    output_asm_insn ("bfxil\\t%0, %1, 0, %3", operands);
In the BFXIL you are reading %3 LSB bits from operand 1 and putting them
in the LSBs of %0. This means the pattern should be masking the 64-%3
MSBs of %0 and the %3 LSBs of %1. So shouldn't operand 4 be
LEFT_CONSECUTIVE?
Could you please compare a simpler version of the example you gave above
to make sure the generated assembly is equivalent before and after the patch:
void read2(unsigned long long a, unsigned long long b, unsigned long long *c) {
 *c = combine(a, b);
}
From the above text, before and after the patch:
read2:
 and  x5, x1, #0xffffffff
 and  x4, x0, #0xffffffff00000000
 orr  x4, x4, x5
read2:
 mov  x4, x1
 bfxil x4, x0, 0, 32
This does not seem equivalent to me.
Thanks
Sudi
+    return "";
+  }
+  [(set_attr "type" "bfx")]
+)
> gcc/
> 2018-07-11  Sam Tebbs  <sam.tebbs@arm.com>
>
>         * config/aarch64/aarch64.md (*aarch64_bfxil, *aarch64_bfxil_alt):
>         Define.
>         * config/aarch64/aarch64-protos.h (aarch64_is_left_consecutive):
>         Define.
>         * config/aarch64/aarch64.c (aarch64_is_left_consecutive): New function.
>
> gcc/testsuite
> 2018-07-11  Sam Tebbs  <sam.tebbs@arm.com>
>
>         * gcc.target/aarch64/combine_bfxil.c: New file.
>         * gcc.target/aarch64/combine_bfxil_2.c: New file.
>