This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PING][PATCH v3] Disable reg offset in quad-word store for Falkor
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "siddhesh at sourceware dot org" <siddhesh at sourceware dot org>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Thu, 15 Feb 2018 14:20:32 +0000
- Subject: Re: [PING][PATCH v3] Disable reg offset in quad-word store for Falkor
Hi Siddhesh,
I still don't like the idea of disabling a whole class of instructions in the md file.
It seems much better to adjust the costs here so that you get most of the
improvement now, and fine-tune it once we can differentiate between
loads and stores.
Taking your example, adding -funroll-loops generates this for Falkor:
ldr q7, [x2, x18]
add x5, x18, 16
add x4, x1, x18
add x10, x18, 32
add x11, x1, x5
add x3, x18, 48
add x12, x1, x10
add x9, x18, 64
add x14, x1, x3
add x8, x18, 80
add x15, x1, x9
add x7, x18, 96
add x16, x1, x8
str q7, [x4]
ldr q16, [x2, x5]
add x6, x18, 112
add x17, x1, x7
add x18, x18, 128
add x5, x1, x6
cmp x18, x13
str q16, [x11]
ldr q17, [x2, x10]
str q17, [x12]
ldr q18, [x2, x3]
str q18, [x14]
ldr q19, [x2, x9]
str q19, [x15]
ldr q20, [x2, x8]
str q20, [x16]
ldr q21, [x2, x7]
str q21, [x17]
ldr q22, [x2, x6]
str q22, [x5]
bne .L25
If you adjust the costs, however, you'd get this:
.L25:
ldr q7, [x14]
add x14, x14, 128
add x4, x4, 128
str q7, [x4, -128]
ldr q16, [x14, -112]
str q16, [x4, -112]
ldr q17, [x14, -96]
str q17, [x4, -96]
ldr q18, [x14, -80]
str q18, [x4, -80]
ldr q19, [x14, -64]
str q19, [x4, -64]
ldr q20, [x14, -48]
str q20, [x4, -48]
ldr q21, [x14, -32]
str q21, [x4, -32]
ldr q22, [x14, -16]
cmp x14, x9
str q22, [x4, -16]
bne .L25
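For comparison: the first listing is 34 instructions per 128 bytes copied, 16 of them adds that only compute addresses, while the second is 20 instructions with 2 adds. As a rough illustration of the two loop shapes, here is a C sketch. The function names and the 64-byte unroll factor are made up for illustration and are not taken from the patch or the original testcase; the compiler is free to pick either addressing form for either function, so the source only mirrors the shape of the two listings.

```c
#include <stddef.h>
#include <string.h>

/* Shape of the first listing: a single shared index, so every access is
   base + index.  If register-offset stores are not used, each store
   address has to be formed by a separate add.
   Assumes nbytes is a multiple of 16. */
void copy_indexed(unsigned char *dst, const unsigned char *src, size_t nbytes)
{
    for (size_t off = 0; off < nbytes; off += 16)
        memcpy(dst + off, src + off, 16);
}

/* Shape of the second listing: both pointers are bumped once per
   iteration and every access uses a constant offset from them.
   Assumes nbytes is a multiple of 64 (hypothetical unroll factor). */
void copy_bumped(unsigned char *dst, const unsigned char *src, size_t nbytes)
{
    const unsigned char *end = src + nbytes;

    while (src != end) {
        src += 64;
        dst += 64;
        memcpy(dst - 64, src - 64, 16);
        memcpy(dst - 48, src - 48, 16);
        memcpy(dst - 32, src - 32, 16);
        memcpy(dst - 16, src - 16, 16);
    }
}
```

Both functions copy the same bytes; only the way the addresses are formed differs, which is the part the address costs steer.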
So it seems to me using existing cost mechanisms is always preferable, even if you
currently can't differentiate between loads and stores.
Wilco