This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][aarch64] Avoid tag collisions for loads on falkor
- From: Siddhesh Poyarekar <siddhesh at gotplt dot org>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, Siddhesh Poyarekar <siddhesh at sourceware dot org>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: James Greenhalgh <James dot Greenhalgh at arm dot com>
- Date: Mon, 2 Jul 2018 16:39:22 +0530
- Subject: Re: [PATCH][aarch64] Avoid tag collisions for loads on falkor
- References: <email@example.com> <5B39F78D.firstname.lastname@example.org>
On 07/02/2018 03:29 PM, Kyrill Tkachov wrote:
Nice! What were the regressions though? Would be nice to adjust the tests
to make them more robust so that we have as clean a testsuite as possible.
Sure, they're gcc.dg/guality/pr36728-2.c and gcc.target/aarch64/extend.c.
The addressing mode costs for falkor lead to generation of an sbfiz +
ldr for extend.c instead of the ldr with sxtw. Luis is looking at
whether that is the best output for falkor or if it needs to be
improved. I suspect this may result in a cost adjustment.
pr36728-2.c reorders code and seems to throw off gdb but the codegen
seems correct. This patch is not responsible for this regression though
(nor extend.c) so I didn't look too far beyond verifying that the
codegen wasn't incorrect.
More comments inline, but a general observation:
in the function comment for the new functions can you please include a
description of the function arguments and the meaning of the return value
(for example, some functions return -1; what does that mean?).
It really does make it much easier to maintain the code after some time.
+ rudimentary attempt to ensure that related loads with the same tag
+ don't get moved out unnecessarily.
+ tag_insn_info (rtx_insn *insn, rtx dest, rtx base, rtx offset,
+ bool writeback, bool ldp)
+ this->insn = insn;
+ this->dest = dest;
+ this->base = base;
+ this->offset = offset;
+ this->writeback = writeback;
+ this->ldp = ldp;
Since this is C++ you can write it as the more idiomatic constructor
initialiser list (I think that's what it's called):
tag_insn_info (rtx_insn *i, rtx b, rtx d, rtx o, bool wr, bool l) : insn
(i), base (b), dest (d) etc.
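A self-contained sketch of that suggestion, with the GCC-internal rtx types stubbed out for illustration (the real code would of course use the actual rtx and rtx_insn types):

```cpp
#include <cassert>

// Stand-ins for GCC's internal types, purely for illustration.
struct rtx_insn;
typedef void *rtx;

struct tag_insn_info
{
  rtx_insn *insn;
  rtx dest, base, offset;
  bool writeback, ldp;

  // Member-initializer list instead of assignments in the body.
  tag_insn_info (rtx_insn *i, rtx d, rtx b, rtx o, bool wr, bool l)
    : insn (i), dest (d), base (b), offset (o), writeback (wr), ldp (l)
  {}
};
```

Besides being idiomatic, the initializer-list form initializes the members directly rather than default-constructing and then assigning them.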
+ /* Compute the tag based on BASE, DEST and OFFSET of the load. */
+ unsigned tag ()
+ unsigned int_offset = 0;
+ rtx offset = this->offset;
+ unsigned dest = REGNO (this->dest);
+ unsigned base = REGNO (this->base);
+ machine_mode dest_mode = GET_MODE (this->dest);
+ unsigned dest_mode_size = GET_MODE_SIZE (dest_mode).to_constant ();
I appreciate this pass is unlikely to be used with SVE code but it would
be nice if we could make it variable-width-mode-proof. Current practice
is to add a comment to .to_constant () calls explaining why we guarantee
that the size is constant, or otherwise check is_constant () and have
appropriate fallbacks. Check other uses of to_constant () and
is_constant () in aarch64.c for examples. This applies to all uses of
to_constant () in this file.
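The is_constant () pattern being suggested can be sketched like this; the poly_size type below is a minimal stand-in for GCC's poly_uint64 (the real code would call GET_MODE_SIZE (mode).is_constant (&size)), and the tag computation is a placeholder:

```cpp
#include <cassert>

// Minimal stand-in for GCC's poly_uint64: a size that may or may not
// be a compile-time constant (variable for SVE-style modes).
struct poly_size
{
  unsigned coeff;    // constant part of the size
  bool variable;     // true for variable-width (SVE-style) modes

  bool is_constant (unsigned *out) const
  {
    if (variable)
      return false;
    *out = coeff;
    return true;
  }
};

// The suggested pattern: check is_constant () and fall back gracefully
// instead of calling to_constant () unconditionally.
bool tag_from_size (poly_size sz, unsigned *tag)
{
  unsigned size;
  if (!sz.is_constant (&size))
    return false;  // variable-width mode: skip tagging rather than crash
  *tag = size & 0xf;  // placeholder tag computation
  return true;
}
```

The point is that to_constant () on a genuinely variable size is an internal error, so either the call site must prove (and document) that the size is constant, or it must handle the variable case.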
+ recog_memoized (insn);
Did you mean to continue here if recog_memoized (insn) < 0?
I didn't, thanks for catching that.
+ /* Don't bother with very long strides because the prefetcher
+ is unable to train on them anyway. */
+ if (INTVAL (stride) < 2048)
+ return true;
I appreciate this is core-specific, but can you please at least make it
a #define constant with a meaningful name and use that?
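Something along these lines; the macro name here is only a suggestion, with the 2048 cutoff taken from the patch:

```cpp
#include <cassert>

/* Maximum load stride the core's hardware prefetcher can train on.
   The name is hypothetical; the value is the cutoff from the patch.  */
#define MAX_PREFETCH_STRIDE 2048

/* Return true if the prefetcher can train on loads with this stride.  */
static bool
stride_trainable_p (long stride)
{
  return stride < MAX_PREFETCH_STRIDE;
}
```

That way the magic number is documented at its definition and can be adjusted in one place if the tuning changes.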
+ /* The largest width we want to bother with is a load of a pair of