[PATCH, rs6000] Map dcbtstt, dcbtt to n2=0 for __builtin_prefetch builtin.
Carl Love
cel@us.ibm.com
Mon May 7 20:36:00 GMT 2018
GCC maintainers:
The architecture independent builtin __builtin_prefetch() is defined
as:
void __builtin_prefetch (const void *addr, int n1, int n2)
n1 - prefetch read = 0, prefetch write = 1
n2 - temporal locality 0 to 3. No temporal locality = 0,
     high temporal locality = 3.
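For reference, a minimal sketch of how the builtin is typically used in a streaming read loop. The helper name sum_streaming and the prefetch distance of 8 elements are assumptions made for illustration only; they are not part of the patch or of any test case.

```c
#include <stddef.h>

/* Hypothetical helper: sums a[0..n-1] while hinting upcoming reads.
   n1 = 0 requests a read prefetch, n2 = 0 says the data has no
   temporal locality (it is touched once and not reused).  */
long
sum_streaming (const long *a, size_t n)
{
  const size_t ahead = 8;  /* assumed prefetch distance, not tuned */
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    {
      /* Only form addresses that lie inside the array.  */
      if (i + ahead < n)
        __builtin_prefetch (&a[i + ahead], 0, 0);
      sum += a[i];
    }
  return sum;
}
```

The prefetch is only a hint, so the result of the function is the same with or without it. Note that n1 and n2 must be compile-time constants.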
The implementation for Power maps to define_insn "prefetch" in
gcc/config/rs6000/rs6000.md. The Power implementation currently
ignores the value of n2 and simply generates the dcbtst and dcbt
instructions.
This patch maps n2=0 to the dcbtstt mnemonic (dcbtst with a TH
value of 0b10000) for a write prefetch, and generates dcbtst for n2 in
the range [1,3].
The dcbtt mnemonic (dcbt with a TH value of 0b10000) is generated for a
read prefetch when n2=0, and the dcbt instruction is generated for n2 in
the range [1,3].
The ISA states that the value TH = 0b10000 is a hint that the processor
will probably soon perform a load from the addressed block.
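The selection described above can be sketched as a stand-alone C function. This is only an illustration of the mapping; the function name prefetch_mnemonic is hypothetical, and the sketch leaves out the operand formatting ("0,%0" versus "%a0") that the real output template handles.

```c
/* Hypothetical mirror of the mnemonic selection in the patch.
   n1: 0 = read prefetch, nonzero = write prefetch.
   n2: temporal locality, 0 (none) to 3 (high).  */
const char *
prefetch_mnemonic (int n1, int n2)
{
  if (n1 == 0)
    /* Read: the transient form only when there is no locality.  */
    return n2 ? "dcbt" : "dcbtt";

  /* Write: same split, using the store variants.  */
  return n2 ? "dcbtst" : "dcbtstt";
}
```

Only n2=0 selects the TH = 0b10000 forms; the values 1, 2 and 3 are not distinguished from one another.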
There is an existing test case in
gcc/testsuite/gcc.target/sh/prefetch.dump. The test case generates the
following output with the patch:
gcc -g -c -o prefetch prefetch.c
objdump -S -d prefetch > prefetch.dump
more prefetch.dump
...
__builtin_prefetch (&data[0], 0, 0);
   c:   2c 00 3f 39     addi    r9,r31,44
  10:   2c 4a 00 7e     dcbtt   0,r9
__builtin_prefetch (&data[0], 0, 1);
  14:   2c 00 3f 39     addi    r9,r31,44
  18:   2c 4a 00 7c     dcbt    0,r9
__builtin_prefetch (&data[0], 0, 2);
  1c:   2c 00 3f 39     addi    r9,r31,44
  20:   2c 4a 00 7c     dcbt    0,r9
__builtin_prefetch (&data[0], 0, 3);
  24:   2c 00 3f 39     addi    r9,r31,44
  28:   2c 4a 00 7c     dcbt    0,r9
__builtin_prefetch (&data[0], 1, 0);
  2c:   2c 00 3f 39     addi    r9,r31,44
  30:   ec 49 00 7e     dcbtstt 0,r9
__builtin_prefetch (&data[0], 1, 1);
  34:   2c 00 3f 39     addi    r9,r31,44
  38:   ec 49 00 7c     dcbtst  0,r9
__builtin_prefetch (&data[0], 1, 2);
  3c:   2c 00 3f 39     addi    r9,r31,44
  40:   ec 49 00 7c     dcbtst  0,r9
__builtin_prefetch (&data[0], 1, 3);
  44:   2c 00 3f 39     addi    r9,r31,44
  48:   ec 49 00 7c     dcbtst  0,r9
...
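In the dump above, the n2=0 encodings differ from the n2=[1,3] encodings only in the last byte (7e versus 7c), which is where the TH field lives. A small sketch for checking this by hand follows. It assumes the X-form layout as I read it in the ISA (primary opcode in the top 6 bits, TH in the next 5 bits, extended opcode in bits 1 to 10 counting from the least significant bit), and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Assemble a 32-bit instruction word from the four bytes objdump
   prints for a little-endian object (lowest address first).  */
uint32_t
insn_from_le_bytes (const uint8_t b[4])
{
  return (uint32_t) b[0]
         | ((uint32_t) b[1] << 8)
         | ((uint32_t) b[2] << 16)
         | ((uint32_t) b[3] << 24);
}

/* TH field: the 5 bits just below the 6-bit primary opcode
   (assumed X-form layout).  */
unsigned
insn_th (uint32_t insn)
{
  return (insn >> 21) & 0x1f;
}

/* Extended opcode: 10 bits above the lowest bit
   (278 for dcbt, 246 for dcbtst under the same assumption).  */
unsigned
insn_xo (uint32_t insn)
{
  return (insn >> 1) & 0x3ff;
}
```

For the bytes 2c 4a 00 7e (the dcbtt line) this gives TH = 16 = 0b10000 and extended opcode 278; for 2c 4a 00 7c (plain dcbt) it gives TH = 0 with the same extended opcode.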
The regression testing of the patch was done on
powerpc64le-unknown-linux-gnu (Power 8 LE)
with no regressions.
Please let me know if the patch looks OK for GCC mainline.
                        Carl Love
----------------------------------------------------------------
gcc/ChangeLog:
2018-05-07 Carl Love <cel@us.ibm.com>
* config/rs6000/rs6000.md: Add dcbtstt, dcbtt instruction generation
to define_insn prefetch.
---
gcc/config/rs6000/rs6000.md | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2b15cca..7429d33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13233,10 +13233,19 @@
(match_operand:SI 2 "const_int_operand" "n"))]
""
{
- if (GET_CODE (operands[0]) == REG)
- return INTVAL (operands[1]) ? "dcbtst 0,%0" : "dcbt 0,%0";
- return INTVAL (operands[1]) ? "dcbtst %a0" : "dcbt %a0";
-}
+ if (GET_CODE (operands[0]) == REG) {
+ if (INTVAL (operands[1]) == 0)
+ return INTVAL (operands[2]) ? "dcbt 0,%0" : "dcbtt 0,%0";
+ else
+ return INTVAL (operands[2]) ? "dcbtst 0,%0" : "dcbtstt 0,%0";
+
+ } else {
+ if (INTVAL (operands[1]) == 0)
+ return INTVAL (operands[2]) ? "dcbt %a0" : "dcbtt %a0";
+ else
+ return INTVAL (operands[2]) ? "dcbtst %a0" : "dcbtstt %a0";
+ }
+ }
[(set_attr "type" "load")])
;; Handle -fsplit-stack.
--
2.7.4