[PATCH, rs6000] Map dcbtstt, dcbtt to n2=0 for __builtin_prefetch builtin.

Carl Love cel@us.ibm.com
Mon May 7 20:36:00 GMT 2018


GCC maintainers:

The architecture independent builtin __builtin_prefetch() is defined
as:

void __builtin_prefetch (const void *addr, int n1, int n2)
 
n1 - prefetch read = 0, prefetch write = 1
n2 - temporal locality 0 to 3.  No temporal locality = 0, high temporal
     locality = 3.
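
For reference, a minimal sketch of how a caller might use the builtin
with n1=0 and n2=0, the case this patch changes for reads.  The function
name sum_streaming and the prefetch distance PREFETCH_AHEAD are
assumptions made for illustration only; they do not come from GCC or
from the test case.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed prefetch distance, in elements.  Chosen arbitrarily for this
   sketch; a real value would have to be tuned per machine.  */
#define PREFETCH_AHEAD 16

/* Hypothetical helper: sum an array that is read exactly once, so each
   block is requested for reading (n1 = 0) with no temporal locality
   (n2 = 0).  Both n1 and n2 must be compile-time constants.  */
long
sum_streaming (const long *data, size_t n)
{
  long sum = 0;

  assert (data != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      /* Only prefetch addresses that lie inside the array.  */
      if (i + PREFETCH_AHEAD < n)
        __builtin_prefetch (&data[i + PREFETCH_AHEAD], 0, 0);
      sum += data[i];
    }
  return sum;
}
```

The prefetch is only a hint, so the result of the function is the same
whether or not the target emits a prefetch instruction for it.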

The implementation for Power maps to define_insn "prefetch" in
gcc/config/rs6000/rs6000.md.  The Power implementation currently
ignores the value of n2 and simply generates dcbtst for a write
prefetch and dcbt for a read prefetch.

This patch maps n2=0 to the dcbtstt mnemonic (dcbtst with a TH value
of 0b10000) for a write prefetch, and generates dcbtst for n2 in the
range [1,3].

The dcbtt mnemonic (dcbt with a TH value of 0b10000) is generated for a
read prefetch when n2=0, and the dcbt instruction is generated for n2 in
the range [1,3].
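
The mapping described above can be summarized with a small stand-alone
model.  This is only a sketch of the selection logic, not the code in
rs6000.md; the function name prefetch_mnemonic and its parameter names
are hypothetical.

```c
#include <assert.h>

/* Stand-alone model of the mnemonic selection, not the rs6000.md code.
   rw is n1 (0 = read, 1 = write); locality is n2 (0 to 3).  */
const char *
prefetch_mnemonic (int rw, int locality)
{
  assert (rw == 0 || rw == 1);
  assert (locality >= 0 && locality <= 3);

  /* n2 == 0 selects the TH=0b10000 forms; any other n2 keeps the
     instruction that was generated before the patch.  */
  if (rw == 0)
    return locality ? "dcbt" : "dcbtt";
  return locality ? "dcbtst" : "dcbtstt";
}
```

For example, prefetch_mnemonic (1, 0) returns "dcbtstt", while
prefetch_mnemonic (1, 2) returns "dcbtst".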

The ISA states that the value TH = 0b10000 is a hint that the processor
will probably soon perform a load from the addressed block. 
     
There is an existing test case in
gcc/testsuite/gcc.target/sh/prefetch.dump.  The test case generates the
following output with the patch:

gcc -g -c -o prefetch prefetch.c
objdump -S -d prefetch > prefetch.dump
more prefetch.dump

...
  __builtin_prefetch (&data[0], 0, 0);                                          
   c:   2c 00 3f 39     addi    r9,r31,44                                       
  10:   2c 4a 00 7e     dcbtt   0,r9                                            
  __builtin_prefetch (&data[0], 0, 1);                                          
  14:   2c 00 3f 39     addi    r9,r31,44                                       
  18:   2c 4a 00 7c     dcbt    0,r9                                            
  __builtin_prefetch (&data[0], 0, 2);                                          
  1c:   2c 00 3f 39     addi    r9,r31,44                                       
  20:   2c 4a 00 7c     dcbt    0,r9                                            
  __builtin_prefetch (&data[0], 0, 3);                                          
  24:   2c 00 3f 39     addi    r9,r31,44                                       
  28:   2c 4a 00 7c     dcbt    0,r9                                            
  __builtin_prefetch (&data[0], 1, 0);                                          
  2c:   2c 00 3f 39     addi    r9,r31,44                                       
  30:   ec 49 00 7e     dcbtstt 0,r9                                            
  __builtin_prefetch (&data[0], 1, 1);                                          
  34:   2c 00 3f 39     addi    r9,r31,44                                       
  38:   ec 49 00 7c     dcbtst  0,r9                                            
  __builtin_prefetch (&data[0], 1, 2);                                          
  3c:   2c 00 3f 39     addi    r9,r31,44                                       
  40:   ec 49 00 7c     dcbtst  0,r9                                            
  __builtin_prefetch (&data[0], 1, 3);                                          
  44:   2c 00 3f 39     addi    r9,r31,44                                       
  48:   ec 49 00 7c     dcbtst  0,r9 
...

The patch was regression tested on

   powerpc64le-unknown-linux-gnu (Power 8 LE)

with no regressions.

Please let me know if the patch looks OK for GCC mainline.

                         Carl Love
----------------------------------------------------------------

    gcc/ChangeLog:

    2018-05-07  Carl Love  <cel@us.ibm.com>

        * config/rs6000/rs6000.md: Add dcbtstt, dcbtt instruction generation
	to define_insn prefetch.
---
 gcc/config/rs6000/rs6000.md | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2b15cca..7429d33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13233,10 +13233,19 @@
 	     (match_operand:SI 2 "const_int_operand" "n"))]
   ""
 {
-  if (GET_CODE (operands[0]) == REG)
-    return INTVAL (operands[1]) ? "dcbtst 0,%0" : "dcbt 0,%0";
-  return INTVAL (operands[1]) ? "dcbtst %a0" : "dcbt %a0";
-}
+  if (GET_CODE (operands[0]) == REG) {
+    if (INTVAL (operands[1]) == 0)
+      return INTVAL (operands[2]) ? "dcbt 0,%0" : "dcbtt 0,%0";
+    else
+      return INTVAL (operands[2]) ? "dcbtst 0,%0" : "dcbtstt 0,%0";
+
+  } else {
+    if (INTVAL (operands[1]) == 0)
+      return INTVAL (operands[2]) ? "dcbt %a0" : "dcbtt %a0";
+    else
+      return INTVAL (operands[2]) ? "dcbtst %a0" : "dcbtstt %a0";
+  }
+}
   [(set_attr "type" "load")])
 

 ;; Handle -fsplit-stack.
-- 
2.7.4


