This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing



So I've gone round and round with BZ39726.

The basic idea here is we have an operation on widened operands that feeds a narrow comparison. ie, the two inputs are QImode, extended to SImode, subtracted, then we want to conditionally branch on the result of sign bit of QImode.

If we could just do the subtraction in the native mode of the operands (QImode) and test that result, we eliminate two extensions and the test itself can be eliminated as we'll get the QImode sign bit from the QImode subtraction.


I first started poking at combine to see if I could exploit the newer 4 insn combination capability to see enough INSNs to prove we could narrow the operations *and* arrange for the resulting insns to be well suited for redundant cmp/tst removal.

This was going to require some extensions to the simplification code, the code to determine good split points, the when to use 4 insn combination heuristics and possibly other stuff. I was able to see all the insns I needed but the RTL I was going to have to detect and rewrite was getting pretty fugly.

I changed course and started looking at twiddling the m68k backend. I built a proof of concept pair of splitters that would rewrite the key 4 insns and it clearly improved the code. But the result was not well suited for removing the redundant tst. To also arrange for result to be suitable for redundant tst/cmp removal, the two proof of concept splitters would have to merely be bridge patterns to an even larger splitter which would give us access to the conditional branch so that we could rewrite the compare and conditional branch too in such a way to make it obvious the result of the subtraction was sufficient to remove the tst/cmp instruction.

I then wondered if I could use the new match.pd code to describe what I wanted at a higher level, possibly getting to simpler code while still working on generic/gimple. Richi speculated that the new match.pd approach could be used for type demotions -- I was skeptical, but what the hell.


+/* If we are testing a single bit resulting from a binary
+   operation in precision P1 where the operands were widened
+   precision P2 and the tested bit is the sign bit for
+   precision P2.  Rewrite so the binary operation is in
+   precision P2.  */
+(for inner_op (minus plus mult bit_and bit_ior bit_xor)
+  (simplify
+    (bit_and (inner_op (convert @0) (convert @1)) INTEGER_CST@3)
+    (if (GENERIC
+ && exact_log2 (TREE_INT_CST_LOW (@3)) + 1 == TYPE_PRECISION (TREE_TYPE (@0))
+         && TREE_TYPE (@0) == TREE_TYPE (@1)
+        && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)))
+     /* This is a bit hackish, but I didn't see a good way to change
+        the type of the result.  Note that changing the result like
+        this implies that we are not operating on gimple.  */
+     (with {type = TREE_TYPE (@0); }
+      (convert (bit_and (inner_op @0 @1) (convert @3)))))))



The exact_log2 (blah) + 1 == TYPE_PRECISION (blah2) stuff ensures that we're looking at just the sign bit in the same mode/precision as the two operands of the binary operation.

We're changing the type of the entire expression to match the native type of the operands. So we have to reassign "type" to the new type we want to use. This also implies we're working on generic rather than gimple. It's a bit of a hack, but I couldn't convince the code to use a different type for the final result.

I guess it could be made to work on gimple as well by expanding it to include the conditional branch this pattern feeds -- that way we could avoid the type mismatch.

With that match to match.pd, I get the desired code on the m68k for the test in the testcase as well as several of my own. I've also verified this makes _no_ difference to the resulting runtime libraries for x86 and x86_64.

I'll spin a m68k bootstrap over the weekend, but wanted to get some feedback on the match.pd pattern and perhaps a cleaner way to deal with changing the resultant type.
Thoughts, comments?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]