This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: The arm patch 161344 to transform TST into LSLS


On 02/07/10 23:06, Carrot Wei wrote:
Hi Richard

The following patch has been tested on arm qemu. Could you take a look?

thanks
Guozhi

ChangeLog:
2010-07-02  Wei Guozhi  <carrot@google.com>

         * thumb2.md (peephole2 to convert zero_extract/compare of lowest bits
         to lshift/compare): New.


This is basically fine. But one minor nit.


With op2 equal to 32 the new shift amount is 32 - 32 = 0, and an lsl ...,#0 in Thumb1 isn't a left shift at all, but a movs (a picky assembler might also reject the construct, as some versions of the ARM ARM state that the shift amount should be in the range 1...31). Fortunately, I doubt this pattern would ever be generated for that case, since a zero_extract of all 32 bits of an SImode value would simplify into the original operand.

So please change the limit on op2 to be less than 32.

OK with that change.

R.
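
The range argument above can be sketched in C. This is only an illustration of the arithmetic, not part of the patch: shift_for_width and width_accepted are hypothetical helper names, and width stands for the value of operands[2] before the preparation statement rewrites it.

```c
#include <stdbool.h>

/* Hypothetical helper: the shift amount the peephole's preparation
   statement computes from the zero_extract width (operands[2]).  */
static int shift_for_width(int width)
{
    return 32 - width;
}

/* Hypothetical helper: the condition as requested in the review,
   0 < width < 32, so the resulting shift always lies in 1...31.  */
static bool width_accepted(int width)
{
    return width > 0 && width < 32;
}
```

With the limit as originally posted (width <= 32), a width of 32 would be accepted and would produce a shift of 0; with the stricter limit every accepted width maps to a shift between 1 and 31.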

Index: thumb2.md
===================================================================
--- thumb2.md	(revision 161725)
+++ thumb2.md	(working copy)
@@ -1501,3 +1501,29 @@
 			     VOIDmode, operands[0], const0_rtx);
   ")

+(define_peephole2
+  [(set (match_operand:CC_NOOV 0 "cc_register" "")
+       (compare:CC_NOOV (zero_extract:SI
+                         (match_operand:SI 1 "low_register_operand" "")
+                         (match_operand:SI 2 "const_int_operand" "")
+                         (const_int 0))
+                        (const_int 0)))
+   (match_scratch:SI 3 "l")
+   (set (pc)
+       (if_then_else (match_operator:CC_NOOV 4 "equality_operator"
+                      [(match_dup 0) (const_int 0)])
+                     (match_operand 5 "" "")
+                     (match_operand 6 "" "")))]
+  "TARGET_THUMB2
+   && (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) <= 32)"
+  [(parallel [(set (match_dup 0)
+                  (compare:CC_NOOV (ashift:SI (match_dup 1) (match_dup 2))
+                                   (const_int 0)))
+             (clobber (match_dup 3))])
+   (set (pc)
+       (if_then_else (match_op_dup 4 [(match_dup 0) (const_int 0)])
+                     (match_dup 5) (match_dup 6)))]
+  "
+  operands[2] = GEN_INT (32 - INTVAL (operands[2]));
+  ")
+

On Fri, Jul 2, 2010 at 5:15 PM, Richard Earnshaw <rearnsha@arm.com> wrote:

On Fri, 2010-07-02 at 08:53 +0800, Carrot Wei wrote:
Hi Richard

The new peephole2 and the old pattern perform different optimizations. As
you have described, the peephole2 can optimize the cases that test a
single bit in a word. But the old pattern tests whether the bit field at
the low end of a word is equal or not equal to zero, and the bit field may
contain more than 1 bit. Interestingly, the test case that came with the old
pattern fits both situations. If we change the test case as
follows, it shows the regression.

struct A
{
   int v:2;
};


int bar();

int foo(struct A* p)
{
   if (p->v)
     return 1;
   return bar();
}

So we need another peephole2 to bring that optimization back.

thanks
Guozhi

Yes, a peep2 for that should be pretty straightforward to generate. Simply transform the code into a left shift and a compare with 0, then a branch if eq/ne.

R.
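
As a rough illustration of why the left shift is equivalent to the bit-field test, here is a minimal C sketch. It works on a raw 32-bit word rather than on struct A, because bit-field layout is implementation-defined; the function names are hypothetical, and width is assumed to be in the range 1...31.

```c
#include <stdbool.h>
#include <stdint.h>

/* Test of the low `width` bits written as a mask, which is what a
   zero_extract at bit position 0 compared against 0 expresses.
   Assumes 1 <= width <= 31.  */
static bool low_bits_nonzero_mask(uint32_t word, int width)
{
    uint32_t mask = (UINT32_C(1) << width) - 1;
    return (word & mask) != 0;
}

/* The same test written as a left shift: shifting by 32 - width pushes
   every bit above the field out of the word, so the result is zero
   exactly when the field is zero.  Assumes 1 <= width <= 31.  */
static bool low_bits_nonzero_shift(uint32_t word, int width)
{
    return (uint32_t)(word << (32 - width)) != 0;
}
```

For the 2-bit field in the test case, width is 2 and the shift is 30, so bits 2 and above of the word never influence the result in either form.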







