This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, libcpp]: Use asm flag outputs in search_line_sse42 main loop


On 06/29/2015 08:07 PM, Uros Bizjak wrote:
Index: lex.c
===================================================================
--- lex.c	(revision 225138)
+++ lex.c	(working copy)
@@ -450,15 +450,30 @@ search_line_sse42 (const uchar *s, const uchar *en
        s = (const uchar *)((si + 16) & -16);
      }

-  /* Main loop, processing 16 bytes at a time.  By doing the whole loop
-     in inline assembly, we can make proper use of the flags set.  */
-  __asm (      "sub $16, %1\n"
-	"	.balign 16\n"
+  /* Main loop, processing 16 bytes at a time.  */
+#ifdef __GCC_ASM_FLAG_OUTPUTS__
+  while (1)
+    {
+      char f;
+      __asm ("%vpcmpestri\t$0, %2, %3"
+	     : "=c"(index), "=@ccc"(f)
+	     : "m"(*s), "x"(search), "a"(4), "d"(16));
+      if (f)
+	break;
+
+      s += 16;
+    }

This change looks good. Modulo keeping a comment mentioning why we can't use the builtin.

+#else
+  s -= 16;
+  /* By doing the whole loop in inline assembly,
+     we can make proper use of the flags set.  */
+  __asm (      ".balign 16\n"
  	"0:	add $16, %1\n"
-	"	%vpcmpestri $0, (%1), %2\n"
+	"	%vpcmpestri\t$0, (%1), %2\n"
  	"	jnc 0b"
  	: "=&c"(index), "+r"(s)
  	: "x"(search), "a"(4), "d"(16));
+#endif

I do wonder about keeping this bit around. Surely we only really care about the performance of search_line after a full bootstrap, at which point we've got the new path.

I think maybe better to adjust the #ifdef HAVE_SSE4 line above to include the G_A_F_O check.


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]