This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH: Properly check the end of basic block


On Wed, Nov 17, 2010 at 2:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Nov 16, 2010 at 11:51 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Wed, Nov 17, 2010 at 7:44 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>
>>> insn != BB_END (bb) && NEXT_INSN (insn) == NEXT_INSN (BB_END (bb))
>>>
>>> We should check NEXT_INSN (insn) != NEXT_INSN (BB_END (bb)) in
>>> move_or_delete_vzeroupper_2. ?This patch does it.
>>
>> Huh? The loop does simple linear scan of all insns in the bb, so it
>> can't miss BB_END. IIUC, in your case the bb does not have BB_END
>> (bb), but it has NEXT_INSN (BB_END (bb))?
>
> It has BB_END, but it won't be visited by NEXT_INSN starting from
> BB_HEAD. insn != NEXT_INSN (BB_END (bb)) is used to check the
> end of the BB everywhere in gcc.
>
>> Can you please provide a test case that illustrates this?
>>
>
> I am enclosing a work in progress. ?We noticed that we are
> missing a few vzerouppers at -O3 on SPEC CPU 2K/2006.
> One isssue is we may have
>
> foo:
>
> ? ? ? call bar <<<<< Missing vzeroupper
>
> ? ? ? 256bit vectorized insn
> ? ? ? goto foo
>
> We miss vzeroupper before call bar. ?We don't have a small testcase.
> But this patch fixes this case by inspection. We are checking other
> cases.

@@ -118,9 +118,12 @@ move_or_delete_vzeroupper_2 (basic_block bb, bool
upper_128bits_set)
 	     bb->index, upper_128bits_set);

   insn = BB_HEAD (bb);
+  last = NEXT_INSN (BB_END (bb));
   while (insn != BB_END (bb))
     {
       insn = NEXT_INSN (insn);
+      if (insn == last)
+	break;

       if (!NONDEBUG_INSN_P (insn))
 	continue;

The change above is not needed. The new check is never triggered - the
loop terminates when "insn == BB_END (bb)" at "while", so I fail to
see why additional termination for "NEXT_INSN (insn) == NEXT_INSN
(BB_END (bb))" is needed.

(The BB_HEAD (bb) is either a NOTE or CODE_LABEL so it can be skipped
with NEXT_INSN.)

@@ -10970,7 +10973,7 @@ ix86_expand_epilogue (int style)

   /* Emit vzeroupper if needed.  */
   if (TARGET_VZEROUPPER
-      && cfun->machine->use_avx256_p
+      && (cfun->machine->use_avx256_p || flag_tree_vectorize)
       && !cfun->machine->caller_return_avx256_p)
     {
       cfun->machine->use_vzeroupper_p = 1;
@@ -21661,7 +21664,8 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
     }

   /* Add UNSPEC_CALL_NEEDS_VZEROUPPER decoration.  */
-  if (TARGET_VZEROUPPER && cfun->machine->use_avx256_p)
+  if (TARGET_VZEROUPPER
+      && (cfun->machine->use_avx256_p || flag_tree_vectorize))

Decorate *ALL* calls with CALL_NEEDS_VZEROUPPER with
-ftree-vectorize?! It looks that parts (or state machine) that set
...->use_avx256_p flag should be fixed.

     {
       rtx unspec;
       int avx256;
diff --git a/gcc/testsuite/gcc.target/i386/avx-vzeroupper-20.c
b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-20.c
new file mode 100644
index 0000000..3301083
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-20.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx -mtune=generic -dp" } */
+
+extern void free (void *);
+void
+bar (void *ncstrp)
+{
+  if(ncstrp==((void *)0))
+    return;
+  free(ncstrp);
+}
+
+/* { dg-final { scan-assembler-not "avx_vzeroupper" } } */

Hm, this testcase doesn't go together with the above change. There is
no vectorization involved, and the scan checks that vzeroupper is NOT
emitted.

Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]