Bug 58508 - [Missed-Optimization] Redundant vector load of "actual" loop invariant in loop body.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
 
Reported: 2013-09-23 18:23 UTC by Cong Hou
Modified: 2013-11-11 19:31 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-09-24 00:00:00


Description Cong Hou 2013-09-23 18:23:25 UTC
When GCC vectorizes the loop below, it first versions the loop with a runtime aliasing check on a and b. Since a and b have different strides (1 and 0), the check guarantees that a and b do not alias in any iteration. Under this precondition, *b is loop invariant and can be loaded outside the loop during vectorization (the precondition always holds in the vectorized version of the loop). This saves a load and a shuffle instruction in each iteration.


void foo (int* a, int* b, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] += *b;
}
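
To make the intended transformation concrete, here is a source-level sketch of what the versioned loop could look like once the load is hoisted. It is only an illustration: the names foo_versioned and may_alias are hypothetical, and the overlap test is a simplified stand-in for the runtime alias check the vectorizer actually emits.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC function): returns 1 if the scalar
   *b lies inside a[0..n).  A simplified stand-in for the runtime
   alias check generated by loop versioning.  */
static int may_alias (const int *a, const int *b, int n)
{
  if (n <= 0)
    return 0;

  uintptr_t lo = (uintptr_t) a;
  uintptr_t hi = (uintptr_t) (a + n);
  uintptr_t p = (uintptr_t) b;
  return p >= lo && p < hi;
}

/* Hand-written equivalent of the versioned loop.  */
void foo_versioned (int *a, int *b, int n)
{
  if (!may_alias (a, b, n))
    {
      /* No store to a[i] can change *b, so *b is invariant here
         and is loaded once, before the loop.  */
      int inv = *b;
      for (int i = 0; i < n; ++i)
        a[i] += inv;
    }
  else
    {
      /* Fallback version: *b must be reloaded in every iteration.  */
      for (int i = 0; i < n; ++i)
        a[i] += *b;
    }
}
```

Only the first branch is a candidate for vectorization; in it, the invariant value can be broadcast into a vector register once instead of being loaded and shuffled in every iteration.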


I have a patch that handles this case as an optimization. After loop versioning, it detects all zero-stride data references and hoists their loads to the loop preheader. The patch is shown below.


thanks,
Cong



Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c	(revision 202662)
+++ gcc/tree-vect-loop-manip.c	(working copy)
@@ -2477,6 +2477,37 @@ vect_loop_versioning (loop_vec_info loop
       adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
     }
 
+  /* Extract load and store statements on pointers with zero-stride 
+     accesses.  */
+  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+    {
+
+      /* In the loop body, we iterate each statement to check if it is a load 
+	 or store. Then we check the DR_STEP of the data reference.  If 
+	 DR_STEP is zero, then we will hoist the load statement to the loop 
+	 preheader, and move the store statement to the loop exit.  */
+
+      for (gimple_stmt_iterator si = gsi_start_bb (loop->header); 
+	    !gsi_end_p (si); )
+	{
+	  gimple stmt = gsi_stmt (si);
+	  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+	  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
+
+	  if (dr && integer_zerop (DR_STEP (dr)))
+	    {
+	      if (DR_IS_READ (dr))
+		{
+		  basic_block preheader = loop_preheader_edge (loop)->src;
+		  gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
+		  gsi_move_after (&si, &si_dst);
+		}
+	    }
+	  else
+	    gsi_next (&si);
+	}
+    } 
+
   /* End loop-exit-fixes after versioning.  */
 
   if (cond_expr_stmt_list)
Comment 1 Richard Biener 2013-09-24 07:47:38 UTC
While the observation is correct, the fix is not.  Please just emit the
load on the preheader edge, like we do for other dt_external vectors we
materialize.
Comment 2 Cong Hou 2013-10-15 20:57:27 UTC
Thank you for the comment. I have modified the patch by using 

gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt);

to move the statement. 


I have resubmitted the patch.


Thank you!


Cong
Comment 3 Jeffrey A. Law 2013-10-19 05:20:26 UTC
Author: law
Date: Sat Oct 19 05:20:24 2013
New Revision: 203842

URL: http://gcc.gnu.org/viewcvs?rev=203842&root=gcc&view=rev
Log:
	PR tree-optimization/58508
	* tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant
	statement that contains data refs with zero-step.

	* gcc.dg/vect/pr58508.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr58508.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-loop-manip.c
Comment 4 Bernd Edlinger 2013-10-27 06:25:12 UTC
Hi,

the test case is failing on i686-pc-linux-gnu.
Reason: -msse2 is not enabled by default.
If I add -msse2 to dg-options, the test passes.

Regards
Bernd.
Comment 5 Cong Hou 2013-10-29 00:09:20 UTC
I guess I should add 

/* { dg-require-effective-target vect_int } */

to the test case. Is that right?
Comment 6 Bernd Edlinger 2013-10-29 13:50:14 UTC
(In reply to Cong Hou from comment #5)
> I guess I should add 
> 
> /* { dg-require-effective-target vect_int } */
> 
> to the test case. Is that right?

Yes.
Comment 7 Cong Hou 2013-10-29 17:22:45 UTC
OK. I made a new patch to fix this problem; it is waiting for approval.


thanks,
Cong



diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9d0f4a5..3d9916d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-10-29  Cong Hou  <congh@google.com>
+
+       * gcc.dg/vect/pr58508.c: Update.
+
 2013-10-15  Cong Hou  <congh@google.com>

        * gcc.dg/vect/pr58508.c: New test.
diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c b/gcc/testsuite/gcc.dg/vect/pr58508.c
index 6484a65..fff7a04 100644
--- a/gcc/testsuite/gcc.dg/vect/pr58508.c
+++ b/gcc/testsuite/gcc.dg/vect/pr58508.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target vect_int } */
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */





Comment 8 Cong Hou 2013-11-08 18:59:48 UTC
Author: congh
Date: Fri Nov  8 18:44:46 2013
New Revision: 204590

URL: http://gcc.gnu.org/viewcvs?rev=204590&root=gcc&view=rev
Log:
2013-11-08  Cong Hou  <congh@google.com>

	PR tree-optimization/58508
	* gcc.dg/vect/pr58508.c: Update.


Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/pr58508.c
Comment 9 Cong Hou 2013-11-11 19:31:44 UTC