Bug 57186 - implement load sinking in loops
Summary: implement load sinking in loops
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.9.0
: P3 enhancement
Target Milestone: 5.0
Assignee: Not yet assigned to anyone
URL: http://gcc.gnu.org/ml/gcc-patches/201...
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2013-05-06 14:50 UTC by Eric Botcazou
Modified: 2021-07-26 22:09 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-05-07 00:00:00


Attachments
Original implementation (5.43 KB, patch)
2013-05-06 14:50 UTC, Eric Botcazou
Details | Diff
Testcase #1 (214 bytes, text/plain)
2013-05-06 14:51 UTC, Eric Botcazou
Details
Testcase #2 (232 bytes, text/plain)
2013-05-06 14:51 UTC, Eric Botcazou
Details
Testcase #3 (240 bytes, text/plain)
2013-05-06 14:52 UTC, Eric Botcazou
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Botcazou 2013-05-06 14:50:44 UTC
Created attachment 30040 [details]
Original implementation

Even at -O3, the compiler doesn't figure out that the following loop is dumb:

#define SIZE 64

int foo (int v[])
{
  int r;

  for (i = 0; i < SIZE; i++)
    r = v[i];

  return r;
}

This isn't entirely unexpected, as it probably matters only for (slightly) pathological cases.  The attached patch nevertheless implements a form of load sinking in loops so as to optimize these cases.  It's combined with invariant motion to optimize:

int foo (int v[], int a)
{
  int r, i;

  for (i = 0; i < SIZE; i++)
    r = v[i] + a;

  return r;
}

and with store sinking to optimize:

int foo (int v1[], int v2[])
{
  int r[SIZE];
  int i, j;

  for (j = 0; j < SIZE; j++)
    for (i = 0; i < SIZE; i++)
      r[j] = v1[j] + v2[i];

  return r[SIZE - 1];
}

The optimization is enabled at -O2 in the patch for measurement purposes but, 
given how rarely it triggers (e.g. exactly 10 occurrences in a GCC bootstrap, 
compiler-only, all languages except Go), it's probably best suited to -O3.

It also comes with 3 testcases.
Comment 1 Eric Botcazou 2013-05-06 14:51:30 UTC
Created attachment 30041 [details]
Testcase #1
Comment 2 Eric Botcazou 2013-05-06 14:51:58 UTC
Created attachment 30042 [details]
Testcase #2
Comment 3 Eric Botcazou 2013-05-06 14:52:25 UTC
Created attachment 30043 [details]
Testcase #3
Comment 4 Richard Biener 2013-05-07 09:02:48 UTC
In theory we have the SCCP patch to address at least case #1.  It looks
at loop exit PHIs and tries to replace them with the overall effect of
the loop (using SCEV analysis, which is why it fails for loads).

  <bb 4>:
  # i_15 = PHI <i_10(3), 0(2)>
  _4 = (long unsigned int) i_15;
  _5 = _4 * 4;
  _7 = v_6(D) + _5;
  r_9 = *_7;
  i_10 = i_15 + 1;
  if (i_10 != 64)
    goto <bb 3>;
  else
    goto <bb 5>;

  <bb 5>:
  # r_20 = PHI <r_9(4)>

the complication with handling the above in existing passes (for example
code sinking) is that the final value of _7 is needed to sink the dependent
computations.  So I think what SCCP should do here is look at the
single-use chain from the defs of r_20 and analyze the evolution of
the dependent loop variant SSA names, performing sinking of the dependencies
as well.  Conveniently LIM runs before SCCP and performs the necessary
store sinking to allow handling of the last case.
Comment 5 Richard Biener 2014-06-10 10:30:16 UTC
Author: rguenth
Date: Tue Jun 10 10:29:44 2014
New Revision: 211404

URL: http://gcc.gnu.org/viewcvs?rev=211404&root=gcc&view=rev
Log:
2014-06-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/57186
	PR tree-optimization/59299
	* gcc.dg/tree-ssa/ssa-sink-11.c: New testcase.
	* gcc.dg/tree-ssa/ssa-sink-12.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-12.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 6 Andrew Pinski 2021-07-26 22:09:33 UTC
Fixed in GCC 5 by r5-1146 .