57186 – implement load sinking in loops

Bug 57186 - implement load sinking in loops

Summary: implement load sinking in loops

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	4.9.0

Importance:	P3 enhancement
Target Milestone:	5.0
Assignee:	Not yet assigned to anyone

URL:	http://gcc.gnu.org/ml/gcc-patches/201...
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2013-05-06 14:50 UTC by Eric Botcazou
Modified:	2021-07-26 22:09 UTC (History)
CC List:	1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2013-05-07 00:00:00

Attachments
Original implementation (5.43 KB, patch) 2013-05-06 14:50 UTC, Eric Botcazou	Details \| Diff
Testcase #1 (214 bytes, text/plain) 2013-05-06 14:51 UTC, Eric Botcazou	Details
Testcase #2 (232 bytes, text/plain) 2013-05-06 14:51 UTC, Eric Botcazou	Details
Testcase #3 (240 bytes, text/plain) 2013-05-06 14:52 UTC, Eric Botcazou	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Eric Botcazou 2013-05-06 14:50:44 UTC

Created attachment 30040 [details]
Original implementation

Even at -O3, the compiler doesn't figure out that the following loop is dumb:

#define SIZE 64

int foo (int v[])
{
  int r;

  for (i = 0; i < SIZE; i++)
    r = v[i];

  return r;
}

This isn't entirely unexpected, as it probably matters only for (slightly) pathological cases.  The attached patch nevertheless implements a form of load sinking in loops so as to optimize these cases.  It's combined with invariant motion to optimize:

int foo (int v[], int a)
{
  int r, i;

  for (i = 0; i < SIZE; i++)
    r = v[i] + a;

  return r;
}

and with store sinking to optimize:

int foo (int v1[], int v2[])
{
  int r[SIZE];
  int i, j;

  for (j = 0; j < SIZE; j++)
    for (i = 0; i < SIZE; i++)
      r[j] = v1[j] + v2[i];

  return r[SIZE - 1];
}

The optimization is enabled at -O2 in the patch for measurement purposes but, 
given how rarely it triggers (e.g. exactly 10 occurrences in a GCC bootstrap, 
compiler-only, all languages except Go), it's probably best suited to -O3.

It also comes with 3 testcases.

Comment 1 Eric Botcazou 2013-05-06 14:51:30 UTC

Created attachment 30041 [details]
Testcase #1

Comment 2 Eric Botcazou 2013-05-06 14:51:58 UTC

Created attachment 30042 [details]
Testcase #2

Comment 3 Eric Botcazou 2013-05-06 14:52:25 UTC

Created attachment 30043 [details]
Testcase #3

Comment 4 Richard Biener 2013-05-07 09:02:48 UTC

In theory we have the SCCP patch to address at least case #1.  It looks
at loop exit PHIs and tries to replace them with the overall effect of
the loop (using SCEV analysis, which is why it fails for loads).

  <bb 4>:
  # i_15 = PHI <i_10(3), 0(2)>
  _4 = (long unsigned int) i_15;
  _5 = _4 * 4;
  _7 = v_6(D) + _5;
  r_9 = *_7;
  i_10 = i_15 + 1;
  if (i_10 != 64)
    goto <bb 3>;
  else
    goto <bb 5>;

  <bb 5>:
  # r_20 = PHI <r_9(4)>

the complication with handling the above in existing passes (for example
code sinking) is that the final value of _7 is needed to sink the dependent
computations.  So I think what SCCP should do here is look at the
single-use chain from the defs of r_20 and analyze the evolution of
the dependent loop variant SSA names, performing sinking of the dependencies
as well.  Conveniently LIM runs before SCCP and performs the necessary
store sinking to allow handling of the last case.

Comment 5 Richard Biener 2014-06-10 10:30:16 UTC

Author: rguenth
Date: Tue Jun 10 10:29:44 2014
New Revision: 211404

URL: http://gcc.gnu.org/viewcvs?rev=211404&root=gcc&view=rev
Log:
2014-06-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/57186
	PR tree-optimization/59299
	* gcc.dg/tree-ssa/ssa-sink-11.c: New testcase.
	* gcc.dg/tree-ssa/ssa-sink-12.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-12.c
Modified:
    trunk/gcc/testsuite/ChangeLog

Comment 6 Andrew Pinski 2021-07-26 22:09:33 UTC

Fixed in GCC 5 by r5-1146 .