This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Loop induction variable optimization question

From: "Steve Ellcey " <sellcey at mips dot com>
To: <gcc at gcc dot gnu dot org>
Date: Mon, 17 Jun 2013 10:07:41 -0700
Subject: Loop induction variable optimization question

I have a loop induction variable question involving post increment.
If I have this loop:

void *memcpy_word_ptr(int * __restrict d, int * __restrict s, unsigned int n )
{
  int i;
  for(i=0; i<n; i++) {*d++ = *s++; }
  return d;
}

and compile it with: -O3 -fno-tree-loop-distribute-patterns,
the loop induction variable pass (ivopts) converts this loop:

  <bb 4>:
  # d_22 = PHI <d_10(6), d_5(D)(3)>
  # s_23 = PHI <s_11(6), s_6(D)(3)>
  # i_24 = PHI <i_14(6), 0(3)>
  d_10 = d_22 + 4;
  s_11 = s_23 + 4;
  _12 = *s_23;
  *d_22 = _12;
  i_14 = i_24 + 1;
  i.2_8 = (unsigned int) i_14;
  if (i.2_8 < n_9(D))
    goto <bb 6>;  # bb 6 just loops back to bb 4
  else
    goto <bb 5>;

into this loop (using -4 offsets, printed as 4294967292B, to compensate for
incrementing the 's' and 'd' variables before their use):

  <bb 4>:
  # d_22 = PHI <d_10(6), d_5(D)(3)>
  # s_23 = PHI <s_11(6), s_6(D)(3)>
  # i_24 = PHI <i_14(6), 0(3)>
  d_10 = d_22 + 4;
  s_11 = s_23 + 4;
  _12 = MEM[base: s_11, offset: 4294967292B];
  MEM[base: d_10, offset: 4294967292B] = _12;
  i_14 = i_24 + 1;
  if (i_14 != _2)
    goto <bb 6>;  # bb 6 just loops back to bb 4
  else
    goto <bb 5>;
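
For anyone reading the dump: the 4294967292B offset is just -4 printed as an
unsigned 32-bit value. A minimal sketch of that arithmetic, assuming a 32-bit
offset type (the helper name offset_as_unsigned32 is made up for illustration
and is not anything in GCC):

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a signed byte offset as the unsigned
   32-bit value that the dump prints.  Conversion to an unsigned type is
   defined to wrap modulo 2^32. */
static uint32_t offset_as_unsigned32(int32_t offset)
{
    return (uint32_t)offset;
}
```

Calling offset_as_unsigned32(-4) gives 4294967292, i.e. 2^32 - 4, which matches
the offset in the MEM[] expressions.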


But if I increment s and d by hand after the copy like this:

void *memcpy_word_ptr(int * __restrict d, int * __restrict s, unsigned int n )
{
  int i;
  for(i=0; i<n; i++) {*d = *s; d++; s++; }
  return d;
}

Then ivopts converts this loop:

  <bb 4>:
  # d_22 = PHI <d_12(6), d_5(D)(3)>
  # s_23 = PHI <s_13(6), s_6(D)(3)>
  # i_24 = PHI <i_14(6), 0(3)>
  _10 = *s_23;
  *d_22 = _10;
  d_12 = d_22 + 4;
  s_13 = s_23 + 4;
  i_14 = i_24 + 1;
  i.0_8 = (unsigned int) i_14;
  if (i.0_8 < n_9(D))
    goto <bb 6>;  # bb 6 just loops back to bb 4
  else
    goto <bb 5>;

into this loop (with 0 offsets):

  <bb 4>:
  # d_22 = PHI <d_12(6), d_5(D)(3)>
  # s_23 = PHI <s_13(6), s_6(D)(3)>
  # i_24 = PHI <i_14(6), 0(3)>
  _10 = MEM[base: s_23, offset: 0B];
  MEM[base: d_22, offset: 0B] = _10;
  d_12 = d_22 + 4;
  s_13 = s_23 + 4;
  i_14 = i_24 + 1;
  if (i_14 != _2)
    goto <bb 6>;  # bb 6 just loops back to bb 4
  else
    goto <bb 5>;


My question is: why (and where) did ivopts decide to move the
post-increments above the usages in the first loop?  In my case
(MIPS) the second loop generates better code than the first
loop, and I would like to avoid the '-4' offsets that are used.
Ideally, one would think that GCC should generate the same code
for both of these loops, but it does not.
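
To make the difference concrete, here is a rough source-level sketch of the
two addressing shapes that come out of ivopts. This is only an illustration
of the dumps, not what GCC emits; the function names are made up, and it
assumes sizeof(int) == 4 so that a -4 byte offset corresponds to index -1:

```c
/* Shape of the second loop: access at offset 0, then bump the pointers. */
static int *copy_then_increment(int *restrict d, const int *restrict s,
                                unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        *d = *s;
        d++;
        s++;
    }
    return d;
}

/* Shape of the first loop after ivopts: the pointers are bumped first and
   the access uses the incremented pointer with a -4 byte (index -1) offset. */
static int *increment_then_copy(int *restrict d, const int *restrict s,
                                unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        d++;
        s++;
        d[-1] = s[-1];
    }
    return d;
}
```

Both functions copy the same n words and return the same end pointer; the
only difference is whether the load and store address the old pointer at
offset 0 or the new pointer at offset -4.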

Steve Ellcey
sellcey@mips.com

Follow-Ups:
- Re: Loop induction variable optimization question
  - From: Oleg Endo