[PATCH] Do not use a REG_EQUAL value for gcse/cprop if SRC is a REG

Steven Bosscher stevenb.gcc@gmail.com
Sun May 25 11:56:00 GMT 2008


On Sun, May 25, 2008 at 11:59 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
>> Bootstrapped&tested on powerpc-linux-gnu (32 and 64 bit).  OK for trunk?
>
> This is ok.

Thanks.  Could you commit it for me?


Of course I have also looked at the remaining things that a second PRE
GCSE pass still catches.  One simple test case (again for powerpc) is
this:

======================
struct cpp_dir
{
  struct cpp_dir *next;
};

static struct cpp_dir *tails[4];

void
add_cpp_dir_path (struct cpp_dir *p, int chain)
{
  if (tails[chain])
    tails[chain]->next = p;
  tails[chain] = p;
}
======================


After the GIMPLE optimizations, the function looks like this (from the
final_cleanup dump):
======================
;; Function add_cpp_dir_path (add_cpp_dir_path)

add_cpp_dir_path (struct cpp_dir * p, int chain)
{
  struct cpp_dir * D.1256;

<bb 2>:
  D.1256 = tails[chain];
  if (D.1256 != 0B)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  D.1256->next = p;

<bb 4>:
  tails[chain] = p;
  return;

}
======================

The store "tails[chain] = p;"  shows the missed optimization.  Before
the first gcse pass, the RTL looks like this:

======================
    5 NOTE_INSN_BASIC_BLOCK
    2 r120:SI=%3:SI
      REG_DEAD: %3:SI
    3 r121:SI=%4:SI
      REG_DEAD: %4:SI
    4 NOTE_INSN_FUNCTION_BEG
    7 r123:SI=high(`*.LANCHOR0')
    8 r122:SI=r123:SI+low(`*.LANCHOR0')
      REG_DEAD: r123:SI
      REG_EQUAL: `*.LANCHOR0'
    9 r124:SI=r121:SI<<0x2
   10 r125:SI=r122:SI+r124:SI
      REG_DEAD: r124:SI
      REG_DEAD: r122:SI
   11 r119:SI=[r125:SI]
      REG_DEAD: r125:SI
   12 r126:CC=cmp(r119:SI,0x0)
   13 pc={(r126:CC==0x0)?L16:pc}
      REG_DEAD: r126:CC
      REG_BR_PROB: 0x3f6
   14 NOTE_INSN_BASIC_BLOCK
   15 [r119:SI]=r120:SI
      REG_DEAD: r119:SI
L16:
   17 NOTE_INSN_BASIC_BLOCK
   18 r128:SI=high(`*.LANCHOR0')
   19 r127:SI=r128:SI+low(`*.LANCHOR0')
      REG_DEAD: r128:SI
      REG_EQUAL: `*.LANCHOR0'
   20 r129:SI=r121:SI<<0x2
      REG_DEAD: r121:SI
   21 r130:SI=r127:SI+r129:SI
      REG_DEAD: r129:SI
      REG_DEAD: r127:SI
   22 [r130:SI]=r120:SI
      REG_DEAD: r130:SI
      REG_DEAD: r120:SI
======================

Insns 18 and 19 are fully redundant.   Note that insn 18 actually
wouldn't have to be eliminated if insn 19 is eliminated.  But for insn
19 PRE GCSE performs PRE on the REG_EQUAL note.  Had insns 18 and 19
been partially redundant, the value of insn 18 would have been
inserted twice (once for insn 18 itself and once when multiple insns
are generated from the REG_EQUAL note on insn 19).  Anyway, insn 19 is
the interesting case, because it exposes the "secondary" redundant
expression for the second PRE GCSE pass.  After the first PRE GCSE
pass, the RTL looks as below. The value of insn 19 is loaded into r133
in insn 8 :

======================
    5 NOTE_INSN_BASIC_BLOCK
    2 r120:SI=%3:SI
      REG_DEAD: %3:SI
    3 r121:SI=%4:SI
      REG_DEAD: %4:SI
    4 NOTE_INSN_FUNCTION_BEG
    7 r131:SI=high(`*.LANCHOR0')
    8 r133:SI=r131:SI+low(`*.LANCHOR0')
      REG_DEAD: r131:SI
      REG_EQUAL: `*.LANCHOR0'
    9 r132:SI=r121:SI<<0x2
      REG_DEAD: r121:SI
   10 r125:SI=r133:SI+r132:SI
   11 r119:SI=[r125:SI]
      REG_DEAD: r125:SI
   12 r126:CC=cmp(r119:SI,0x0)
   13 pc={(r126:CC==0x0)?L16:pc}
      REG_DEAD: r126:CC
      REG_BR_PROB: 0x3f6
   14 NOTE_INSN_BASIC_BLOCK
   15 [r119:SI]=r120:SI
      REG_DEAD: r119:SI
L16:
   17 NOTE_INSN_BASIC_BLOCK
   21 r130:SI=r133:SI+r132:SI
      REG_DEAD: r133:SI
      REG_DEAD: r132:SI
      REG_EQUAL: r132:SI+`*.LANCHOR0'
   22 [r130:SI]=r120:SI
      REG_DEAD: r130:SI
      REG_DEAD: r120:SI
======================

Now, the expression in insn 21 is also fully redundant.  It was
already fully redundant before, from a values point of view, but since
the expressions in insn 10 and insn 21 are not lexically equivalent
before the first PRE GCSE pass, PRE GCSE cannot detect this
redundancy.  That's why I call it a "secondary redundancy": It is only
exposed after eliminating other redundancies.  After the first PRE
GCSE pass and the cleanups that CPROP/CSE perform, the redundancy is
eliminated in the second PRE GCSE pass, resulting in the following
RTL:

======================
    5 NOTE_INSN_BASIC_BLOCK
    2 r120:SI=%3:SI
      REG_DEAD: %3:SI
    3 r121:SI=%4:SI
      REG_DEAD: %4:SI
    4 NOTE_INSN_FUNCTION_BEG
    7 r131:SI=high(`*.LANCHOR0')
    8 r133:SI=r131:SI+low(`*.LANCHOR0')
      REG_DEAD: r131:SI
      REG_EQUAL: `*.LANCHOR0'
    9 r132:SI=r121:SI<<0x2
      REG_DEAD: r121:SI
   10 r134:SI=r133:SI+r132:SI
      REG_DEAD: r133:SI
      REG_DEAD: r132:SI
   11 r119:SI=[r134:SI]
   12 r126:CC=cmp(r119:SI,0x0)
   13 pc={(r126:CC==0x0)?L16:pc}
      REG_DEAD: r126:CC
      REG_BR_PROB: 0x3f6
   14 NOTE_INSN_BASIC_BLOCK
   15 [r119:SI]=r120:SI
      REG_DEAD: r119:SI
L16:
   17 NOTE_INSN_BASIC_BLOCK
   22 [r134:SI]=r120:SI
      REG_DEAD: r134:SI
      REG_DEAD: r120:SI
======================

GCC as-is (unpatched) does not fully optimize the test case.

Can you think of a way to expose this redundancy at the tree level
somehow?  It'd be nice if we could just CSE the address of
"tails[chain]" on GIMPLE...

Gr.
Steven



More information about the Gcc-patches mailing list