This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Redundant load after store elimination in GCSE


Currently the GCSE in gcc doesn't consider a load that follows
a store to the same memory location as being redundant.
This patch adds the ability for GCSE to consider such loads
as redundant and replaces them with register moves
from the corresponding stores. This is implemented in GCSE/PRE
so it applies to both full and partial redundancies.
A new flag '-fgcse-las' is added to gcc to enable/disable
this ability in GCSE. The load-after-store redundancy elimination
is performed in GCSE, so it requires the latter to be enabled.
When GCSE is enabled (-O2 or -fgcse) -fgcse-las is enabled by default.

There is a previous discussion on load-after-store redundancy elimination
in the gcc mailing list: (http://gcc.gnu.org/ml/gcc/2003-05/msg00368.html)

Future enhancement: redundant store elimination - after successfully
eliminating (fully) redundant load-after-store cases, it may be possible
to eliminate the corresponding stores if they become redundant as well.

Testing:
Bootstrap on powerpc-apple-darwin6.4
Regression passed on powerpc-apple-darwin6.4 for:
 gcc, g++, objc, g77, libstdc++
For libjava we got a bit different results when the patch was applied:
two more tests passed, one less was reported unexpectedly failed,
and one less was reported untested.

Testcase:

int y,z;
int f(int a, int b, int c)
{
  int x;
  if (a < 100)
  {
    y = (b + c) ;
  }
  else
  {
    y = (b - c) ;
  }
  if (b < 0)
    x = y * a;
  else
    x = y + a;
  z = x * y;
  return x;
}

In the above testcase, the load from y in calculating z was removed
when the redundant load-after-store elimination (gcse-las) patch was used.
Following is the codegen with and without the gcse-las (on powerpc;
the arrows were added by us to point at load-after-store redundancies):

without redundant load-after-store elimination:
-----------------------------------------------
(using -O3 -fno-gcse-las -mdynamic-no-pic --param max-gcse-passes=3)

_f:
        cmpwi cr0,r3,99
        lis r9,ha16(L_y$non_lazy_ptr)
        subf r2,r5,r4
        bgt- cr0,L2
        cmpwi cr0,r4,0
        lwz r9,lo16(L_y$non_lazy_ptr)(r9)
        add r2,r4,r5
->      stw r2,0(r9)
        blt- cr0,L8
L4:
->      lwz r2,0(r9)
        lis r7,ha16(L_z$non_lazy_ptr)
        lwz r4,lo16(L_z$non_lazy_ptr)(r7)
        add r3,r2,r3
        mullw r6,r3,r2
        stw r6,0(r4)
        blr
L2:
        cmpwi cr0,r4,0
        lis r5,ha16(L_y$non_lazy_ptr)
        lwz r9,lo16(L_y$non_lazy_ptr)(r5)
->      stw r2,0(r9)
        bge+ cr0,L4
L8:
->      lwz r2,0(r9)
        lis r7,ha16(L_z$non_lazy_ptr)
        lwz r4,lo16(L_z$non_lazy_ptr)(r7)
        mullw r3,r2,r3
        mullw r6,r3,r2
        stw r6,0(r4)
        blr

with redundant load-after-store elimination:
--------------------------------------------
(using -O3 -mdynamic-no-pic --param max-gcse-passes=3)

_f:
        cmpwi cr0,r3,99
        mr r9,r3
        lis r2,ha16(L_y$non_lazy_ptr)
        subf r0,r5,r4
        bgt- cr0,L2
        lwz r2,lo16(L_y$non_lazy_ptr)(r2)
        add r0,r4,r5
L6:
        cmpwi cr0,r4,0
        stw r0,0(r2)
        mullw r3,r0,r9
        blt- cr0,L5
        add r3,r0,r9
L5:
        lis r6,ha16(L_z$non_lazy_ptr)
        mullw r5,r3,r0
        lwz r4,lo16(L_z$non_lazy_ptr)(r6)
        stw r5,0(r4)
        blr
L2:
        lis r3,ha16(L_y$non_lazy_ptr) [fully redundant ??]
        lwz r2,lo16(L_y$non_lazy_ptr)(r3)
        b L6


ChangeLog entry
---------------
2003-09-09  Mostafa Hagog  <mustafa@il.ibm.com>

      * common.opt : Add description of the new -fgcse-las flag.

      * flags.h : Declaration of global variable flag_gcse_las.

      * gcse.c:
      (hash_scan_set): Handle the case of store expression and insert
      the memory expression to the hash table, this way we make it possible
      to discover redundant loads after stores and remove them.
      (pre_insert_copy_insn): moved the call to update_ld_motion_stores, to
      pre_insert_copies, it is not the correct place to call it after
adding
      stores to be in the available expression hash table.
      (pre_insert_copies): added the call to update_ld_motion_stores when
      one or more copies were inserted.

      * opts.c:
      (common_handle_option): Handle the -fgcse-las flag.

      * toplev.c: Initialization of flag_gcse_las.


Mostafa.

(See attached file: load_adter_sotre.patch)

Attachment: load_adter_sotre.patch
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]