This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][RFC] final-value replacement from DCE
On Wed, 29 May 2019, Jakub Jelinek wrote:
> On Wed, May 29, 2019 at 09:57:50AM -0600, Jeff Law wrote:
> > > FAIL: gcc.dg/builtin-object-size-1.c execution test
> > > FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort
>
> I admit I haven't looked at the details here, but wonder if the optimization
> couldn't be done only in the DCE passes post IPA, otherwise we risk
> behavior changes for __builtin_object_size.
We can do that - the first CD-DCE pass is in the loop pipeline though,
_after_ final value replacement. Looking at the testsuite fallout
it's also clear that doing loop-header copying before final-value
replacement results in better code for some testcases.
So I'm trying to turn the first DCE after loop-header copying into
a CD-DCE run, and not doing final value replacement before IPA.
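For illustration only, here is a minimal sketch of the kind of loop this is about. It is not taken from the patch or the testsuite, and the names count_up and count_up_replaced are hypothetical. Once the use of i after the loop is replaced by its final value n, nothing computed in the loop is needed any more. The assumption here is that a plain DCE run keeps the loop because it treats control statements such as the exit test as necessary, while a CD-DCE run can delete it when the loop is known to be finite.

```c
/* Hypothetical example; not part of the patch or the GCC testsuite.  */

/* Before: the loop only computes i, whose value after the loop is n.  */
unsigned
count_up (unsigned n)
{
  unsigned i;
  for (i = 0; i < n; ++i)
    ;
  return i;
}

/* After final value replacement and removal of the now-dead loop,
   the function is expected to be equivalent to this.  */
unsigned
count_up_replaced (unsigned n)
{
  return n;
}
```

Both functions return the same value for every n; whether GCC actually reduces the first to the second depends on the optimization level and on the pass order discussed above.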
The following does that independently, bootstrapped & tested
on x86_64-unknown-linux-gnu. It will leave
FAIL: gcc.dg/tree-ssa/pr68619-4.c scan-tree-dump optimized "PHI <.*, 39"
because the testcase is totally unclear on who is supposed to
propagate 39 and why. With CD-DCE there's one PRE opportunity
less because, well, a value is no longer partially redundant.
I hope I caught all dce/cddce dump issues. It also seems to me
that unifying dce and cd-dce may be a useful cleanup, so that we
just have
NEXT_PASS (pass_dce, true /* perform control-dependent DCE */)
but not for today...
I'm not going to apply this separately, only eventually together
with the rest.
Richard.
2019-05-31 Richard Biener <rguenther@suse.de>
PR tree-optimization/68619
* passes.def (pass_dce after CH): Turn into pass_cd_dce.
* g++.dg/tree-ssa/copyprop-1.C: Adjust dump scanned.
* gcc.dg/tree-ssa/20030709-2.c: Likewise.
* gcc.dg/tree-ssa/20030808-1.c: Likewise.
* gcc.dg/tree-ssa/20040729-1.c: Likewise.
* gcc.dg/tree-ssa/loop-36.c: Likewise.
* gcc.dg/tree-ssa/ssa-dce-1.c: Likewise.
* gcc.dg/tree-ssa/ssa-dce-2.c: Likewise.
Index: gcc/passes.def
===================================================================
--- gcc/passes.def (revision 271802)
+++ gcc/passes.def (working copy)
@@ -231,7 +231,7 @@ along with GCC; see the file COPYING3.
NEXT_PASS (pass_isolate_erroneous_paths);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
- NEXT_PASS (pass_dce);
+ NEXT_PASS (pass_cd_dce);
NEXT_PASS (pass_forwprop);
NEXT_PASS (pass_phiopt, false /* early_p */);
NEXT_PASS (pass_ccp, true /* nonzero_p */);
Index: gcc/testsuite/g++.dg/tree-ssa/copyprop-1.C
===================================================================
--- gcc/testsuite/g++.dg/tree-ssa/copyprop-1.C (revision 271802)
+++ gcc/testsuite/g++.dg/tree-ssa/copyprop-1.C (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-dce3" } */
+/* { dg-options "-O -fdump-tree-cddce2" } */
/* Verify that we can eliminate the useless conversions to/from
const qualified pointer types
@@ -27,4 +27,4 @@ int foo(Object&o)
/* Remaining should be two loads. */
-/* { dg-final { scan-tree-dump-times " = \[^\n\]*;" 2 "dce3" } } */
+/* { dg-final { scan-tree-dump-times " = \[^\n\]*;" 2 "cddce2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-dce3" } */
+/* { dg-options "-O -fdump-tree-cddce2" } */
struct rtx_def;
typedef struct rtx_def *rtx;
@@ -42,13 +42,13 @@ get_alias_set (t)
/* There should be precisely one load of ->decl.rtl. If there is
more than, then the dominator optimizations failed. */
-/* { dg-final { scan-tree-dump-times "->decl\\.rtl" 1 "dce3"} } */
+/* { dg-final { scan-tree-dump-times "->decl\\.rtl" 1 "cddce2"} } */
/* There should be no loads of .rtmem since the complex return statement
is just "return 0". */
-/* { dg-final { scan-tree-dump-times ".rtmem" 0 "dce3"} } */
+/* { dg-final { scan-tree-dump-times ".rtmem" 0 "cddce2"} } */
/* There should be one IF statement (the complex return statement should
collapse down to a simple return 0 without any conditionals). */
-/* { dg-final { scan-tree-dump-times "if " 1 "dce3"} } */
+/* { dg-final { scan-tree-dump-times "if " 1 "cddce2"} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-cddce3" } */
+/* { dg-options "-O1 -fdump-tree-cddce4" } */
extern void abort (void);
@@ -33,8 +33,8 @@ delete_dead_jumptables ()
/* There should be no loads of ->code. If any exist, then we failed to
optimize away all the IF statements and the statements feeding
their conditions. */
-/* { dg-final { scan-tree-dump-times "->code" 0 "cddce3"} } */
+/* { dg-final { scan-tree-dump-times "->code" 0 "cddce4"} } */
/* There should be no IF statements. */
-/* { dg-final { scan-tree-dump-times "if " 0 "cddce3"} } */
+/* { dg-final { scan-tree-dump-times "if " 0 "cddce4"} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/20040729-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/20040729-1.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/20040729-1.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-dce3" } */
+/* { dg-options "-O1 -fdump-tree-cddce2" } */
int
foo ()
@@ -16,4 +16,4 @@ foo ()
compiler was mistakenly thinking that the statement had volatile
operands. But 'p' itself is not volatile and taking the address of
a volatile does not constitute a volatile operand. */
-/* { dg-final { scan-tree-dump-times "&x" 0 "dce3"} } */
+/* { dg-final { scan-tree-dump-times "&x" 0 "cddce2"} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-36.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-36.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-36.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dce3" } */
+/* { dg-options "-O2 -fdump-tree-cddce2" } */
struct X { float array[2]; };
@@ -18,4 +18,4 @@ float foobar () {
/* The temporary structure should have been promoted to registers
by FRE after the loops have been unrolled by the early unrolling pass. */
-/* { dg-final { scan-tree-dump-not "c\.array" "dce3" } } */
+/* { dg-final { scan-tree-dump-not "c\.array" "cddce2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-1.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-1.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-dce3" } */
+/* { dg-options "-O1 -fdump-tree-cddce2" } */
int t() __attribute__ ((const));
void
@@ -10,4 +10,4 @@ q()
i = t();
}
/* There should be no IF conditionals. */
-/* { dg-final { scan-tree-dump-times "if " 0 "dce3"} } */
+/* { dg-final { scan-tree-dump-times "if " 0 "cddce2"} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-2.c (revision 271802)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-2.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dce3" } */
+/* { dg-options "-O2 -fdump-tree-cddce2" } */
/* We should notice constantness of this function. */
static int __attribute__((noinline)) t(int a)
@@ -13,4 +13,4 @@ void q(void)
i = t(1);
}
/* There should be no IF conditionals. */
-/* { dg-final { scan-tree-dump-times "if " 0 "dce3"} } */
+/* { dg-final { scan-tree-dump-times "if " 0 "cddce2"} } */