Bug 43864 - Same basic blocks should be merged
Summary: Same basic blocks should be merged
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: Tom de Vries
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-23 07:27 UTC by Carrot
Modified: 2011-09-27 19:26 UTC (History)
6 users (show)

See Also:
Host: i686-linux
Target: arm-eabi
Build: i686-linux
Known to work:
Known to fail:
Last reconfirmed:


Attachments
test case (209 bytes, text/x-csrc)
2010-04-23 07:28 UTC, Carrot
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carrot 2010-04-23 07:27:48 UTC
Following is a simple example:

#include <stdio.h>
void foo (char*, FILE*);
char* hprofStartupp(char *outputFileName, char *ctx)
{
    char fileName[1000];
    FILE *fp;
    sprintf(fileName, outputFileName);
    if (access(fileName, 1) == 0) {
        free(ctx);               //A
        return 0;                //A
    }

    fp = fopen(fileName, 0);
    if (fp == 0) {
        free(ctx);               //B
        return 0;                //B
    }

    foo(outputFileName, fp);

    return ctx;
}

Basic block A and basic block B are exact same. They do some cleanup work and exit the function. This pattern is quite common in large software. GCC failed to merge them in this test case. With options -march=armv7-a -mthumb -Os, gcc generates:

        ...
        bl      access
        mov     r7, r0
        cbnz    r0, .L2
        mov     r0, r4    //C
        mov     r4, r7    //C
        bl      free      //C
        b       .L3       //C
.L2:
        mov     r0, sp
        movs    r1, #0
        bl      fopen
        mov     r5, r0
        cbnz    r0, .L4
        mov     r0, r4    //D
        mov     r4, r5    //D
        bl      free      //D
        b       .L3       //D
.L4:
        ...

Here basic block C corresponds to basic block A, and basic block D corresponds basic block B. They can be merged as following:

        ...
        bl      access
        mov     r5, r0
        cbz     r0, .L9
.L2:
        mov     r0, sp
        movs    r1, #0
        bl      fopen
        mov     r5, r0
        cbnz    r0, .L4
.L9:
        mov     r0, r4    
        mov     r4, r5   
        bl      free     
        b       .L3      
.L4:
        ...

The rtl cfg optimization can do some merge optimization. But it often fails due to difference of register numbers. We may need to do this at tree level.
Comment 1 Carrot 2010-04-23 07:28:34 UTC
Created attachment 20469 [details]
test case
Comment 2 Steven Bosscher 2010-04-23 07:38:37 UTC

*** This bug has been marked as a duplicate of 20070 ***
Comment 3 Tom de Vries 2011-09-27 16:10:53 UTC
Author: vries
Date: Tue Sep 27 16:10:42 2011
New Revision: 179275

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179275
Log:
2011-09-27  Tom de Vries  <tom@codesourcery.com>

	PR middle-end/43864
	* tree-ssa-tail-merge.c: New file.
	(struct same_succ_def): Define.
	(same_succ, const_same_succ): New typedef.
	(struct bb_cluster_def): Define.
	(bb_cluster, const_bb_cluster): New typedef.
	(struct aux_bb_info): Define.
	(BB_SIZE, BB_SAME_SUCC, BB_CLUSTER, BB_VOP_AT_EXIT): Define.
	(gvn_uses_equal): New function.
	(same_succ_print, same_succ_print_traverse, update_dep_bb)
	(stmt_update_dep_bb, local_def, same_succ_hash)
	(inverse_flags, same_succ_equal, same_succ_alloc, same_succ_delete)
	(same_succ_reset): New function.
	(same_succ_htab, same_succ_edge_flags)
	(deleted_bbs, deleted_bb_preds): New var.
	(debug_same_succ): New function.
	(worklist): New var.
	(print_worklist, add_to_worklist, find_same_succ_bb, find_same_succ)
	(init_worklist, delete_worklist, delete_basic_block_same_succ)
	(same_succ_flush_bbs, purge_bbs, update_worklist): New function.
	(print_cluster, debug_cluster, update_rep_bb)
	(add_bb_to_cluster, new_cluster, delete_cluster): New function.
	(all_clusters): New var.
	(alloc_cluster_vectors, reset_cluster_vectors, delete_cluster_vectors)
	(merge_clusters, set_cluster): New function.
	(gimple_equal_p, gsi_advance_bw_nondebug_nonlocal, find_duplicate)
	(same_phi_alternatives_1, same_phi_alternatives, bb_has_non_vop_phi)
	(deps_ok_for_redirect_from_bb_to_bb, deps_ok_for_redirect)
	(find_clusters_1, find_clusters): New function.
	(update_vuses, vop_phi, vop_at_entry, replace_block_by): New function.
	(update_bbs): New var.
	(apply_clusters): New function.
	(update_debug_stmt, update_debug_stmts): New function.
	(tail_merge_optimize): New function.
	tree-pass.h (tail_merge_optimize): Declare.
	* tree-ssa-pre.c (execute_pre): Use tail_merge_optimize.
	* Makefile.in (OBJS-common): Add tree-ssa-tail-merge.o.
	(tree-ssa-tail-merge.o): New rule.
	* opts.c (default_options_table): Set OPT_ftree_tail_merge by default at
	OPT_LEVELS_2_PLUS.
	* tree-ssa-sccvn.c (vn_valueize): Move to ...
	* tree-ssa-sccvn.h (vn_valueize): Here.
	* timevar.def (TV_TREE_TAIL_MERGE): New timevar.
	* common.opt (ftree-tail-merge): New switch.
	* params.def (PARAM_MAX_TAIL_MERGE_COMPARISONS)
	(PARAM_MAX_TAIL_MERGE_ITERATIONS): New parameter.
	* doc/invoke.texi (Optimization Options, -O2): Add -ftree-tail-merge.
	(-ftree-tail-merge, max-tail-merge-comparisons)
	(max-tail-merge-iterations): New item.

Added:
    trunk/gcc/tree-ssa-tail-merge.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/common.opt
    trunk/gcc/doc/invoke.texi
    trunk/gcc/opts.c
    trunk/gcc/params.def
    trunk/gcc/timevar.def
    trunk/gcc/tree-pass.h
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c
    trunk/gcc/tree-ssa-sccvn.h
Comment 4 Tom de Vries 2011-09-27 16:12:48 UTC
Author: vries
Date: Tue Sep 27 16:12:35 2011
New Revision: 179276

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179276
Log:
2011-09-27  Tom de Vries  <tom@codesourcery.com>

	PR middle-end/43864
	* gcc.dg/fold-compare-2.c (dg-options): Add -fno-tree-tail-merge.
	* gcc/testsuite/gcc.dg/uninit-pred-2_c.c: Same.
	* gcc.dg/pr43864.c: New test.
	* gcc.dg/pr43864-2.c: Same.
	* gcc.dg/pr43864-3.c: Same.
	* gcc.dg/pr43864-4.c: Same.

Added:
    trunk/gcc/testsuite/gcc.dg/pr43864-2.c
    trunk/gcc/testsuite/gcc.dg/pr43864-3.c
    trunk/gcc/testsuite/gcc.dg/pr43864-4.c
    trunk/gcc/testsuite/gcc.dg/pr43864.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/fold-compare-2.c
    trunk/gcc/testsuite/gcc.dg/uninit-pred-2_c.c
Comment 5 Tom de Vries 2011-09-27 19:26:44 UTC
Patch and test-cases checked in, marking fixed.