[PATCH, PR43864] Gimple level duplicate block cleanup.

Tom de Vries vries@codesourcery.com
Wed Jun 8 09:49:00 GMT 2011

Hi Richard,

I have a patch for PR43864. The patch adds a gimple level duplicate block
cleanup. The patch has been bootstrapped and reg-tested on x86_64, and
reg-tested on ARM. The size impact on ARM for spec2000 is shown in the following
table (%, lower is better).

                     none            pic
                thumb1  thumb2  thumb1 thumb2
spec2000          99.9    99.9    99.8   99.8

PR43864 is currently marked as a duplicate of PR20070, but I'm not sure that the
optimizations proposed in PR20070 would fix this PR.

The problem in this PR is that when compiling with -O2, the example below should
only have one call to free. The original problem is formulated in terms of -Os,
but currently we generate one call to free with -Os, although still not the
smallest code possible. I'll show here the -O2 case, since that's similar to the
original PR.

#include <stdio.h>
void foo (char*, FILE*);
char* hprofStartupp(char *outputFileName, char *ctx)
    char fileName[1000];
    FILE *fp;
    sprintf(fileName, outputFileName);
    if (access(fileName, 1) == 0) {
        return 0;

    fp = fopen(fileName, 0);
    if (fp == 0) {
        return 0;

    foo(outputFileName, fp);

    return ctx;

AFAIU, there are 2 complementary methods of rtl optimizations proposed in PR20070.
- Merging 2 blocks which are identical expect for input registers, by using a
  conditional move to choose between the different input registers.
- Merging 2 blocks which have different local registers, by ignoring those

Blocks .L6 and.L7 have no difference in local registers, but they have a
difference in input registers: r3 and r1. Replacing the move to r5 by a
conditional move would probably be benificial in terms of size, but it's not
clear what condition the conditional move should be using. Calculating such a
condition would add in size and increase the execution path.

gcc -O2 -march=armv7-a -mthumb pr43864.c -S:
	push	{r4, r5, lr}
	mov	r4, r0
	sub	sp, sp, #1004
	mov	r5, r1
	mov	r0, sp
	mov	r1, r4
	bl	sprintf
	mov	r0, sp
	movs	r1, #1
	bl	access
	mov	r3, r0
	cbz	r0, .L6
	movs	r1, #0
	mov	r0, sp
	bl	fopen
	mov	r1, r0
	cbz	r0, .L7
	mov	r0, r4
	bl	foo
	mov	r0, r5
	add	sp, sp, #1004
	pop	{r4, r5, pc}
	mov	r0, r5
	mov	r5, r3
	bl	free
	b	.L3
	mov	r0, r5
	mov	r5, r1
	bl	free
	b	.L3

The proposed patch solved the problem by dealing with the 2 blocks at a level
when they are still identical: at gimple level. It detect that the 2 blocks are
identical, and removes one of them.

The following table shows the impact of the patch on the example in terms of
size for -march=armv7-a:

          without     with    delta
Os      :     108      104       -4
O2      :     120      104      -16
Os thumb:      68       64       -4
O2 thumb:      76       64      -12

The gain in size for -O2 is that of removing the entire block, plus the
replacement of 2 moves by a constant set, which also decreases the execution
path. The patch ensures optimal code for both -O2 and -Os.

By keeping track of equivalent definitions in the 2 blocks, we can ignore those
differences in comparison. Without this feature, we would only match blocks with
resultless operations, due the the ssa-nature of gimples.
For example, with this feature, we reduce the following function to its minimum
at gimple level, rather than at rtl level.

int f(int c, int b, int d)
  int r, e;

  if (c)
    r = b + d;
      e = b + d;
      r = e;

  return r;

;; Function f (f)

f (int c, int b, int d)
  int e;

<bb 2>:
  e_6 = b_3(D) + d_4(D);
  return e_6;


I'll send the patch with the testcases in a separate email.

OK for trunk?

- Tom

2011-06-08  Tom de Vries  <tom@codesourcery.com>

	PR middle-end/43864
	* tree-cfgcleanup.c (int_int_splay_lookup, int_int_splay_insert)
	(int_int_splay_node_contained_in, int_int_splay_contained_in)
	(equiv_lookup, equiv_insert, equiv_contained_in, equiv_init)
	(equiv_delete, gimple_base_equal_p, pt_solution_equal_p, gimple_equal_p)
	(bb_gimple_equal_p, update_debug_stmts, cleanup_duplicate_preds_1)
	(same_or_local_phi_alternatives, cleanup_duplicate_preds): New function.
	(cleanup_tree_cfg_bb): Use cleanup_duplicate_preds.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr43864.5.patch
Type: text/x-patch
Size: 16000 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20110608/2a56296d/attachment.bin>

More information about the Gcc-patches mailing list