Bug 42835 - Missed merging common code sequence at the end of two basic blocks
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 4.5.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Blocks: 16996
Reported: 2010-01-21 21:01 UTC by Carrot
Modified: 2011-02-01 01:10 UTC

See Also:
Target: arm-eabi
Known to work:
Known to fail:
Last reconfirmed: 2010-01-21 21:53:50


Description Carrot 2010-01-21 21:01:18 UTC
Compile the following code with options -march=armv7-a -mthumb -Os:

int foo(int *p, int i)
{
      return( (i < 0 && *p == 1)
           || (i > 0 && *p == 2) );
}

GCC generates:

        cmp     r1, #0
        bge     .L2
        ldr     r0, [r0, #0]
        cmp     r0, #1
        ite     ne         //  A
        movne   r0, #0     //  B
        moveq   r0, #1     //  C
        b       .L3
.L2:
        it      eq
        moveq   r0, r1
        beq     .L3
        ldr     r0, [r0, #0]
        cmp     r0, #2
        ite     ne         // D
        movne   r0, #0     // E
        moveq   r0, #1     // F
.L3:
        bx      lr

Instructions A, B and C are the same as D, E and F. Ideally A, B and C could be removed, so that the following code is generated:

        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r1, #0
        bge     .L2
        ldr     r0, [r0, #0]
        cmp     r0, #1
        b       .L3
        it      eq
        moveq   r0, r1
        beq     .L3
        ldr     r0, [r0, #0]
        cmp     r0, #2
        ite     ne
        movne   r0, #0
        moveq   r0, #1
        bx      lr
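
At the source level, the desired code shape corresponds roughly to the C sketch below: each branch only evaluates its own comparison, and both then fall into one shared tail that turns the result into 0 or 1. The name foo_shared_tail is hypothetical and used only for illustration; it does not come from GCC or its test suite.

```c
/* Original test case from this report. */
int foo(int *p, int i)
{
    return ((i < 0 && *p == 1)
         || (i > 0 && *p == 2));
}

/* Hypothetical hand-written equivalent (an illustration, not GCC output):
   the two comparisons stay separate, the 0/1 materialization is shared. */
int foo_shared_tail(int *p, int i)
{
    int eq;

    if (i < 0)
        eq = (*p == 1);      /* corresponds to "cmp r0, #1" */
    else if (i > 0)
        eq = (*p == 2);      /* corresponds to "cmp r0, #2" */
    else
        return 0;            /* i == 0: neither condition can hold */

    return eq ? 1 : 0;       /* shared tail, like the single ite/movne/moveq */
}
```

Both functions return the same value for every input; only the second makes the common tail explicit.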

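This should be handled by try_crossjump_bb. However, a single RTL insn is used to represent both the compare and the following IT block, as shown below. Because the compared constant (1 vs. 2) is part of that insn, the two sequences look different to the cross-jumping code.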

(insn:TI 24 22 26 src/t1.c:3 (parallel [
            (set (reg:SI 0 r0 [orig:133 D.2006 ] [133])
                (eq:SI (reg:SI 0 r0 [145])
                    (const_int 2 [0x2])))
            (clobber (reg:CC 24 cc))
        ]) 682 {*thumb2_compare_scc} (expr_list:REG_UNUSED (reg:CC 24 cc)

(insn:TI 13 11 70 src/t1.c:3 (parallel [
            (set (reg:SI 0 r0 [orig:133 D.2006 ] [133])
                (eq:SI (reg:SI 0 r0 [142])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 24 cc))
        ]) 682 {*thumb2_compare_scc} (expr_list:REG_UNUSED (reg:CC 24 cc)

Can we break this insn into two separate ones: a compare insn that sets the cc register, and a second insn for the IT block?
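
To illustrate why the split would help, here is a minimal sketch in C. It is not GCC code: insns are modelled as plain strings, and common_tail_len is a hypothetical helper that only loosely mimics what cross-jumping looks for (identical insns at the end of two blocks). The insn strings are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an RTL insn: just its printed form. */
typedef const char *insn_text;

/* Count how many trailing insns of two blocks are textually identical. */
size_t common_tail_len(const insn_text *a, size_t na,
                       const insn_text *b, size_t nb)
{
    size_t n = 0;
    while (n < na && n < nb
           && strcmp(a[na - 1 - n], b[nb - 1 - n]) == 0)
        n++;
    return n;
}

/* Fused form: compare and 0/1 materialization are one insn,
   so the constant makes the last insn of each block differ. */
const insn_text fused_a[] = { "ldr r0, [r0]", "r0 = (r0 == 1), clobber cc" };
const insn_text fused_b[] = { "ldr r0, [r0]", "r0 = (r0 == 2), clobber cc" };

/* Split form: the compare carries the constant, while the
   materialization insn is identical in both blocks. */
const insn_text split_a[] = { "ldr r0, [r0]", "cc = compare(r0, 1)",
                              "r0 = (cc eq) ? 1 : 0" };
const insn_text split_b[] = { "ldr r0, [r0]", "cc = compare(r0, 2)",
                              "r0 = (cc eq) ? 1 : 0" };
```

With these arrays, common_tail_len(fused_a, 2, fused_b, 2) returns 0, while common_tail_len(split_a, 3, split_b, 3) returns 1: only after the split is there a common tail to merge.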
Comment 1 Richard Biener 2010-01-22 11:23:08 UTC
Probably a missed cross-jumping opportunity
Comment 2 Steven Bosscher 2010-02-08 12:14:09 UTC
Richard, can we split thumb2_compare_scc? If so, when/how would you do this? (I'm thinking of a post-RA splitter, but perhaps it could be done earlier.)
Comment 3 Richard Earnshaw 2010-02-08 16:50:58 UTC
Best to do it post RA, so that we can issue the best sequences of insns.  I have some better sequences that could be generated for Thumb2 which would avoid the need for an IT instruction in many cases.
Comment 4 Bernd Schmidt 2010-07-02 16:23:10 UTC
Subject: Bug 42835

Author: bernds
Date: Fri Jul  2 16:22:33 2010
New Revision: 161725

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161725
	PR target/42835
	* config/arm/arm-modes.def (CC_NOTB): New mode.
	* config/arm/arm.c (get_arm_condition_code): Handle it.
	* config/arm/thumb2.md (thumb2_compare_scc): Delete pattern.
	* config/arm/arm.md (subsi3_compare0_c): New pattern.
	(compare_scc): Now a define_and_split.  Add a number of extra
	splitters before it.

	PR target/42835
	* gcc.target/arm/pr42835.c: New test.


Comment 5 Ramana Radhakrishnan 2011-02-01 01:04:59 UTC
Bernd, can this now be marked as fixed for 4.6.0?

With the options provided, trunk today generates the following code sequence:

	cmp	r1, #0
	bge	.L2
	ldr	r0, [r0, #0]
	cmp	r0, #1
	b	.L5
.L2:
	beq	.L4
	ldr	r0, [r0, #0]
	cmp	r0, #2
.L5:
	ite	ne
	movne	r0, #0
	moveq	r0, #1
	bx	lr
.L4:
	mov	r0, r1
	bx	lr
Comment 6 Bernd Schmidt 2011-02-01 01:10:36 UTC