I tried your patch. It did remove the redundant memory load. The resulting code is:

        push    {lr}
        ldr     r3, [r1]
        str     r3, [r0]
        mov     r2, r3          // M
        cmp     r3, #0
        bne     .L5
        b       .L3
        ldr     r3, [r3, #8]
        b       .L6
        ldr     r1, [r3, #4]
        cmp     r1, #0
        beq     .L4
        str     r2, [r0, #12]
        @ sp needed for prologue
        pop     {pc}

In pass ifcvt, it noticed that the only difference between the two stores is the
pseudo register number, and that there is no conflict between the two pseudo
registers, so it renamed one of them to match the other and performed
basic-block cross jumping on them earlier. Pass cse2 (dump file
iterate.c.161r.cse2) then detected the redundant load and removed it.
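
To illustrate the idea at source level, here is a minimal C sketch. It is not
the test case from this report; the struct and function names are made up for
illustration, and the real transformation happens on RTL, not on C source.

```c
/* Hypothetical illustration only: names and layout are assumptions. */
struct node {
    int tag;
    int val;
};

/* Before: both arms end with the same load and store. They differ only in
   which temporary (standing in for a pseudo register) holds the value. */
static void store_before(struct node *dst, const struct node *src, int cond)
{
    if (cond) {
        dst->tag = 1;
        int t1 = src->val;
        dst->val = t1;
    } else {
        dst->tag = 2;
        int t2 = src->val;   /* same load, different temporary */
        dst->val = t2;
    }
}

/* After: with t2 renamed to t1 the two tails are identical, so they can be
   merged into one shared tail (cross jumping). Only one load remains. */
static void store_after(struct node *dst, const struct node *src, int cond)
{
    if (cond)
        dst->tag = 1;
    else
        dst->tag = 2;

    int t1 = src->val;       /* shared tail */
    dst->val = t1;
}
```

Both functions leave dst in the same state for any input; the second form just
makes the shared tail explicit.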

But it introduced another redundant move instruction, marked M above. At the
place where r2 is used, r3 still contains the same value as r2, so we could use
r3 there instead. I think this is a separate problem.



