Bug 105477 - RISC-V: Regression: Useless moves in conditional select return
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 12.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, ra
Depends on:
Blocks:
 
Reported: 2022-05-04 10:46 UTC by Christoph Müllner
Modified: 2024-07-01 18:28 UTC
CC List: 3 users

See Also:
Host:
Target: riscv
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Christoph Müllner 2022-05-04 10:46:12 UTC
Commit 3a7ba8fd triggers a regression on RISC-V: two useless move instructions are generated.

Test code:
"""
long test(long a, long b, long c)
{
  return (!c ? a : b);
}
"""

GCC 10.2.0, GCC 11, and upstream/master before 3a7ba8fd generate (rv64gc, -O3):
test:
        beq     a2,zero,.L2
        mv      a0,a1
.L2:
        ret

Current upstream/master generates:
<test>:
   0:   87aa                    mv      a5,a0
   2:   852e                    mv      a0,a1
   4:   e211                    bnez    a2,8 <.L2>
   6:   853e                    mv      a0,a5
<.L2>:
   8:   8082                    ret

This might be an issue in the ifcvt code (in combination with the RISC-V backend), or something the RISC-V backend needs to improve.

Some context to this issue:
* The mentioned change (3a7ba8fd) is not problematic in itself; it fixes PR104960.
* PR105314 reports a similar issue that is triggered by the same change.
Comment 1 Richard Biener 2022-05-04 10:58:42 UTC
It could also be RTL expansion of the condition generating slightly different RTL IL.
Comment 2 Christoph Müllner 2022-05-09 10:17:27 UTC
I've analysed this issue a bit more and want to share my observations.

I mention commit 3a7ba8fd here again as the trigger of this issue, but not
as the underlying cause (which I have not fully understood so far).

When looking into the dump files, the input to the sink2 pass (output of forwprop4) is:
long int test (long int a, long int b, long int c)
{ 
  long int iftmp.0_1;
  <bb 2> [local count: 1073741824]:
  if (c_2(D) == 0)
    goto <bb 4>; [50.00%] 
  else                    
    goto <bb 3>; [50.00%] 
  <bb 3> [local count: 536870912]:
  <bb 4> [local count: 1073741824]:
  # iftmp.0_1 = PHI <a_4(D)(2), b_3(D)(3)>
  return iftmp.0_1;
}

Before commit 3a7ba8fd, this was not changed any further up to the expand pass.
Since commit 3a7ba8fd, the sink2 pass transforms it into the following:
long int test (long int a, long int b, long int c)
{
  long int iftmp.0_1;
  <bb 2> [local count: 1073741824]:
  if (c_2(D) == 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 5>; [50.00%]
  <bb 5> [local count: 536870912]:
  goto <bb 4>; [100.00%]
  <bb 3> [local count: 536870912]:
  <bb 4> [local count: 1073741824]:
  # iftmp.0_1 = PHI <a_4(D)(3), b_3(D)(5)>
  return iftmp.0_1;
}
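
As a reading aid, the two CFG shapes above can be written as C with explicit gotos. This is only a hand-written sketch, not compiler output: the function names are made up, and the assignments to iftmp stand in for the PHI arguments on each incoming edge of bb 4.

```c
#include <assert.h>

/* Shape that reaches expand before 3a7ba8fd: the c == 0 edge goes
   straight from bb 2 to the join block bb 4. */
long cfg_before_sink2(long a, long b, long c)
{
  long iftmp;
  if (c == 0) {
    iftmp = a;   /* PHI argument a_4(D) on edge 2 -> 4 */
    goto bb4;
  }
  /* bb 3: empty block */
  iftmp = b;     /* PHI argument b_3(D) on edge 3 -> 4 */
bb4:
  return iftmp;
}

/* Shape after sink2 since 3a7ba8fd: both PHI arguments now arrive
   through an empty block, and bb 5 ends in an explicit goto. */
long cfg_after_sink2(long a, long b, long c)
{
  long iftmp;
  if (c == 0)
    goto bb3;
  /* bb 5: empty block, jumps to the join */
  iftmp = b;     /* PHI argument b_3(D) on edge 5 -> 4 */
  goto bb4;
bb3:
  iftmp = a;     /* PHI argument a_4(D) on edge 3 -> 4 */
bb4:
  return iftmp;
}

/* Hypothetical helper: both shapes compute the same select. */
void check_cfg_shapes(void)
{
  for (long c = -1; c <= 1; c++)
    assert(cfg_before_sink2(10, 20, c) == cfg_after_sink2(10, 20, c));
}
```

Both functions return the same value for every input; only the block layout and the fall-through edge differ, which is what the later passes react to.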

This has an impact on
* the expansion pass (for obvious reasons)
* the output of the combiner pass
* the result of the reload pass

Let's start with the behaviour before change 3a7ba8fd.
The expander generated the following output:
(insn 2 7 3 2 (set (reg/v:DI 73 [ a ])
        (reg:DI 10 a0 [ a ])) "pr105477.c":2:1 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:DI 74 [ b ])
        (reg:DI 11 a1 [ b ])) "pr105477.c":2:1 -1
     (nil))
(insn 4 3 5 2 (set (reg/v:DI 75 [ c ])
        (reg:DI 12 a2 [ c ])) "pr105477.c":2:1 -1
     (nil))
(jump_insn 9 5 10 2 (set (pc)
        (if_then_else (eq (reg/v:DI 75 [ c ])
                (const_int 0 [0]))
            (label_ref 11)
            (pc))) "pr105477.c":3:17 -1
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 11)
(insn 6 10 11 4 (set (reg/v:DI 73 [ a ])
        (reg/v:DI 74 [ b ])) "pr105477.c":3:17 -1
     (nil))
(code_label 11 6 12 5 2 (nil) [1 uses])
(insn 13 12 17 5 (set (reg:DI 72 [ <retval> ])
        (reg/v:DI 73 [ a ])) "pr105477.c":3:17 -1
     (nil))
(insn 17 13 18 5 (set (reg/i:DI 10 a0)
        (reg:DI 72 [ <retval> ])) "pr105477.c":4:1 -1
     (nil))
(insn 18 17 0 5 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The combiner then converted this to:
(insn 30 7 2 2 (set (reg:DI 77)
        (reg:DI 10 a0 [ a ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 10 a0 [ a ])
        (nil)))
(insn 2 30 31 2 (set (reg/v:DI 73 [ a ])
        (reg:DI 77)) "pr105477.c":2:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 77)
        (nil)))
(insn 31 2 3 2 (set (reg:DI 78)
        (reg:DI 11 a1 [ b ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 11 a1 [ b ])
        (nil)))
(insn 3 31 32 2 (set (reg/v:DI 74 [ b ])
        (reg:DI 78)) "pr105477.c":2:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 78)
        (nil)))
(insn 32 3 4 2 (set (reg:DI 79)
        (reg:DI 12 a2 [ c ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 12 a2 [ c ])
        (nil)))
(jump_insn 9 5 10 2 (set (pc)
        (if_then_else (eq (reg:DI 79)
                (const_int 0 [0]))
            (label_ref:DI 11)
            (pc))) "pr105477.c":3:17 182 {*branchdi}
     (expr_list:REG_DEAD (reg:DI 79)
        (int_list:REG_BR_PROB 536870916 (nil)))
 -> 11)
(insn 6 10 11 3 (set (reg/v:DI 73 [ a ])
        (reg/v:DI 74 [ b ])) "pr105477.c":3:17 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 74 [ b ])
        (nil)))
(code_label 11 6 12 4 2 (nil) [1 uses])
(insn 17 12 18 4 (set (reg/i:DI 10 a0)
        (reg/v:DI 73 [ a ])) "pr105477.c":4:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 73 [ a ])
        (nil)))
(insn 18 17 0 4 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The reload pass could simplify this to:
(jump_insn 9 5 10 2 (set (pc)
        (if_then_else (eq (reg:DI 12 a2 [79])
                (const_int 0 [0]))
            (label_ref:DI 11)
            (pc))) "pr105477.c":3:17 182 {*branchdi}
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 11)
(insn 6 10 11 3 (set (reg/v:DI 10 a0 [orig:73 a ] [73])
        (reg/v:DI 11 a1 [orig:74 b ] [74])) "pr105477.c":3:17 135 {*movdi_64bit}
     (nil))
(code_label 11 6 12 4 2 (nil) [1 uses])
(insn 18 12 36 4 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The resulting assembly did not contain the two useless move instructions.

Since commit 3a7ba8fd the expanded code looks like this:
(insn 2 7 3 2 (set (reg/v:DI 73 [ a ])
        (reg:DI 10 a0 [ a ])) "pr105477.c":2:1 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:DI 74 [ b ])
        (reg:DI 11 a1 [ b ])) "pr105477.c":2:1 -1
     (nil))
(insn 4 3 5 2 (set (reg/v:DI 75 [ c ])
        (reg:DI 12 a2 [ c ])) "pr105477.c":2:1 -1
     (nil))
(jump_insn 9 5 10 2 (set (pc)
        (if_then_else (ne (reg/v:DI 75 [ c ])
                (const_int 0 [0]))
            (label_ref 11)
            (pc))) "pr105477.c":3:17 -1
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 11)
(insn 6 10 11 4 (set (reg/v:DI 74 [ b ])
        (reg/v:DI 73 [ a ])) "pr105477.c":3:17 -1
     (nil))
(code_label 11 6 12 5 2 (nil) [1 uses])
(insn 13 12 17 5 (set (reg:DI 72 [ <retval> ])
        (reg/v:DI 74 [ b ])) "pr105477.c":3:17 -1
     (nil))
(insn 17 13 18 5 (set (reg/i:DI 10 a0)
        (reg:DI 72 [ <retval> ])) "pr105477.c":4:1 -1
     (nil))
(insn 18 17 0 5 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The code stays like this until the combiner later changes it to:
(insn 24 7 2 2 (set (reg:DI 77)
        (reg:DI 10 a0 [ a ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 10 a0 [ a ])
        (nil)))
(insn 2 24 25 2 (set (reg/v:DI 73 [ a ])
        (reg:DI 77)) "pr105477.c":2:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 77)
        (nil)))
(insn 25 2 3 2 (set (reg:DI 78)
        (reg:DI 11 a1 [ b ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 11 a1 [ b ])
        (nil)))
(insn 3 25 26 2 (set (reg/v:DI 74 [ b ])
        (reg:DI 78)) "pr105477.c":2:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 78)
        (nil)))
(insn 26 3 4 2 (set (reg:DI 79)
        (reg:DI 12 a2 [ c ])) "pr105477.c":2:1 -1
     (expr_list:REG_DEAD (reg:DI 12 a2 [ c ])
        (nil)))
(jump_insn 9 5 10 2 (set (pc)
        (if_then_else (ne (reg:DI 79)
                (const_int 0 [0]))
            (label_ref 11)
            (pc))) "pr105477.c":3:17 182 {*branchdi}
     (expr_list:REG_DEAD (reg:DI 79)
        (int_list:REG_BR_PROB 536870916 (nil)))
 -> 11)
(insn 6 10 11 3 (set (reg/v:DI 74 [ b ])
        (reg/v:DI 73 [ a ])) "pr105477.c":3:17 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 73 [ a ])
        (nil)))
(code_label 11 6 12 4 2 (nil) [1 uses])
(insn 17 12 18 4 (set (reg/i:DI 10 a0)
        (reg/v:DI 74 [ b ])) "pr105477.c":4:1 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/v:DI 74 [ b ])
        (nil)))
(insn 18 17 0 4 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The reload pass can then simplify this to the following:
(insn 2 5 3 2 (set (reg/v:DI 15 a5 [orig:73 a ] [73])
        (reg:DI 10 a0 [77])) "pr105477.c":2:1 135 {*movdi_64bit}
     (nil))
(insn 3 2 9 2 (set (reg/v:DI 10 a0 [orig:74 b ] [74])
        (reg:DI 11 a1 [78])) "pr105477.c":2:1 135 {*movdi_64bit}
     (nil))
(jump_insn 9 3 10 2 (set (pc)
        (if_then_else (ne (reg:DI 12 a2 [79])
                (const_int 0 [0]))
            (label_ref 11)
            (pc))) "pr105477.c":3:17 182 {*branchdi}
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 11)
(insn 6 10 11 3 (set (reg/v:DI 10 a0 [orig:74 b ] [74])
        (reg/v:DI 15 a5 [orig:73 a ] [73])) "pr105477.c":3:17 135 {*movdi_64bit}
     (nil))
(code_label 11 6 12 4 2 (nil) [1 uses])
(insn 18 12 30 4 (use (reg/i:DI 10 a0)) "pr105477.c":4:1 -1
     (nil))

The remaining two set statements above will become move instructions.
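
To make the difference between the two expansions easier to see, here is my reading of the RTL above rendered back into C. This is a sketch under the assumption that I mapped the insns correctly; the function names are illustrative only.

```c
#include <assert.h>

/* Expansion before 3a7ba8fd: the result is accumulated in the pseudo
   for a (reg 73), which arrives in a0 and is also returned in a0. */
long expand_before(long a, long b, long c)
{
  if (c != 0)    /* jump_insn 9 (eq) skips insn 6 when c == 0 */
    a = b;       /* insn 6: reg 73 (a) = reg 74 (b) */
  return a;      /* insns 13 and 17: return value taken from reg 73 */
}

/* Expansion since 3a7ba8fd: the result is accumulated in the pseudo
   for b (reg 74), which arrives in a1 but must be returned in a0. */
long expand_after(long a, long b, long c)
{
  if (c == 0)    /* jump_insn 9 (ne) skips insn 6 when c != 0 */
    b = a;       /* insn 6: reg 74 (b) = reg 73 (a) */
  return b;      /* insns 13 and 17: return value taken from reg 74 */
}

/* Hypothetical helper: both expansions implement !c ? a : b. */
void check_expansions(void)
{
  for (long c = -1; c <= 1; c++) {
    long expected = !c ? 10 : 20;
    assert(expand_before(10, 20, c) == expected);
    assert(expand_after(10, 20, c) == expected);
  }
}
```

In the second form the returned variable is b, so reg 74 is assigned to a0 while a0 still holds the incoming a. That matches the reload output above, where a is first copied to a5 and b is then copied into a0.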
Comment 3 Dimitar Dimitrov 2024-07-01 18:28:02 UTC
Commit r15-1579-g792f97b44ffc5e improves the generated code:

test:
        bne     a2,zero,.L2
        mv      a1,a0
.L2:
        mv      a0,a1
        ret
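
For comparison, the three assembly sequences quoted in this report can be modelled in C to count the executed mv instructions per path. This is a hand-written model, not generated by any tool; the struct and function names are made up for illustration, and the register variables simply mirror a0, a1, a2 and a5.

```c
#include <assert.h>

struct run { long result; int moves; };

/* GCC 10.2.0 / GCC 11: beq a2,zero,.L2 ; mv a0,a1 ; .L2: ret */
struct run seq_gcc11(long a0, long a1, long a2)
{
  int moves = 0;
  if (a2 != 0) {            /* beq not taken */
    a0 = a1; moves++;       /* mv a0,a1 */
  }
  struct run r = { a0, moves };
  return r;
}

/* After 3a7ba8fd: mv a5,a0 ; mv a0,a1 ; bnez a2,.L2 ; mv a0,a5 ; .L2: ret */
struct run seq_regressed(long a0, long a1, long a2)
{
  int moves = 0;
  long a5 = a0; moves++;    /* mv a5,a0 */
  a0 = a1; moves++;         /* mv a0,a1 */
  if (a2 == 0) {            /* bnez not taken */
    a0 = a5; moves++;       /* mv a0,a5 */
  }
  struct run r = { a0, moves };
  return r;
}

/* After r15-1579: bne a2,zero,.L2 ; mv a1,a0 ; .L2: mv a0,a1 ; ret */
struct run seq_r15_1579(long a0, long a1, long a2)
{
  int moves = 0;
  if (a2 == 0) {            /* bne not taken */
    a1 = a0; moves++;       /* mv a1,a0 */
  }
  a0 = a1; moves++;         /* .L2: mv a0,a1 */
  struct run r = { a0, moves };
  return r;
}
```

Counting by hand with this model, the GCC 11 sequence executes 0 moves for c == 0 and 1 move otherwise, the regressed sequence 3 and 2, and the r15-1579 sequence 2 and 1. So the newer code is one move shorter on each path than the regressed code, but still one move longer than the GCC 11 output.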