This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/30213] New: Wrong code with optimized memset() (possible bug in RTL bbro optimizer)


The code in the attached testcase is taken from povray-3.6.1 and shows a nasty
regression, exposed by the new optimized string functions. Note that the
expanded RTL of the pov_calloc() function is OK, but the subsequent RTL
optimization pass (bbro) puts the BBs in the wrong order.

The dump below shows that %ebx is cleared in BB4 and dies in BB5. This dump is
from _.148r.rnreg:

--cut here--
;; Start of basic block 4, registers live: 0 [ax] 1 [dx] 4 [si] 5 [di] 6 [bp] 7 [sp] 20 [frame]
;; Pred edge  3 [40.0%]  (fallthru)
(note:HI 72 29 119 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 119 72 31 4 (parallel [
            (set (reg:SI 3 bx [68])
                (const_int 0 [0x0]))
            (clobber (reg:CC 17 flags))
        ]) 38 {*movsi_xor} (nil)
    (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(note:HI 31 119 89 4 NOTE_INSN_DELETED)

(insn 89 31 33 4 (set (reg:CCZ 17 flags)
        (compare:CCZ (and:SI (reg:SI 0 ax [orig:59 block ] [59])
                (const_int 1 [0x1]))
            (const_int 0 [0x0]))) 286 {testsi_1} (nil)
    (expr_list:REG_DEAD (reg:SI 0 ax [orig:59 block ] [59])
        (nil)))

(jump_insn:HI 33 89 73 4 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0x0]))
            (label_ref 36)
            (pc))) 530 {*jcc_1} (insn_list:REG_DEP_TRUE 32 (nil))
    (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (expr_list:REG_BR_PROB (const_int 9000 [0x2328])
            (nil))))
;; End of basic block 4, registers live: 1 [dx] 3 [bx] 4 [si] 5 [di] 6 [bp] 7 [sp] 20 [frame]
;; Succ edge  6 [90.0%] 
;; Succ edge  5 [10.0%]  (fallthru)

;; Start of basic block 5, registers live: 1 [dx] 3 [bx] 4 [si] 5 [di] 6 [bp] 7 [sp] 20 [frame]
;; Pred edge  4 [10.0%]  (fallthru)
(note:HI 73 33 95 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

(insn 95 73 34 5 (set (reg:QI 0 ax)
        (reg:QI 3 bx)) 55 {*movqi_1} (nil)
    (nil))

(insn:HI 34 95 35 5 (parallel [
            (set (mem:QI (reg/f:SI 5 di [orig:67 block ] [67]) [0 S1 A8])
                (reg:QI 0 ax))
            (set (reg/f:SI 5 di [orig:67 block ] [67])
                (plus:SI (reg/f:SI 5 di [orig:67 block ] [67])
                    (const_int 1 [0x1])))
        ]) 720 {*strsetqi_1} (nil)
    (expr_list:REG_DEAD (reg:QI 0 ax)
        (nil)))
--cut here--

However, _.149.bbro renames BB4 and BB5 to BB12 and BB17 respectively, where
BB12 can be reached only _conditionally_ from BB3. A path that skips BB12 can
therefore reach BB17 without %ebx ever having been cleared.

This produces wrong code for pov_calloc():

--cut here--
        movl    %eax, %esi      #, block
        testl   %eax, %eax      # block
        je      .L4     #,                          <<< check for NULL
        movl    %ebx, %edx      # actsize, actsize
        movl    %eax, %edi      # block, block
        cmpl    $3, %ebx        #, actsize          <<< memset check for "< 4"
        ja      .L13    #,                          <<< taken only for actsize >= 4
        testb   $2, %dl #, actsize
        jne     .L14    #,                      <<< here we go with wrong %ebx
.L9:
        andb    $1, %dl #, actsize
        jne     .L15    #,                      <<< here too.
.L4:
        movl    %esi, %eax      # block, <result>
        addl    $16, %esp       #,
        popl    %ebx    #
        popl    %esi    #
        popl    %edi    #
        popl    %ebp    #
        ret
.L15:
        movl    %ebx, %eax      #,                 <<< wrong %ebx moved to %eax
        stosb                                      <<< FUBAR 2.
        movl    %esi, %eax      # block, <result>
        addl    $16, %esp       #,
        popl    %ebx    #
        popl    %esi    #
        popl    %edi    #
        popl    %ebp    #
        ret
.L14:
        movl    %ebx, %eax      #,                 <<< wrong %ebx moved to %eax
        stosw                                      <<< FUBAR 1.
        andb    $1, %dl #, actsize
        je      .L4     #,
        jmp     .L15    #
.L13:
        xorl    %ebx, %ebx      # tmp68            <<< %ebx is cleared here!!
        testb   $1, %al #, block
        jne     .L16    #,
.L7:
        testl   $2, %edi        #, block
        .p2align 4,,5
        jne     .L17    #,
.L8:
        movl    %edx, %ecx      # actsize, tmp71
        shrl    $2, %ecx        #, tmp71
        movl    %ebx, %eax      # tmp68,
        rep
        stosl
        testb   $2, %dl #, actsize
        je      .L9     #,
        jmp     .L14    #
.L16:
        movl    %ebx, %eax      #,      <<< this part is OK, but only for size >= 4
        stosb
        subl    $1, %edx        #, actsize
        jmp     .L7     #
.L17:
        movl    %ebx, %eax      #,      <<< this part is OK, but only for size >= 4
        stosw
        subl    $2, %edx        #, actsize
        jmp     .L8     #

--cut here--
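
The control flow of the listing above can be modelled in plain C to make the
stale register visible. This is only a sketch of my reading of the assembly,
not compiler output: model_memset0(), store_le() and the clear_early switch
are hypothetical names, and the variable `fill` stands in for %ebx, which
still holds actsize on entry. With clear_early set to 0 the register is
cleared only on the ">= 4" path, as in the listing; with clear_early set to 1
it is cleared before any path can read it.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low n bytes of val at dst, least significant byte first
   (models stosb/stosw/stosl on little-endian x86). Returns the new dst. */
static unsigned char *store_le(unsigned char *dst, uint32_t val, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        *dst++ = (unsigned char)(val >> (8 * i));
    return dst;
}

/* Hypothetical model of the inlined memset(block, 0, size). */
static void model_memset0(unsigned char *dst, uint32_t size, int clear_early)
{
    uint32_t fill = size;            /* %ebx: still holds actsize */
    uint32_t left = size;            /* %edx */

    if (clear_early)
        fill = 0;                    /* what the tail code expects */

    if (size > 3) {                  /* cmpl $3, %ebx; ja .L13 */
        fill = 0;                    /* xorl %ebx, %ebx */
        if ((uintptr_t)dst & 1) {    /* .L16: align to 2 bytes */
            dst = store_le(dst, fill, 1);
            left -= 1;
        }
        if ((uintptr_t)dst & 2) {    /* .L17: align to 4 bytes */
            dst = store_le(dst, fill, 2);
            left -= 2;
        }
        for (uint32_t n = left >> 2; n > 0; n--)   /* rep stosl */
            dst = store_le(dst, fill, 4);
    }
    if (left & 2)                    /* .L14: stosw */
        dst = store_le(dst, fill, 2);
    if (left & 1)                    /* .L15: stosb */
        store_le(dst, fill, 1);
}
```

In this model, sizes 1 to 3 with clear_early == 0 store bytes of the stale
size value instead of zero, while sizes of 4 and above are zeroed correctly.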

This can be confirmed by running the testcase:

> gcc -O2 -m32 -march=pentium4 -minline-all-stringops -DSIZE=1 mem.c
> ./a.out
Aborted
> gcc -O2 -m32 -march=pentium4 -minline-all-stringops -DSIZE=2 mem.c
> ./a.out
Aborted
> gcc -O2 -m32 -march=pentium4 -minline-all-stringops -DSIZE=4 mem.c
> ./a.out
> echo $?
0
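
For readers without the attachment, a reproducer of roughly this shape should
exercise the same path. It is a hypothetical reconstruction, not the attached
mem.c: calloc_like() is a simplified stand-in for pov_calloc(), and
block_is_zeroed() is an invented helper. The noinline attribute is an
assumption intended to keep the variable-size memset() in its own function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pov_calloc(): allocate a block, then zero it
   with a memset() whose size is not known at compile time. */
__attribute__((noinline))
static void *calloc_like(size_t nitems, size_t size)
{
    size_t actsize = nitems * size;
    void *block = malloc(actsize);

    if (block == NULL)
        return NULL;
    memset(block, 0, actsize);
    return block;
}

/* Returns 1 if a block of n bytes comes back fully zeroed, 0 otherwise. */
static int block_is_zeroed(size_t n)
{
    unsigned char *p = calloc_like(1, n);
    int ok = 1;

    if (p == NULL)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            ok = 0;
    free(p);
    return ok;
}
```

A driver would call abort() from main() when block_is_zeroed(SIZE) returns 0,
and would be built with the flags shown above. On a correct compiler the
check passes for every size.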


-- 
           Summary: Wrong code with optimized memset() (possible bug in RTL
                    bbro optimizer)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: major
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ubizjak at gmail dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30213

