This is the mail archive of the mailing list for the GCC project.

[RFC PATCH] reg_nonzero_bits valid use


On sparc -m64 with gcc-2.96-RH, when compiling the Linux kernel, combine suddenly
throws away (zero_extend:DI (reg:SI 298)) and replaces it with
(subreg:DI (reg:SI 298)), although the upper 32 bits might be set.
This doesn't happen in gcc 3.1, but from what I read it looks like pure
luck: there, try_combine just happens to be called with different i1/i2/i3
in a different order.

Here is the interesting part of the RTL for
        if (buf->f_bfree < le32_to_cpu(sb->u.ext2_sb.s_es->s_r_blocks_count))
                buf->f_bavail = 0;

(insn 374 347 376 (set (reg:DI 274)
        (mem/s:DI (plus:DI (reg/v:DI 107)
                (const_int 24 [0x18])) 0)) 61 {*movdi_insn_sp64_novis} (nil)
(insn 392 372 393 (set (reg/v:SI 280)
        (mem/s:SI (plus:DI (reg:DI 348)
                (const_int 8 [0x8])) 0)) 51 {*movsi_insn} (insn_list 376 (nil))
(insn/i 400 398 401 (set (reg:SI 284)
        (and:SI (reg/v:SI 280)
            (const_int 255 [0xff]))) 228 {andsi3} (insn_list 392 (nil))

(insn/i 401 400 404 (set (reg:SI 285)
        (ashift:SI (reg:SI 284)
            (const_int 24 [0x18]))) 299 {ashlsi3} (insn_list 400 (nil))
    (expr_list:REG_DEAD (reg:SI 284)
SI 290 := 285 ior 289
SI 294 := 290 ior 293
SI 298 := 294 ior 297
(insn/i 417 416 421 (set (reg:DI 299)
        (zero_extend:DI (reg:SI 298))) 128 {*zero_extendsidi2_insn_sp64} (insn_list 414 (nil))
    (expr_list:REG_DEAD (reg:SI 298)
(insn 426 424 427 (set (reg:CCX 100 %icc)
        (compare:CCX (reg:DI 274)
            (reg:DI 299))) 1 {*cmpdi_sp64} (insn_list 374 (insn_list 417 (nil)))
    (expr_list:REG_DEAD (reg:DI 274)
        (expr_list:REG_DEAD (reg:DI 299)

First, the combiner kills insn 400; from that point on,
nonzero_bits ((reg:SI 298), DImode) returns a value with some upper bits set,
since 0xffffffff is shifted up by 24 bits.
Since this register is only ever set as an SImode reg, reg_nonzero_bits[298]
is 0xffffffff though. The problem comes when try_combine is called on insn
426 with 417 and 374 as links (note the important point that insn 374's
CUID is lower than that of all the other interesting instructions here).
This means subst_low_cuid is set to a smaller value, so that the
  /* If the value was set in a later insn than the ones we are processing,
     we can't use it even if the register was only set once.  */
  if (INSN_CUID (reg_last_set[regno]) >= subst_low_cuid)
    return 0;
test in get_last_value triggers (if this did not trigger,
nonzero_bits ((reg:SI 298), DImode) would properly return 0x00ffffffffffffff
and all would be fine). As nonzero_sign_valid is true and
reg_nonzero_bits[298] is 0xffffffff, nonzero_bits returns that value instead,
and combine thus concludes that it can optimize
(and:DI (subreg:DI (reg:SI 298) 0) (const_int 0xffffffff))
into (subreg:DI (reg:SI 298)).
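
The mask arithmetic behind this can be sketched in C. This is a toy model,
not GCC code: the helper names are made up for illustration, and it assumes
that every term ORed into reg 298 other than the shifted one fits in 32 bits.

```c
#include <stdint.h>

/* Toy model, not GCC code: a "nonzero bits" mask has a 1 in every bit
   position that might be set in the value.  All names are hypothetical.  */

/* Mask for (ashift x n), evaluated in a 64-bit (DImode) context.  */
static uint64_t nz_ashift(uint64_t nz, unsigned n)
{
    return nz << n;
}

/* Mask for (ior a b).  */
static uint64_t nz_ior(uint64_t a, uint64_t b)
{
    return a | b;
}

/* (and x mask) may be dropped only if x cannot have a bit set outside mask.  */
static int and_is_redundant(uint64_t nz_x, uint64_t mask)
{
    return (nz_x & ~mask) == 0;
}

/* What tracking through the insn chain yields for reg 298 in DImode once
   the (and ... 0xff) of insn 400 is gone.  Assumption: the other ior
   operands are each bounded by a 32-bit mask.  */
static uint64_t nz_reg298_tracked(void)
{
    uint64_t si = 0xffffffffu;             /* any SImode value */
    return nz_ior(nz_ashift(si, 24), si);  /* 0x00ffffffffffffff */
}
```

With the tracked mask, and_is_redundant (nz_reg298_tracked (), 0xffffffff) is
false and the AND has to stay. With the cached SImode mask 0xffffffff it is
true, so the AND looks redundant and gets dropped.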
This leads to assembly like:
        ld      [%o4+8], %o3
        sll     %o3, 24, %o0
        and     %o3, %l0, %o1
        sll     %o1, 8, %o1
        and     %o3, %l1, %o2
        srl     %o2, 8, %o2
        or      %o0, %o1, %o0
        srl     %o3, 24, %o3
        or      %o0, %o2, %o0
        or      %o0, %o3, %o0
        cmp     %o5, %o0
        bge,pt  %xcc, .LL1261
which gives the wrong result in most cases, since an srl %o0, 0, %o0 (to clear
the upper 32 bits) is missing before the cmp %o5, %o0.

Now I wonder what can be done about it.
Below is a patch (it fixes this testcase) which only uses reg_nonzero_bits if
x's mode is not narrower than the MODE argument of nonzero_bits. Do you think
that will pessimize the code too much? Alternatively, should there be a
parallel enum machine_mode array which would track the modes for which
reg_nonzero_bits and reg_sign_bit_copies, respectively, were set?

2001-12-05  Jakub Jelinek  <>

	* combine.c (nonzero_bits): Only use reg_nonzero_bits, if
	mode is not wider than GET_MODE (x).
	(num_sign_bit_copies): Likewise.

--- gcc/combine.c.jj	Thu Nov 29 01:42:24 2001
+++ gcc/combine.c	Wed Dec  5 19:33:53 2001
@@ -8095,7 +8095,8 @@ nonzero_bits (x, mode)
 	  return nonzero_bits (tem, mode);
-      else if (nonzero_sign_valid && reg_nonzero_bits[REGNO (x)])
+      else if (nonzero_sign_valid && reg_nonzero_bits[REGNO (x)]
+	       && GET_MODE_BITSIZE (GET_MODE (x)) == mode_width)
 	return reg_nonzero_bits[REGNO (x)] & nonzero;
 	return nonzero;
@@ -8471,7 +8472,8 @@ num_sign_bit_copies (x, mode)
       if (tem != 0)
 	return num_sign_bit_copies (tem, mode);
-      if (nonzero_sign_valid && reg_sign_bit_copies[REGNO (x)] != 0)
+      if (nonzero_sign_valid && reg_sign_bit_copies[REGNO (x)] != 0
+	  && GET_MODE_BITSIZE (GET_MODE (x)) == bitwidth)
 	return reg_sign_bit_copies[REGNO (x)];
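
For illustration, the effect of the added condition can be modelled outside
of GCC as follows. This is a simplified sketch: struct reg_info, mode_mask
and both lookup functions are hypothetical stand-ins, the nonzero_sign_valid
flag is left out, and modes are reduced to their bit widths.

```c
#include <stdint.h>

/* Hypothetical stand-in for one register's entry in combine's tables.  */
struct reg_info {
    unsigned mode_bits;  /* width of the mode the register is set in */
    uint64_t nonzero;    /* cached nonzero-bits mask, 0 if unknown */
};

/* All-ones mask for a mode of the given width.  */
static uint64_t mode_mask(unsigned bits)
{
    return bits >= 64 ? ~(uint64_t)0 : (((uint64_t)1 << bits) - 1);
}

/* Lookup without the patch: the cached mask is trusted in any mode.  */
static uint64_t nz_lookup_old(const struct reg_info *r, unsigned mode_width)
{
    uint64_t nonzero = mode_mask(mode_width);
    if (r->nonzero)
        return r->nonzero & nonzero;
    return nonzero;
}

/* Lookup with the patch: the cached mask is used only when the requested
   mode has the same width as the register's own mode.  */
static uint64_t nz_lookup_patched(const struct reg_info *r, unsigned mode_width)
{
    uint64_t nonzero = mode_mask(mode_width);
    if (r->nonzero && r->mode_bits == mode_width)
        return r->nonzero & nonzero;
    return nonzero;
}

/* reg 298 from the example: set in SImode, cached mask 0xffffffff.  */
static const struct reg_info reg298 = { 32, 0xffffffffu };
```

Asked about reg298 in a 64-bit mode, the old lookup answers 0xffffffff, while
the patched one falls back to the conservative all-ones mask; in a 32-bit
mode both still return the cached value.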

