This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

another loop unrolling bug



  In message <13928.42995.90105.741948@ongaonga.elec.canterbury.ac.nz>you write
:
  > Jeffrey A Law writes:
  >  > 
  >  > This is causing gcc to mis-compile g++, at least on the x86.
  >  > 
  >  > Compile this testcase either on x86-linux native or with an x86 cross
  >  > compiler using -O -funroll-loops
  > 
  > This patch appears to fix the problem.
  > 
  > Sat Dec  5 16:23:31 1998  Michael Hayes  <m.hayes@elec.canterbury.ac.nz>
  > 
  > 	* loop.c (check_dbra_loop): New argument loop_info.
  > 	Update initial and final loop values in loop_info structure
  > 	if loop reversed.
Well, it fixed one problem and introduced another.


Compile with -O -funroll-loops on an x86.

typedef unsigned char U_CHAR;
union hashval {
  char *cpval;
};
static char rest_extension[] = "...";
static char va_args_name[] = "__VA_ARGS__";
struct hashnode {
  struct hashnode *next;	 
  struct hashnode *prev;
  struct hashnode **bucket_hdr;	 
  int length;			 
  U_CHAR *name;			 
  union hashval value;		 
};
typedef struct hashnode HASHNODE;
static HASHNODE *hashtab[1403 ];
U_CHAR is_idchar[256];
static HASHNODE *
install (name, len, type, value, hash)
     U_CHAR *name;
     int len;
     char *value;
     int hash;
{
  int i;
  register U_CHAR *p, *q;
  register HASHNODE *hp;
  for (i = 0; i < len; i++)
    *p++ = *q++;
  hp->name[len] = 0;
  return hp;
}


We have the following loop:

(insn 19 16 63 (set (reg/v:SI 27)
        (const_int 0)) 54 {movsi+2} (nil)
    (expr_list:REG_EQUAL (const_int 0)
        (nil)))

(insn 63 19 64 (set (cc0)
        (compare (reg/v:SI 27)
            (reg/v:SI 23))) 12 {cmpsi_1} (nil)
    (nil))

(jump_insn 64 63 20 (set (pc)
        (if_then_else (ge (cc0)
                (const_int 0))
            (label_ref 46)
            (pc))) 286 {bleu+1} (nil)
    (nil))

(note 20 64 26 "" NOTE_INSN_LOOP_BEG)

[ ... ]

(note 36 34 39 "" NOTE_INSN_LOOP_CONT)

(insn 39 36 67 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 27)
            (const_int 1))) 148 {addsi3+1} (nil)
    (nil))

(note 67 39 22 "" NOTE_INSN_LOOP_VTOP)

(insn 22 67 23 (set (cc0)
        (compare (reg/v:SI 27)
            (reg/v:SI 23))) 12 {cmpsi_1} (nil)
    (nil))

(jump_insn 23 22 45 (set (pc)
        (if_then_else (lt (cc0)
                (const_int 0))
            (label_ref 26)
            (pc))) 286 {bleu+1} (nil)
    (nil))

(note 45 23 46 "" NOTE_INSN_LOOP_END)


A simple loop where reg27 increments by one each iteration as long as the
result is less than the value in reg23.


The loop gets reversed and looks like this:


(insn 19 16 63 (set (reg/v:SI 27)
        (const_int 0)) 54 {movsi+2} (nil)
    (expr_list:REG_EQUAL (const_int 0)
        (nil)))

(insn 63 19 64 (set (cc0)
        (compare (reg/v:SI 27)
            (reg/v:SI 23))) 12 {cmpsi_1} (nil)
    (nil))

(jump_insn 64 63 69 (set (pc)
        (if_then_else (ge (cc0)
                (const_int 0))
            (label_ref 46)
            (pc))) 286 {bleu+1} (nil)
    (nil))

(insn 69 64 20 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 23)
            (const_int 0))) -1 (nil)
    (nil))

(note 20 69 26 "" NOTE_INSN_LOOP_BEG)

[ ... ]

(note 36 34 72 "" NOTE_INSN_LOOP_CONT)

(insn 72 36 67 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 27)
            (const_int -1))) -1 (nil)
    (nil))

(note 67 72 73 "" NOTE_INSN_LOOP_VTOP)

(insn 73 67 74 (set (cc0)
        (reg/v:SI 27)) -1 (nil)
    (nil))

(jump_insn 74 73 45 (set (pc)
        (if_then_else (ne (cc0)
                (const_int 0))
            (label_ref 26)
            (pc))) -1 (nil)
    (nil))

(note 45 74 46 "" NOTE_INSN_LOOP_END)

Now it's a count-down loop as long as reg27 is not zero.  reg27 is initialized
to the value of reg23 before the loop starts.

After unrolling we get:



(insn 19 16 63 (set (reg/v:SI 27)
        (const_int 0)) 54 {movsi+2} (nil)
    (expr_list:REG_EQUAL (const_int 0)
        (nil)))

(insn 63 19 64 (set (cc0)
        (compare (reg/v:SI 27)
            (reg/v:SI 23))) 12 {cmpsi_1} (nil)
    (nil))

(jump_insn 64 63 69 (set (pc)
        (if_then_else (ge (cc0)
                (const_int 0))
            (label_ref 46)
            (pc))) 286 {bleu+1} (nil)
    (nil))

(insn 69 64 76 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 23)
            (const_int 0))) -1 (nil)
    (nil))

(insn 76 69 78 (set (reg:SI 37)
        (const_int 0)) -1 (nil)
    (nil))

(insn 78 76 80 (set (reg:SI 38)
        (plus:SI (reg/v:SI 23)
            (const_int 0))) -1 (nil)
    (nil))

(insn 80 78 81 (set (reg:SI 36)
        (minus:SI (reg:SI 37)
            (reg:SI 38))) -1 (nil)
    (nil))

(insn 81 80 83 (set (reg:SI 39)
        (and:SI (reg:SI 36)
            (const_int 3))) -1 (nil)
    (nil))

(insn 83 81 84 (set (reg:SI 40)
        (plus:SI (reg/v:SI 23)
            (const_int 0))) -1 (nil)
    (nil))

(insn 84 83 85 (set (cc0)
        (reg:SI 40)) -1 (nil)
    (nil))

(jump_insn 85 84 86 (set (pc)
        (if_then_else (ge (cc0)
                (const_int 0))
            (label_ref 108)
            (pc))) -1 (nil)
    (nil))

(insn 86 85 87 (set (cc0)
        (reg:SI 39)) -1 (nil)
    (nil))

(jump_insn 87 86 88 (set (pc)
        (if_then_else (eq (cc0)
                (const_int 0))
            (label_ref 117)
            (pc))) -1 (nil)
    (nil))

[ ... ]

[ label 108 is precondition-1 code, label 117 starts a 4x unrolled loop. ]


Note carefully insn 85 which jumps into the preconditioning code occurs before
the code which checks if the preconditioning code is needed at all (insn 87).

The net result is we jump into the precondition 1 code, which performs one
iteration of the loop and decrements the counter.  If the counter is nonzero
we then fall until a 4x unrolled loop.

Now think about what happens if the loop did not need any preconditioning...
(ie reg23 is a multiple of 4).  The loop counter will never be equal to zero
at the bottom of the 4x unrolled loop and we iterate forever.

BAsed on looking at rtl dumps for the old compiler, I think you've somehow
reversed the condition.  With the old compiler that branch was not taken.

It might be use for you to try a bootstrap on an x86 with
BOOT_CFLAGS="-O -funroll-all-loops" if you've got an x86 handy.


jeff



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]