Codegen bug with strength reduction - more details

Zack Weinberg zack@rabi.columbia.edu
Thu Feb 11 10:48:00 GMT 1999


Yesterday I reported a codegen bug in the current snapshot.  Here are
more details.

The bug is in strength reduction.  Suppose you have a loop like this:

count /= 4;
--src;
do
{
   c = *++src;
   *dest++ = c;
   if (!c) break;
   c = *++src;
   *dest++ = c;
   if (!c) break;
   c = *++src;
   *dest++ = c;
   if (!c) break;
   c = *++src;
   *dest++ = c;
   if (!c) break;
}
while (--count);

i.e. a string copy manually unrolled four times.

Before the loop pass runs, we have RTL like this.  Notice how the
first trip jumps over the region between the LOOP_BEG and LOOP_CONT
notes.

(note 22 20 107 "" NOTE_INSN_LOOP_BEG)

(jump_insn 107 22 108 (set (pc)
        (label_ref 23)) -1 (nil)
    (nil))

(barrier 108 107 92)

(code_label 92 108 95 9 "")

(insn 95 92 96 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 27)
            (const_int -1))) 148 {addsi3+1} (nil)
    (nil))

(insn 96 95 97 (set (cc0)
        (reg/v:SI 27)) 0 {tstsi_1} (nil)
    (nil))

(jump_insn 97 96 104 (set (pc)
        (if_then_else (eq (cc0)
                (const_int 0))
            (label_ref 112)
            (pc))) 288 {bleu+1} (nil)
    (nil))

(note 104 97 23 "" NOTE_INSN_LOOP_CONT)

(code_label 23 104 25 3 "")

(note 25 23 28 "" NOTE_INSN_DELETED)

(insn 28 25 29 (set (reg/v:QI 25)
        (mem:QI (reg/v:SI 23) 0)) 64 {movqi+1} (nil)
    (nil))

(insn 29 28 32 (set (reg/v:SI 23)
        (plus:SI (reg/v:SI 23)
            (const_int 1))) 148 {addsi3+1} (nil)
    (nil))

(insn 32 29 34 (set (reg/v:SI 22)
        (plus:SI (reg/v:SI 22)
            (const_int 1))) 148 {addsi3+1} (nil)
    (nil))

(insn 34 32 36 (set (mem:QI (reg/v:SI 22) 0)
        (reg/v:QI 25)) 64 {movqi+1} (nil)
    (nil))

(insn 36 34 37 (set (cc0)
        (reg/v:QI 25)) 4 {tstqi_1} (nil)
    (nil))

(jump_insn 37 36 45 (set (pc)
        (if_then_else (eq (cc0)
                (const_int 0))
            (label_ref 112)
            (pc))) 288 {bleu+1} (nil)
    (nil))

; repeat three times, with the last jump slightly different:

(jump_insn 88 87 111 (set (pc)
        (if_then_else (ne (cc0)
                (const_int 0))
            (label_ref 92)
            (pc))) 288 {bleu+1} (nil)
    (nil))

(note 111 88 112 "" NOTE_INSN_LOOP_END)

(code_label 112 111 115 4 "")

Strength reduction tries to create four source and four destination
pointers, each of which advances with stride 4.  At least, I think
that is what it is trying to do.  The result is severely pessimized,
but more importantly, it is incorrect.  The loop header becomes

(note 22 158 107 "" NOTE_INSN_LOOP_BEG)

(jump_insn 107 22 108 (set (pc)
        (label_ref 23)) -1 (nil)
    (nil))

(barrier 108 107 92)

(code_label 92 108 95 9 "")

(insn 95 92 128 (set (reg/v:SI 27)
        (plus:SI (reg/v:SI 27)
            (const_int -1))) -1 (nil)
    (nil))

(insn 128 95 146 (set (reg:SI 37)
        (plus:SI (reg/v:SI 22)
            (const_int 1))) -1 (nil)
    (nil))

(insn 146 128 134 (set (reg:SI 40)
        (plus:SI (reg/v:SI 23)
            (const_int 1))) -1 (nil)
    (nil))

(insn 134 146 96 (set (reg:SI 38)
        (plus:SI (reg/v:SI 23)
            (const_int 3))) -1 (nil)
    (nil))

(insn 96 134 97 (set (cc0)
        (reg/v:SI 27)) -1 (nil)
    (nil))

(jump_insn 97 96 104 (set (pc)
        (if_then_else (eq (cc0)
                (const_int 0))
            (label_ref/s 112)
            (pc))) -1 (nil)
    (nil))

(note 104 97 23 "" NOTE_INSN_LOOP_CONT)

(code_label 23 104 25 3 "")

Regs 37 and 40 are initialized only in this sequence, which the jump
at insn 107 skips on the first trip through the loop.  The insns
initializing these registers belong above the loop begin note.
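
The shape of the problem can be modelled in plain C.  This is only a
sketch, not what the compiler emits: the function name, the `hoist`
flag and the use of NULL to stand for "register never written" are all
made up for illustration, and the early exits on a NUL byte are left
out.  The caller is assumed to pass a buffer of at least 4 * count
bytes.

```c
#include <stddef.h>

/* Model of the rotated loop.  Returns how many times the body ran
   while the derived pointer was still unset.  NULL stands in for
   "register never written"; the real code reads through garbage.  */
static int
unset_uses (const char *src, int count, int hoist)
{
  const char *src3 = NULL;      /* stand-in for a new register such as reg 38 */
  int n = 0;

  if (hoist)
    src3 = src + 3;             /* initialization above the loop begin note */

  goto body;                    /* like jump_insn 107 -> code_label 23 */
top:                            /* like code_label 92 */
  src3 = src + 3;               /* like insn 134: set only on this path */
  if (--count == 0)
    goto done;                  /* like jump_insn 97 -> code_label 112 */
body:                           /* like code_label 23 */
  if (src3 == NULL)
    n++;                        /* first trip used the pointer unset */
  src += 4;                     /* one unrolled trip covers four bytes */
  goto top;                     /* like jump_insn 88 -> code_label 92 */
done:
  return n;
}
```

With hoist == 0 the first trip through the body sees the pointer
unset, which is the situation in the dump above.  With hoist != 0 the
initialization sits where it belongs and no trip does.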

No later pass corrects this, and we eventually wind up with assembly
output like this:

	jmp .L3
	.p2align 4,,7
.L9:
	leal 1(%ecx),%ebx
	leal 3(%ecx),%edi
	decl -8(%ebp)
	jz .L4
.L3:
	movb -3(%edi),%dl

with no previous initialization of %edi in the function.

The bug is very sensitive to the precise form of the loop.  The
snippet above will not trigger it.  The complete function at the end
of this message does.

I'd add this to the testsuite, but I can't figure out how to detect
the missing initialization in a system-independent manner.
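
A runtime check along these lines might catch it on some targets.
This is a sketch under assumptions, not a proposed testsuite entry:
the names `copy4` and `check_copy4` are hypothetical, the function is
written with a prototype instead of the old-style definition, and the
destination starts one byte into the buffer so that the initial s1--
stays inside it.  A pass proves nothing, since the garbage in the
uninitialized register could happen to be harmless.

```c
#include <stddef.h>
#include <string.h>

/* The function from this report, renamed (hypothetical name) so it
   does not clash with the C library's strncpy.  */
static char *
copy4 (char *s1, const char *s2, size_t n)
{
  char c;
  char *s = s1;
  size_t n4 = n >> 2;

  s1--;
  for (;;)
    {
      c = *s2++; *++s1 = c; if (c == '\0') break;
      c = *s2++; *++s1 = c; if (c == '\0') break;
      c = *s2++; *++s1 = c; if (c == '\0') break;
      c = *s2++; *++s1 = c; if (c == '\0') break;
      if (--n4 == 0)
        break;
    }
  return s;
}

/* Returns 0 if the copy looks right, nonzero otherwise.  */
static int
check_copy4 (void)
{
  static const char src[] = "abcdefg";  /* 7 chars + NUL: two trips */
  char buf[16];

  memset (buf, '#', sizeof buf);        /* sentinel around the target */
  if (copy4 (buf + 1, src, 8) != buf + 1)
    return 1;                           /* wrong return value */
  if (memcmp (buf + 1, src, sizeof src) != 0)
    return 2;                           /* wrong bytes copied */
  if (buf[0] != '#' || buf[1 + sizeof src] != '#')
    return 3;                           /* wrote outside the target */
  return 0;
}
```

A driver would call check_copy4 () from main and abort () on a nonzero
result.  A miscompiled copy4 would more likely fault on the stray read
than return a wrong answer, so the check is only as portable as that
fault is.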

zw

typedef unsigned long size_t;

char *
strncpy (s1, s2, n)
     char *s1;
     const char *s2;
     size_t n;
{
    char c;
    char *s = s1;

    size_t n4;

    n4 = n >> 2;
    s1--;

    for (;;)
    {
	c = *s2++;
	*++s1 = c;
	if (c == '\0')
            break;
	c = *s2++;
	*++s1 = c;
	if (c == '\0')
            break;
	c = *s2++;
	*++s1 = c;
	if (c == '\0')
            break;
	c = *s2++;
	*++s1 = c;
	if (c == '\0')
            break;
	if (--n4 == 0)
            break;
    }

    return s;
}

