The texinfo -march=pentiumpro bug

Jeffrey A Law law@cygnus.com
Wed Jul 14 22:29:00 GMT 1999


Compiling the attached testcase with -O2 -fomit-frame-pointer -march=pentiumpro
will cause the compiler to generate incorrect code.

The relevant insns in the .lreg dump look like:

(insn 29 27 32 (set (reg/v:SI 27)
        (reg:SI 0 %eax)) 48 {movsi+2} (insn_list 27 (nil))
    (expr_list:REG_DEAD (reg:SI 0 %eax)
        (nil)))

(insn 32 29 147 (set (reg:SI 7 %esp)
        (plus:SI (reg:SI 7 %esp)
            (const_int 16 [0x10]))) 205 {addsi3+1} (nil)
    (nil))

(note 147 32 149 "" NOTE_INSN_DELETED)

(insn 149 147 175 (set (reg/v:SI 27)
        (if_then_else:SI (ge:SI (reg/v:SI 27)
                (const_int 0 [0x0]))
            (reg/v:SI 27)
            (mem/f:SI (symbol_ref:SI ("size_of_input_text")) 2))) 414 {movsicc+2} (insn_list 29 (nil))
    (nil))

insn 29's reloads look like:
Reload 0: reload_out (SI) = (mem:SI (plus:SI (reg:SI 7 %esp)
                                                        (const_int 44 [0x2c])) 0)
        GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (reg/v:SI 27)

insn 149's reloads look like:

Reload 0: reload_in (SI) = (reg/v:SI 27)
        reload_out (SI) = (reg/v:SI 27)
        GENERAL_REGS, RELOAD_OTHER (opnum = 0)
        reload_in_reg: (reg/v:SI 27)
        reload_out_reg: (reg/v:SI 27)
        reload_reg_rtx: (reg:SI 0 %eax)
Reload 1: reload_in (SI) = (mem/f:SI (symbol_ref:SI ("size_of_input_text")) 2)
        GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 5), optional
        reload_in_reg: (mem/f:SI (symbol_ref:SI ("size_of_input_text")) 2)

Note that reload 0 for insn 149 only refers to the references to reg27 as an
output and as a source to the cmov itself -- ie, it does not apply to the
reference to reg27 as part of the conditional!


At the start of emit_reload_insns for insn 149 the relevant insns look like:

(insn 29 27 32 (set (mem:SI (plus:SI (reg:SI 7 %esp)
                (const_int 44 [0x2c])) 0)
        (reg:SI 0 %eax)) 48 {movsi+2} (insn_list 27 (nil))
    (expr_list:REG_DEAD (reg:SI 0 %eax)
        (nil)))

(insn 32 29 147 (set (reg:SI 7 %esp)
        (plus:SI (reg:SI 7 %esp)
            (const_int 16 [0x10]))) 205 {addsi3+1} (nil)
    (nil))

(note 147 32 149 "" NOTE_INSN_DELETED)

(insn 149 147 175 (set (reg/v:SI 27)
        (if_then_else:SI (ge:SI (reg/v:SI 27)
                (const_int 0 [0x0]))
            (reg/v:SI 27)
            (mem/f:SI (symbol_ref:SI ("size_of_input_text")) 2))) 414 {movsicc+2} (insn_list 29 (nil))
    (nil))


At this point we have already lost, it just takes a little time for the problem
to expose itself.  The problem is we will end up deleting insn 29 because
we believe that the value in the stack slot will not be used before it is
overwritten by the output reload for insn 149.

The inheritance coded requires that pseudos which do not get hard registers and
are not referenced by reloads have a USE insn before insn in which they are
used.  Otherwise the code to delete useless reloads will not be aware of the
use of the pseudo.  This is precisely what happens in this case.

The reloads for reg27 in insn 149 will be satisfied by the value lying around
in %eax.  insn 29 and insn 149 both have an output reload for the stack slot
and there are no uses of reg27 between those insns, so reload deletes the output
reload in insn 29.

This results in rtl that looks something like this:

(note 29 27 32 "" NOTE_INSN_DELETED)

(insn 32 29 147 (set (reg:SI 7 %esp)
        (plus:SI (reg:SI 7 %esp)
            (const_int 16 [0x10]))) 205 {addsi3+1} (nil)
    (nil))

(note 147 32 190 "" NOTE_INSN_DELETED)

(insn 190 147 191 (set (cc0)
        (mem:SI (plus:SI (reg:SI 7 %esp)
                (const_int 28 [0x1c])) 0)) -1 (nil)
    (nil))

(insn 191 190 179 (set (reg:SI 0 %eax)
        (if_then_else:SI (ge:SI (cc0)
                (const_int 0 [0x0]))
            (reg:SI 0 %eax)
            (mem/f:SI (symbol_ref:SI ("size_of_input_text")) 2))) -1 (nil)
    (nil))

(insn 179 191 175 (set (mem:SI (plus:SI (reg:SI 7 %esp)
                (const_int 28 [0x1c])) 0)
        (reg:SI 0 %eax)) 48 {movsi+2} (nil)
    (nil))


Note how insn 29 was deleted, but that insn 190 reads from the stack slot
that insn 29 would have set if it had not been deleted (insn 190/191 were
generated by splitting insn 149 after reload had finished).

Working backwards we find the real culprit is find_reloads not creating the
USE insn that is required by the later phase of reload.

The code looks something like this:

    else if (goal_alternative_matched[i] < 0
             && goal_alternative_matches[i] < 0
             && optimize)
      {
        /* For each non-matching operand that's a MEM or a pseudo-register
           that didn't get a hard register, make an optional reload.
           This may get done even if the insn needs no reloads otherwise.  */

       [ ... Big if condition which is not satisfied for the reference to reg27
         in insn 149). ]

        /* If a memory reference remains, yet we can't make an optional
           reload, check if this is actually a pseudo register reference;
           we then need to emit a USE and/or a CLOBBER so that reload
           inheritance will do the right thing.  */
        else if (replace && GET_CODE (operand) == MEM)

The second condition is not satisfied either because operand is a pseudo, not
a MEM.  There isn't any code later to deal with this case.  So we never
emit a USE.

Fixing this is simple and safe.  We tweak the above code to emit a use for
a pseudo which did not get a hard register.


The testcase:

 char *    input_text  =   (char *)((void *)0)   ;
 int    size_of_input_text  =   0  ;
 int    input_text_offset  =   0  ;
 int    line_number  =   0  ;
int
get_until (match, string)
     char *match, **string;
{
  int len, current_point, x, new_point, tem;
  current_point = x = input_text_offset;
  new_point = search_forward (match, input_text_offset);
  if (new_point < 0)
    new_point = size_of_input_text;
  len = new_point - current_point;
  tem = new_point + (strlen (match) - 1);
  while (x != tem)
    if (input_text[x++] == '\n')
      line_number++;
  *string = (char *)xmalloc (len + 1);
  memcpy (*string, &input_text[current_point], len);
  (*string)[len] = 0;
  input_text_offset = tem;
  return (new_point);
}
	* reload.c (find_reloads): Emit a USE for a pseudo register without
	a hard register if we could not create an optional reload for the
	pseudo.
	
Index: reload.c
===================================================================
RCS file: /egcs/carton/cvsfiles/egcs/gcc/reload.c,v
retrieving revision 1.71.4.1
diff -c -3 -p -r1.71.4.1 reload.c
*** reload.c	1999/05/27 01:16:21	1.71.4.1
--- reload.c	1999/07/15 05:26:15
*************** find_reloads (insn, replace, ind_levels,
*** 3803,3813 ****
  			   (insn_code_number < 0 ? 0
  			    : insn_operand_strict_low[insn_code_number][i]),
  			   1, i, operand_type[i]);
! 	/* If a memory reference remains, yet we can't make an optional
  	   reload, check if this is actually a pseudo register reference;
  	   we then need to emit a USE and/or a CLOBBER so that reload
  	   inheritance will do the right thing.  */
! 	else if (replace && GET_CODE (operand) == MEM)
  	  {
  	    operand = *recog_operand_loc[i];
  
--- 3803,3818 ----
  			   (insn_code_number < 0 ? 0
  			    : insn_operand_strict_low[insn_code_number][i]),
  			   1, i, operand_type[i]);
! 	/* If a memory reference remains (either as a MEM or a pseudo that
! 	   did not get a hard register), yet we can't make an optional
  	   reload, check if this is actually a pseudo register reference;
  	   we then need to emit a USE and/or a CLOBBER so that reload
  	   inheritance will do the right thing.  */
! 	else if (replace
! 		 && (GET_CODE (operand) == MEM
! 		     || (GET_CODE (operand) == REG
! 			 && REGNO (operand) >= FIRST_PSEUDO_REGISTER
! 			 && reg_renumber [REGNO (operand)] < 0)))
  	  {
  	    operand = *recog_operand_loc[i];
  














More information about the Gcc-patches mailing list