[PATCH] MIPS: Fix GCC `noreorder' for undefined R5900 short loops

Maciej W. Rozycki macro@linux-mips.org
Sun Jul 21 22:22:00 GMT 2019


Hi Fredrik,

> The present problem is related to GCC and the R5900 short loop bug[1]. It
> turns out GCC emits assembly code like
> 
> loop:	addiu	$5,$5,1
> 	addiu	$4,$4,1
> 	lb	$2,-1($5)
> 	.set	noreorder
> 	.set	nomacro
> 	bne	$2,$3,loop
> 	sb	$2,-1($4)
> 	.set	macro
> 	.set	reorder
> 
> that is undefined for the R5900 (this short loop has five instructions),
> for simple C code such as
> 
> 	while ((*s++ = *p++) != '\n')
> 		;
> 
> in the kernel. The noreorder directive prohibits GAS from corrections, and
> GAS really ought to give an error for it, I think. In the meantime, I have a
> tool that does machine code analysis of ELF objects to catch undefined R5900
> short loops, including those made with assembler macros in the kernel.

 I think that should be a GAS warning really (similarly to macros that 
expand to multiple instructions in a delay slot) as people ought to be 
allowed to do what they wish, and then `-Werror' can be used for code 
quality enforcement (and possibly disabled on a case-by-case basis).

> [ In theory, GAS could actually insert NOPs prior to the noreorder directive,
> to make the loop longer that six instructions, but GAS does not have that
> kind of capability. Another option that GCC prevents is to place a NOP in
> the delay slot. ]

 Well, GAS does have that capability, although of course it is not enabled 
for `noreorder' code.  For generated code I think however that usually it 
will be cheaper performance-wise if a non-trivial delay-slot instruction 
is never produced in the affected cases (i.e. a dummy NOP is always used).

> A reasonable fix for GCC is perhaps to update gcc/config/mips/mips.md to not
> make explicit use of the branch delay slot, as suggested by the patch below?
> Then GCC will emit
> 
> loop:	addiu	$5,$5,1
> 	addiu	$4,$4,1
> 	lb	$2,-1($5)
> 	sb	$2,-1($4)
> 	bne	$2,$3,loop
> 
> that GAS will adjust in the ELF object to
> 
>    4:	24a50001 	addiu	a1,a1,1
>    8:	24840001 	addiu	a0,a0,1
>    c:	80a2ffff 	lb	v0,-1(a1)
>   10:	a082ffff 	sb	v0,-1(a0)
>   14:	1443fffb 	bne	v0,v1,4
>   18:	00000000 	nop
> 
> where a NOP is placed in the delay slot to avoid the bug. For longer loops,
> this relies on GAS making appropriate use of the delay slot. I'm not sure if
> GAS is good at that, though?

 I'm sort-of surprised that GCC has produced `reorder' code here, making 
it rely on GAS for delay slot scheduling.  Have you used an unusual set of 
options that prevents GCC from making `noreorder' code, which as I recall 
is the usual default (not for the MIPS16 mode IIRC, as well as some 
obscure corner cases)?

>     On the R5900 short loops need to be fixed by inserting a NOP in the
>     branch delay slot.
>     
>     The short loop bug under certain conditions causes loops to execute
>     only once or twice.  We must ensure that the assembler never
>     generates loops that satisfy all of the following conditions:
>     
>     - a loop consists of less than or equal to six instructions
>       (including the branch delay slot);
>     - a loop contains only one conditional branch instruction at the end
>       of the loop;
>     - a loop does not contain any other branch or jump instructions;
>     - a branch delay slot of the loop is not NOP (EE 2.9 or later).
>     
>     We need to do this because of a hardware bug in the R5900 chip.
> 
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index e17b1d522f0..acd31a8960c 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -1091,6 +1091,7 @@
>  
>  (define_delay (and (eq_attr "type" "branch")
>  		   (not (match_test "TARGET_MIPS16"))
> +		   (not (match_test "TARGET_FIX_R5900"))
>  		   (eq_attr "branch_likely" "yes"))
>    [(eq_attr "can_delay" "yes")
>     (nil)
> @@ -1100,6 +1101,7 @@
>  ;; not annul on false.
>  (define_delay (and (eq_attr "type" "branch")
>  		   (not (match_test "TARGET_MIPS16"))
> +		   (not (match_test "TARGET_FIX_R5900"))
>  		   (ior (match_test "TARGET_CB_NEVER")
>  			(and (eq_attr "compact_form" "maybe")
>  			     (not (match_test "TARGET_CB_ALWAYS")))
> 

 I think you need to modify the default `can_delay' attribute definition 
instead (in the same way).  An improved future version might determine the 
exact conditions as noted in your proposed commit description, however I'd 
suggest making this simple change first.

 HTH,

  Maciej



More information about the Gcc-patches mailing list