This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop
- From: Andrew Bennett <Andrew dot Bennett at imgtec dot com>
- To: "Maciej W. Rozycki" <macro at codesourcery dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 18 Nov 2014 14:34:11 +0000
- Subject: RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop
- Authentication-results: sourceware.org; auth=none
- References: <0DA23CC379F5F945ACB41CF394B9827720F331A8 at LEMAIL01 dot le dot imgtec dot org> <alpine dot DEB dot 1 dot 10 dot 1411181345380 dot 2881 at tp dot orcam dot me dot uk>
> -----Original Message-----
> From: Maciej W. Rozycki [mailto:macro@codesourcery.com]
> Sent: 18 November 2014 13:48
> To: Andrew Bennett
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] If using branch likelies in MIPS sync code fill the delay
> slot with a nop
>
> On Tue, 18 Nov 2014, Andrew Bennett wrote:
>
> > The atomic-compare-exchange-3.c and atomic-op-3.c tests are failing when
> > using the -mfix-r10000 option. This is due to the fact that the delay
> > slot of the branch instruction that checks if the atomic operation was
> > not successful can be filled with an operation that returns the output
> > result. For the branch likely case this operation can not go in the
> > delay slot because it wont execute when the atomic operation was
> > successful and therefore the wrong result will be returned. This patch
> > forces a nop to be placed in the delay slot if a branch likely is used.
> > The fix is as simple as possible; it does cause a second nop to be
> introduced
> > after the branch instruction in the branch likely case. As this option is
> > rarely used, it makes more sense to keep the code readable than trying to
> > optimise it.
>
> Can you please be a bit more elaborate on how the wrong code sequence
> looks like, why it is produced like that and how your fix changes it?
> Without seeing examples of code generated it is very hard to imagine
> what is really going on here.
Ok if we compile the following cut-down atomic-op-3.c test case with
-mfix-r10000.
extern void abort(void);
int v, count, res;
int
main ()
{
v = 0;
count = 1;
if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1)
abort ();
return 0;
}
The following assembly code is produced for the atomic operation:
.set noat
sync # 15 sync_new_addsi/2 [length = 24]
1:
ll $3,0($4)
addu $1,$3,$2
sc $1,0($4)
beql $1,$0,1b
addu $3,$3,$2
sync
.set at
Notice here that the addu is in the delay slot of the branch likely instruction.
This is computing the value that exists in the memory location after the atomic
operation has completed. Because it is a branch likely instruction
this will only run when the atomic operation fails, and hence when it is
successful it will not return the correct value.
The offending code is in mips_process_sync_loop:
mips_multi_add_insn ("beq%?\t%0,%.,1b", at, NULL);
/* if (INSN1 != MOVE && INSN1 != LI) NEWVAL = $TMP3 [delay slot]. */
if (insn1 != SYNC_INSN1_MOVE && insn1 != SYNC_INSN1_LI && tmp3 != newval)
{
mips_multi_copy_insn (tmp3_insn);
mips_multi_set_operand (mips_multi_last_index (), 0, newval);
}
else if (!(required_oldval && cmp))
mips_multi_add_insn ("nop", NULL);
/* CMP = 1 -- either standalone or in a delay slot. */
if (required_oldval && cmp)
mips_multi_add_insn ("li\t%0,1", cmp, NULL);
For the branch likely case the delay slot could be filled either with the
operation to compute the value that exists in memory after the atomic operation
has completed; an operation to return if the atomic operation is successful
or not; or a nop. The first two operations should not be in the delay slot
of the branch if it is a branch likely because they will only run if the atomic
operation was unsuccessful.
My fix places a nop in the delay slot of the branch likely instruction
by using the %~ output operation. This then causes the sync code for the
previous example to be correct:
.set noat
sync # 15 sync_new_addsi/2 [length = 24]
1:
ll $3,0($4)
addu $1,$3,$2
sc $1,0($4)
beql $1,$0,1b
nop
addu $3,$3,$2
sync
.set at
Regards,
Andrew