This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
RE: Help w/ PR61538?
- From: Matthew Fortune <Matthew dot Fortune at imgtec dot com>
- To: Joshua Kinard <kumba at gentoo dot org>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Mon, 28 Jul 2014 08:41:50 +0000
- Subject: RE: Help w/ PR61538?
- References: <53B8C5F1 dot 9060005 at gentoo dot org> <53D5CA09 dot 7020604 at gentoo dot org>
Hi Joshua,
I know very little about this area but I'll try and offer some advice anyway...
> On 07/05/2014 23:43, Joshua Kinard wrote:
> > Hi,
> >
> > I filed PR61538 about two weeks ago, regarding gcc-4.8.x and up not
> > compiling a g++/pthreads-linked app correctly on SGI R1x000-based systems
> > (Octane, Onyx2), running Linux. Running the subsequently-compiled
> > application simply hangs in a futex syscall until terminated via Ctrl+C.
> > I suspect it's a double-locking bug of some design, as evidenced by strace
> > showing two consecutive syscall()'s w/ 0x108e passed as the syscall #
> > (4238 or futex on o32 MIPS), but I am stumped as to what else I can do to
> > debug it and help fix it.
> >
> [snip]
> > Full details:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538
>
> So I've spent the last few weeks bisecting the gcc tree, and I've narrowed
> down the set of commits that appear to have introduced this problem:
>
> 1. 39a8c5eaded1e5771a941c56a49ca0a5e9c5eca0 * config/mips/mips.c
> (mips_emit_pre_atomic_barrier_p,)
This is the prime candidate for introducing the issue.
> 2. 974f0a74e2116143b88d8cea8e1dd5a9c18ef96c * config/mips/constraints.md
> (ZR): New constraint.
Unlikely
> 3. 0f8e46b16a53c02d7255dcd6b6e9b5bc7f8ec953 * config/mips/mips.c
> (mips_process_sync_loop): Emit cmp result only if
Possible, but still unlikely.
> 4. 30c3c4427521f96fb58b6e1debb86da4f113f06f * emit-rtl.c
> (need_atomic_barrier_p): New function.
Seems unlikely
>
> There's a build failure somewhere in the middle of there that is blocking me
> from figuring out which specific one is the cause, but they all appear to be
> related anyways. All four were added on 2012-06-20.
>
> When I took a git checkout from 2012-06-26 and reverted those four commits,
> I was able to compile glibc-2.19 and get a working "sln" binary. I am
> unable to easily test the C++ side because I built the checkouts in my
> $HOME, and it's too risky to try and shoehorn one of them in as the system
> compiler. However, I think the C++ issue is also fixed by reverting the
> four, as that also involved hanging in Linux futex syscalls.
Here is a wild guess at the problem. I think the R10000 workaround, which uses
branch-likely instructions in place of ordinary delay-slot branches, ends up
annulling an instruction that certain atomic operations require: a
branch-likely instruction executes its delay slot only when the branch is
taken. This is an entirely untested theory (and patch), but can you see if
this fixes the issue you are seeing:
@@ -13014,7 +13023,8 @@ mips_process_sync_loop (rtx insn, rtx *operands)
       mips_multi_copy_insn (tmp3_insn);
       mips_multi_set_operand (mips_multi_last_index (), 0, newval);
     }
-  else if (!(required_oldval && cmp))
+  else if (!(required_oldval && cmp)
+           || mips_branch_likely)
     mips_multi_add_insn ("nop", NULL);
   /* CMP = 1 -- either standalone or in a delay slot. */
I suspect I can weave that in more naturally, but can you tell me first
whether it fixes the problem?
Regards,
Matthew