This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg
- From: Matthew Fortune <Matthew dot Fortune at imgtec dot com>
- To: Eric Botcazou <ebotcazou at adacore dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Vladimir Makarov <vmakarov at redhat dot com>, Robert Suchanek <Robert dot Suchanek at imgtec dot com>, "Moore, Catherine (Catherine_Moore at mentor dot com)" <Catherine_Moore at mentor dot com>
- Date: Mon, 20 Feb 2017 10:02:52 +0000
- Subject: RE: [PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg
- Authentication-results: sourceware.org; auth=none
- References: <6D39441BF12EF246A7ABCE6654B0235380B5CE8C@hhmail02.hh.imgtec.org> <1751039.zrf2WNPc4a@polaris>
Hi Eric,
Any thoughts on this?
Thanks,
Matthew
> Sorry for the slow reply, been away for a few days....
>
> Eric Botcazou <ebotcazou@adacore.com> writes:
> > > This patch is a minimal change to prevent (subreg(mem)) from being
> > > simplified to use the outer mode for WORD_REGISTER_OPERATIONS.
> > > There is high probability of refining and/or re-implementing this
> > > for GCC 8 but such a change would be too invasive. This change at
> > > least ensures correctness but may prevent simplification of some
> acceptable cases.
> >
> > This one causes:
> >
> > +FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -fomit-frame-pointer -
> > funroll-
> > loops -fpeel-loops -ftracer -finline-functions (test for excess
> > errors)
> > +WARNING: gcc.dg/torture/builtin-complex-1.c -O3 -fomit-frame-
> pointer
> > -
> > funroll-loops -fpeel-loops -ftracer -finline-functions compilation
> > failed to produce executable
> > +FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -g (test for excess
> > errors)
> > +WARNING: gcc.dg/torture/builtin-complex-1.c -O3 -g compilation
> > failed to
> > produce executable
> > +WARNING: program timed out.
> > +WARNING: program timed out.
> >
> > on SPARC 32-bit, i.e. LRA hangs. Reduced testcase attached, compile
> > at
> > -O3 with a cc1 configured for sparc-sun-solaris2.10.
> >
> > > gcc/
> > > PR target/78660
> > > * lra-constraints.c (simplify_operand_subreg): Handle
> > > WORD_REGISTER_OPERATIONS targets.
> > > ---
> > > gcc/lra-constraints.c | 17 ++++++++++++-----
> > > 1 file changed, 12 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index
> > > 66ff2bb..484a70d 100644
> > > --- a/gcc/lra-constraints.c
> > > +++ b/gcc/lra-constraints.c
> > > @@ -1541,11 +1541,18 @@ simplify_operand_subreg (int nop,
> > > machine_mode
> > > reg_mode) subregs as we don't substitute such equiv memory (see
> > > processing equivalences in function lra_constraints) and because for
> > > spilled pseudos we allocate stack memory enough for the biggest
> > > - corresponding paradoxical subreg. */
> > > - if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > > - && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > > - || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > > - && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg))))
> > > + corresponding paradoxical subreg.
> > > +
> > > + However, never simplify a (subreg (mem ...)) for
> > > + WORD_REGISTER_OPERATIONS targets as this may lead to loading
> > junk
> > > + data into a register when the inner is narrower than outer or
> > > + missing important data from memory when the inner is wider
> > than
> > > + outer. */
> > > + if (!WORD_REGISTER_OPERATIONS
> > > + && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > > + && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > > + || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > > + && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN
> > (reg)))))
> > > return true;
> > >
> > > *curr_id->operand_loc[nop] = operand;
> >
> > I think that we might need:
> >
> > if (!(GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (innermode)
> > && WORD_REGISTER_OPERATIONS)
> > && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)))))
> > return true;
> >
> > i.e. we force the reloading only for paradoxical subregs.
>
> Maybe, though I'm not entirely convinced. For WORD_REGISTER_OPERATIONS
> both paradoxical and normal SUBREGs can be a problem as the inner mode
> in both cases can be used elsewhere for a reload making the content of
> the spill slot wrong either in the subreg reload or the ordinary reload
> elsewhere. However, the condition can be tightened; I should not have
> made it so simplistic I guess. I.e. both modes must be no wider than a
> word and must be different precision to force an inner reload.
>
> Adding that check would fix this case for SPARC and should be fine for
> MIPS but I need to wait for a bootstrap build to be sure.
>
> I don't really understand why LRA can't reload this for SPARC though as
> I'm not sure there is any guarantee provided to backends that some
> SUBREGs will be reloaded using their outer mode. If there is such a
> guarantee then it would be much easier to reason about this logic but as
> it stands I suspect we are having to tweak LRA to cope with assumptions
> made in various targets that happen to have held true (and I have no
> doubt that MIPS has some of these as well especially in terms of the
> FP/GP register usage with float and int modes.) All being well we can
> capture such assumptions and formalise them so we ensure they hold true
> (or modify backends appropriately I guess).
>
> The condition would look like this, What do you think?
>
> if (!(GET_MODE_PRECISION (mode) != GET_MODE_PRECISION
> (innermode)
> && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
> && GET_MODE_SIZE (innermode) <= UNITS_PER_WORD
> && WORD_REGISTER_OPERATIONS)
> && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN
> (reg)))))
> return true;
>
> Thanks,
> Matthew