This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH, ARM] Subregs of VFP registers in big-endian mode


On Sat, Oct 20, 2012 at 4:38 AM, Julian Brown <julian@codesourcery.com> wrote:
> Hi,
>
> Quite a few tests fail for big-endian multilibs which use VFP
> instructions at present. One reason for many of these is glaringly
> obvious once you notice it: for D registers interpreted as two S
> registers, the lower-numbered register is always the less-significant
> part of the value, and the higher-numbered register the
> more-significant -- regardless of the endianness the processor is
> running in.
>
> However, for big-endian mode, when DFmode values are represented in
> memory (or indeed core registers), the opposite is true. So, a subreg
> expression such as the following will work fine on core registers (or
> e.g. pseudos assigned to stack slots):
>
> (subreg:SI (reg:DF) 0)
>
> but, when applied to a VFP register Dn, it should be resolved to the
> hard register S(n*2+1). At present though, it resolves to S(n*2) -- i.e.
> the wrong half of the value (for WORDS_BIG_ENDIAN, such a subreg should
> be the most-significant part of the value). For the relatively few cases
> where DFmode values are interpreted as a pair of (integer) words, this
> means that wrong code is generated.
>
> My feeling is that implementing a "proper" solution to this problem is
> probably impractical -- the closest existing macros to control
> behaviour aren't sufficient for this case:
>
> * FLOAT_WORDS_BIG_ENDIAN only refers to memory layout, which is correct
>   as it is.
>
> * REG_WORDS_BIG_ENDIAN controls whether values are stored in big-endian
>   order in registers, but refers to *all* registers. We only want to
>   change the behaviour for the VFP registers. Defining a new macro
>   FLOAT_REG_WORDS_BIG_ENDIAN wouldn't do, because the behaviour would
>   differ depending on the hard register under observation: that seems
>   like too much to ask of generic machinery in the middle-end.
>
> So, the attached patch just avoids the problem, by pretending that
> greater-than-word-size values in VFP registers, in big-endian mode, are
> opaque and cannot be subreg'ed. In practice, for at least the test case
> I looked at, this isn't as much of a pessimisation as you might expect
> -- the value in question might already be stored in core registers
> (e.g. for function arguments with -mfloat-abi=softfp), so can be
> retrieved directly from those rather than via memory.
>
> This is the testsuite delta for current FSF mainline, with multilibs
> adjusted to build for little/big-endian, and using options
> "-mbig-endian -mfloat-abi=softfp -mfpu=vfpv3" for testing:
>
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O1  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O3 -fomit-frame-pointer  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -O3 -g  execution test
> FAIL -> PASS: be-code-on-qemu/g++.sum:g++.dg/torture/type-generic-1.C  -Os  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/copysign1.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/ieee/mzero6.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr35456.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O1
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O3 -fomit-frame-pointer
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -O3 -g
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -Og -g
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.c-torture/execute/pr44683.c execution,  -Os
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O1  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O3 -fomit-frame-pointer  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -O3 -g  execution test
> FAIL -> PASS: be-code-on-qemu/gcc.sum:gcc.dg/torture/type-generic-1.c  -Os  execution test
>
> OK for mainline, or any comments? (I've included the multilib tweaks I
> used in the attached patch for reference, though I'm not proposing to
> apply those.)

I also tested this on GCC 4.7.0 with armeb-linux-gnueabi defaulting to
the hard-float ABI, and it fixes a lot of failures there too.

Thanks,
Andrew Pinski


>
> Thanks,
>
> Julian
>
> ChangeLog
>
>     gcc/
>     * config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Avoid subreg'ing
>     VFP D registers in big-endian mode.

