This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Matthew Fortune <Matthew dot Fortune at imgtec dot com>
- Cc: "Joseph Myers \(joseph\ at codesourcery dot com\)" <joseph at codesourcery dot com>, "macro\ at codesourcery dot com" <macro at codesourcery dot com>, "Moore\, Catherine \(Catherine_Moore\ at mentor dot com\)" <Catherine_Moore at mentor dot com>, Rich Fuhler <Rich dot Fuhler at imgtec dot com>, "'gcc-patches\ at gcc dot gnu dot org' \(gcc-patches\ at gcc dot gnu dot org\)" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 21 May 2014 23:09:27 +0100
- Subject: Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
- Authentication-results: sourceware.org; auth=none
- References: <6D39441BF12EF246A7ABCE6654B023535233AE at LEMAIL01 dot le dot imgtec dot org> <87r43qlshb dot fsf at talisman dot default> <6D39441BF12EF246A7ABCE6654B02353539412 at LEMAIL01 dot le dot imgtec dot org>
Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
>> > *) Because GCC can be built to have mfpxx or mfp64 as the default option
>> > the ASM_SPEC has to handle these specially such that they are not
>> > passed in conjunction with -msingle-float. Depending on how all this
>> > option handling pans out then this may also need to account for
>> > msoft-float as well. It is an error to have -msoft-float and -mfp64 in
>> > the assembler.
>>
>> The assembler and GCC shouldn't treat the options differently though.
>> Either it should be OK for both or neither.
>
> I wasn't sure if you were applying this rule to options set by --with- at
> configure time as well as user supplied options. If I move this logic
> to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
> -msingle-float is given then that will fix the issue too. Is that OK?
--with-foo is just a way of setting the default -mfoo option when no
explicit -mfoo option is given. We shouldn't generally distinguish
between them.
>> > @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info,
>> const CUMULATIVE_ARGS *cum,
>> > || SCALAR_FLOAT_TYPE_P (type)
>> > || VECTOR_FLOAT_TYPE_P (type))
>> > && (GET_MODE_CLASS (mode) == MODE_FLOAT
>> > - || mode == V2SFmode)
>> > + || (TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode))
>> > && GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
>> > break;
>>
>> This looks odd. We shouldn't have V2SF values if there's no ISA support
>> for them.
>
> True. This is a safety measure against future vector support. I/We wish to
> completely restrict this ABI extension to the paired single-float
> hardware feature. This detail is easy to forget, I would like to keep
> the check but you get final call of course.
But the problem is that the change gives a specific behaviour for
!TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode, which at the moment
should be a nonsensical condition. I want to avoid defining an ABI
for something that shouldn't happen.
I'd be happy with an assert like:
gcc_assert (TARGET_PAIRED_SINGLE_FLOAT || mode != V2SFmode);
instead.
>> > @@ -5636,7 +5636,7 @@ mips_return_fpr_pair (enum machine_mode mode,
>> > {
>> > int inc;
>> >
>> > - inc = (TARGET_NEWABI ? 2 : MAX_FPRS_PER_FMT);
>> > + inc = ((TARGET_NEWABI || mips_abi == ABI_32) ? 2 : MAX_FPRS_PER_FMT);
>>
>> Formatting nit: no extra brackets here.
>
> Is this a no brackets at all case, or brackets on the condition? There are
> both styles in GCC but it is difficult to tell which is most common/correct.
Sorry, I meant just the new brackets.
>> > @@ -6508,13 +6508,27 @@ mips_output_64bit_xfer (char direction, unsigned
>> int gpreg, unsigned int fpreg)
>> > if (TARGET_64BIT)
>> > fprintf (asm_out_file, "\tdm%cc1\t%s,%s\n", direction,
>> > reg_names[gpreg], reg_names[fpreg]);
>> > - else if (TARGET_FLOAT64)
>> > + else if (ISA_HAS_MXHC1)
>> > {
>> > fprintf (asm_out_file, "\tm%cc1\t%s,%s\n", direction,
>> > reg_names[gpreg + TARGET_BIG_ENDIAN], reg_names[fpreg]);
>> > fprintf (asm_out_file, "\tm%chc1\t%s,%s\n", direction,
>> > reg_names[gpreg + TARGET_LITTLE_ENDIAN], reg_names[fpreg]);
>> > }
>> > + else if (TARGET_FLOATXX && direction == 't')
>> > + {
>> > + /* Use the argument save area to move via memory. */
>> > + fprintf (asm_out_file, "\tsw\t%s,0($sp)\n", reg_names[gpreg]);
>> > + fprintf (asm_out_file, "\tsw\t%s,4($sp)\n", reg_names[gpreg + 1]);
>> > + fprintf (asm_out_file, "\tldc1\t%s,0($sp)\n", reg_names[fpreg]);
>> > + }
>> > + else if (TARGET_FLOATXX && direction == 'f')
>> > + {
>> > + /* Use the argument save area to move via memory. */
>> > + fprintf (asm_out_file, "\tsdc1\t%s,0($sp)\n", reg_names[fpreg]);
>> > + fprintf (asm_out_file, "\tlw\t%s,0($sp)\n", reg_names[gpreg]);
>> > + fprintf (asm_out_file, "\tlw\t%s,4($sp)\n", reg_names[gpreg + 1]);
>> > + }
>>
>> The argument save area might be in use. E.g. if an argument register
>> gets spilled, we'll generally try to spill it to the save area rather
>> than create a new stack slot for it.
>>
>> This case should always be handled via SECONDARY_MEMORY_NEEDED.
>
> It is handled via SECONDARY_MEMORY_NEEDED for normal code generation. This
> function needs a better name! It is solely used as part of generating a
> mips16 stub and as such the argument save area is fair game as far as I can
> see, this code is essentially pre-prologue or post-epilogue of the callee.
>
> Shall I rename the function (and related functions which are solely
> applicable to mips16 stubs) to make it clear?
I think the name's OK. I'd just forgotten how the function was used.
So yeah, this looks good after all.
>> > @@ -12202,7 +12247,8 @@ mips_secondary_reload_class (enum reg_class
>> rclass,
>> > return NO_REGS;
>> >
>> > /* Otherwise, we need to reload through an integer register. */
>> > - return GR_REGS;
>> > + if (regno >= 0)
>> > + return GR_REGS;
>> > }
>> > if (FP_REG_P (regno))
>> > return reg_class_subset_p (rclass, GR_REGS) ? NO_REGS : GR_REGS;
>>
>> Why's this change needed? Although I assume it's dead code if you tested
>> against LRA.
>
> It is an optimisation actually (and this code is used by LRA). Without it
> We end up with reload registers for floating point values being reloaded
> via GR_REGS even though they can be stored to memory directly if a
> FP_REG was used. This results in extra moves between GP and FP. The test
> cases will fail if this code is removed (call-clobbered-3 and
> call-clobbered-5).
Ah, OK. In that case I think we should instead change:
if (MEM_P (x)
&& (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8))
/* In this case we can use lwc1, swc1, ldc1 or sdc1. We'll use
pairs of lwc1s and swc1s if ldc1 and sdc1 are not supported. */
return NO_REGS;
to:
if (MEM_P (x) || regno < 0)
{
/* In this case we can use combinations of LWC1, SWC1, LDC1 and
SDC1. */
gcc_checking_assert (GET_MODE_SIZE (mode) % 4 == 0);
return NO_REGS;
}
Please do it as a separate patch though.
>> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> > new file mode 100644
>> > index 0000000..f47cdda
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> > @@ -0,0 +1,21 @@
>> > +/* Check that we handle call-clobbered FPRs correctly. */
>> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
>> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -ffixed-f1 -
>> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -ffixed-f8
>> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -
>> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -ffixed-f20 -
>> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-f28 -ffixed-f30" } */
>> > +
>> > +void bar (void);
>> > +float a;
>> > +float
>> > +foo ()
>> > +{
>> > + float b = a + 1.0f;
>> > + bar();
>> > + return b;
>> > +}
>> > +/* { dg-final { scan-assembler-times "lwc1" 2 } } */
>> > +/* { dg-final { scan-assembler-not "swc1" } } */
>> > +/* { dg-final { scan-assembler-times "sdc1" 2 } } */
>> > +/* { dg-final { scan-assembler-not "mtc" } } */
>> > +/* { dg-final { scan-assembler-not "mfc" } } */
>> > +/* { dg-final { scan-assembler-not "mthc" } } */
>> > +/* { dg-final { scan-assembler-not "mfhc" } } */
>>
>> Why require sdc1 here? Would Chao-Ying's patch make this use SWC1 and LWC1
>> exclusively?
>
> The SDC1 instructions are for callee-saved registers. I'll add the
> check for two corresponding LDC1 instructions in the epilogue though
> since I've noticed them being missing.
I'm probably being dense, sorry, but why is SDC1 needed for -mfp32
-modd-spreg? Can't we just save and restore the odd register?
That's technically more correct too, since we shouldn't really touch
-ffixed registers.
>> > +#elif __mips_fpr == 0
>> > +#define MOVE_DF_BYTE0t sw $4, 0($29); sw $5, 4($29); ldc1 $f12, 0($29)
>> > +#define MOVE_DF_BYTE0f sdc1 $f12, 0($29); lw $4, 0($29); lw $5, 4($29)
>> > +#define MOVE_DF_BYTE0(D) MOVE_DF_BYTE0##D
>> > +#define MOVE_DF_BYTE8t sw $6, 8($29); sw $7, 12($29); ldc1 $f14, 8($29)
>> > +#define MOVE_DF_BYTE8f sdc1 $f14, 8($29); lw $6, 8($29); lw $7, 12($29)
>> > +#define MOVE_DF_BYTE8(D) MOVE_DF_BYTE8##D
>> > +#define MOVE_DF_RETt(T) sw $2, 0($29); sw $3, 4($29); DELAYt (T, ldc1
>> $f0, 0($29))
>> > +#define MOVE_DF_RETf(T) sdc1 $f0, 0($29); lw $2, 0($29); DELAYf (T, lw
>> $3, 4($29))
>> > +#define MOVE_DF_RET(D, T) MOVE_DF_RET##D(T)
>> > +#define MOVE_DC_RETt(T) sw $4, 8($29); sw $5, 12($29); ldc1 $f2, 8($29);
>> MOVE_DF_RETt(T)
>> > +#define MOVE_DC_RETf(T) sdc1 $f2, 8($29); lw $4, 8($29); lw $5, 12($29);
>> MOVE_DF_RETf(T)
>> > +#define MOVE_DC_RET(D, T) MOVE_DF_RET##D(T)
>> > #elif defined(__MIPSEB__)
>> > /* FPRs are little-endian. */
>> > #define MOVE_DF_BYTE0(D) m##D##c1 $4,$f13; m##D##c1 $5,$f12
>>
>> Is this always safe? CALL_STUB_RET in particular uses a special ABI
>> rather than the standard one, so I don't think we can assume that
>> it's safe to clobber the parameter save area.
>
> A fine question! I am 99% on the argument side. I will re-check the return
> side but I am not aware of anything which uses the argument save area as
> part of the return sequence.
I agree it's fine with the original order created by the epilogue expander.
I was just worried what later optimisations like sched2 might do.
But yeah, on reflection I was probably being overly paranoid. We handle
the return case like a normal call (contrary to what I thought) so it
wouldn't be valid for GCC to move uses of the save space across it.
Thanks,
Richard