This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFA: MIPS paired single support


Thanks for taking the time to address my comments.

James E Wilson <wilson@specifixinc.com> writes:
> On Fri, 2004-08-20 at 01:44, Richard Sandiford wrote:
>> I know that you're somewhat limited by gcc's infrastructure here, but
>> there's so much cut-&-paste in the patch, both in the .md patterns and
>> the C code itself.  Please try to reduce it if you can.  I've made some
>> specific suggestions below, but please look for other ways to cut it
>> down too.
>
> Most of the duplication is in the conditional patterns, where we have a
> separate UNSPEC code for every conditional operation.

Yes, the UNSPECs are part of it, but by "cut-&-paste" I was also
thinking of things like:

   MIPS_BUILTIN_ANY_C_F_PS,
   MIPS_BUILTIN_UPPER_C_F_PS,
   MIPS_BUILTIN_LOWER_C_F_PS,
   MIPS_BUILTIN_ALL_C_F_PS,
   MIPS_BUILTIN_ANY_C_UN_PS,
   MIPS_BUILTIN_UPPER_C_UN_PS,
   MIPS_BUILTIN_LOWER_C_UN_PS,
   MIPS_BUILTIN_ALL_C_UN_PS,
   [...]

since I refuse to believe you (well, OK, MIPS) typed in each line
individually ;).  It would certainly be possible to make things like
this more readable using preprocessor macros.  But as I said later,
my main suggestion is to get rid of this particular enum anyway...

> We could divide the number of patterns by 8 or so if we can put the
> comparison code inside the UNSPEC instead of encoding it into the
> UNSPEC number.  I will try taking a look at this, but this will
> probably be the last thing I try to work on.

Don't feel you need to.  I'll try to deal with the 8->1 fold myself
when macroising the file.

>> So do your vec_merge patterns accurately describe the operation for
>> both big and little endian?
>
> I believe it is currently impossible to write a vec_merge pattern that
> is correct for both big and little endian.  I am willing to work on
> this, as I have already worked on the other vector operators, but I
> think forcing me to fix this before it goes in is an unacceptable
> burden.  For the moment, the mips port is as correct as any other port,
> as all ports (except maybe the SH) have broken vector support when it
> comes to endianness.

Sorry, but I don't like this at all.  One of your main reasons for using
things like vec_merge rather than UNSPECs was that it was possible to
describe the operation accurately with standard rtl codes.  If you're
saying that it _isn't_ possible to describe some of the patterns accurately
with standard rtl codes, please use UNSPECs for those patterns until the
situation changes.

>> We're running out of uppercase EXTRA_CONSTRAINTS.  Perhaps it would
>> be a good idea to define a two-letter constraint instead?  E.g. "YG".
>> Then we can reuse the "Y" prefix for other constants (vector or
>> otherwise) that might be needed in future.
>
> Or maybe something like GV, to indicate that this is a FP zero vector
> constant?  I will work on this.

I still prefer 'YG'.  I'd rather not redefine an existing constraint
like 'G' to be "sometimes one-letter, sometimes two-letter".  It would
just get too confusing (for me, anyway ;).  The advantage of commandeering
'Y' is that the MIPS port doesn't use it all right now, so can require every
'Y' constraint to be a multi-letter one.

>> A lot of the standard V2SF patterns are obviously the same as the
>> SFmode/DFmode ones, but with mechanical changes.  Perhaps it would
>> be better to keep them together?  If the macro patch ever does go in,
>> I'd imagine that they would be combined into a single template.
>
> OK.  I can re-sort the simple arith/move patterns into the mips.md
> file.  I noticed in your mips.md <mode> patches that you were not
> requiring me to add <mode> support to the vector patch,

Right.

> so I am not planning on trying at the moment.

No problem.

> One thing that occurs to me is that you asked me to drop the V2SF mode
> attribute after version 1 of the patch.  However, since the mode
> attribute now comes from <mode> that means we either need to re-add
> the V2SF mode attribute, or else add yet another define_mode_macro
> which will map V2SFmode to "V2SF" add all of the other modes to
> themselves.

I assume you mean <MODE> (for the "mode" attribute) here.  Like I say,
I'll deal with this myself.

And "yet another" seem a bit uncalled-for.  If we add a mode macro for
SF/DF/V2SF, we'll need a mode attribute for the fmt (s, d, ps) and,
yes, an attribute for the (set_attr "mode" ...).  I still claim that
this is neater than copying and pasting each arithmetic pattern and
making the _exact same_ changes by hand.

>> More importantly, the type of the fourth alternative is wrong for m<-Y,
>> which is a GPR store, not an FPR store.
>
> This isn't a bug in the vector patch.

Yes it is ;)  It's a new pattern, ergo a new bug.

> This was copied from the movsf/movdf patterns in the mips.md file
> which have the exact same bug.  I see 3 patterns in the mips.md file
> that are broken,
>
> This is however my bug.  I added it when I split store into store and
> fpstore.  Maybe I can fix this with a separate patch?  I don't want to
> make the vector patch any bigger than it already is.

Feel free to deal with mips.md bit separately, but I don't see any
reason why the vector pattern shouldn't be right from the word go.
Yes, it'll make the patch bigger, but only by a few characters.
(Same number of lines. ;)

>> + (define_insn "*ldxc1v2sf"
>> +   [(set_attr "type"	"fpidxload")
>> +    (set_attr "mode"	"SF")
>> +    (set_attr "length"   "4")])
>> Strange formatting of set_attrs.  I'd like to move away from having
>> tabulated attributes, so I'd be happy with a single space between
>> the name and value.
>
> Again, this was just copied from the mips.md file.

Sorry, ignore this.  I was confused by the tabbing/space difference
between the "type"/"mode" and "length" lines.

>> If you're going to give descriptive names to the the patterns (which I
>> agree is a good thing), please do it for both the Pmode == SImode and
>> Pmode == DImode versions, not just the former.
>
> This is a style issue.  If a pattern doesn't have a name, then the gen*
> programs will create one by using the name of the previous pattern with
> a name, and add an offset.  Thus the SImode ldxc1 pattern is "ldxc1v2sf"
> and the DImode one is "ldxc1v2sf+1".

Not in recent versions of gcc.  Unnamed define_insns get named after the
file and line number of the pattern definition (see read-rtl.c).  "+n" names
are only used for things like define_splits, which have no named at all at
the rtl level.

This means that if you have an SImode pattern "*foo" and a DImode
pattern with no name, the SImode version will appear as "*foo" and the
DImode one will appear as "*mips.md:line".  So although I haven't tried
building your patch...

> My goal here wasn't to add a name to every pattern, but just to add
> enough names so that every pattern would end up with a sensible
> looking name.  This makes the rtl output more readable, and also makes
> the ChangeLog entries more readable.  Thus there is no need for a name
> on the DImode ldxc1v2sf pattern, as ldxc1c2sf+1 is a perfectly fine
> name for it.

...I doubt that you'll find the DImode pattern is actually named
ldxc1c2sf+1 at all.

> If you want separate names, then I can certainly do that.  Done.

Thanks.

>> + (define_insn "mips_pul_ps"
>>...
>> +   [(set_attr "type"	"fadd")
>> Why fmadd?  This seems more like an fmove insn to me.  Same with
>> the vecinit patterns, among others.
>
> I used fmove in my patch.  MIPS used fadd in their patch.  I don't know
> why, but it didn't seem important.  At the moment, the only target for
> which I have looked at timing info is SB-1, and for that target, it
> doesn't make a difference whether we use fmove or fadd.  The timing is
> the same.  So at this point, I think either choice is arbitrary, and can
> be fixed later if necessary when we have more info.
>
> For the moment, I am willing to use the arbitrary choice that mips
> used.  If you want a different arbitrary choice, let me know.

Yes ;).  Please only use "fadd" for operations that actually do addition.
Please use "fmove" for instructions that move things about and "fcvt" for
instructions that perform some kind of numerical conversion.

> Part of the reason why I used explicit INTVAL checks is that this avoids
> the need to create yet another constraint letter.  We have a constraint
> letter for zero, but not for 1.  I see the uses of const_1_operand in
> the mips.md file use empty constraint strings.  I will try that.

Thanks.  Should work.

>> + ;; ??? This is only generated if we perform a vector operation that has to be
>> + ;; emulated.  There is no other way to get a vector mode bitfield extract
>> + ;; currently.
>> Don't understand the comment.  Could you provide a few more details?
>
> Jan Hubicka added vec_init/vec_set/vec_extract patterns in January. 
> While a good idea, he didn't actually add good testcases for them, or
> much support for generating them.  vec_init is the easy one.  If you
> write
>   a = (v2sf) { a, b};
[...snipped...]
> Someday it may be useful, so we want to define it anyways.

Thanks for the explanation.  I still disagree with the conclusion though.
There doesn't seem to be any way to test whether the pattern actually works,
and there's no guarantee that any future extensions will want to use the
pattern in its current form.  I object to adding dead code.

On the other hand, I can certainly compromise on having a commented-out
pattern that says what we think the pattern would look like.  Would that
be OK with you?

>> Similarly here.  Given that we _don't_ disable the vec_init pattern,
>> does the comment mean that we never actually use vec_setv2sf?  If so,
>> please don't add it ;)
>
> But someday someone will improve gcc to the point where vec_set does get
> used, and then the mips port will be broken and we will never know it. 
> It is better to define it now, while we know it is needed, rather than
> to wait until we have an invisible bug.  Defining it does no harm, and
> will ensure that we benefit from future gcc improvements.

Same comment applies here.

>> Wouldn't it be better to use a define_insn_and_split that creates two
>> mips_c_f_ps patterns after reload?  It should allow better scheduling
>> and (much more minor) means you don't need to add the '%V' or '%v'
>> operand formats.
>
> FYI this is MIPS code not my code.

Well, yes, but you're asking for the patch to be applied, so I don't
think it's unreasonable to ask you to justify it.

> I looked at this, and didn't see a convincing reason to split them. 
> This pattern is added for an obscure reason.  The MIPS-3D ASE has a
> "branch on 4 FP condition codes" instruction, but there is no
> instruction that sets 4 FP condition codes.  So we cheat and add 4-wide
> compare patterns that expand to 2 ps instructions.  I was concerned
> about creating additional patterns for something that seemed obscure.  I
> was also concerned about getting the register allocation right.  The
> split after reload trick didn't occur to me at the time, but I didn't
> really spend much time looking at this.
>
> Maybe I can look at fixing this after the patch gets accepted?

I'd really rather see this in the first drop.

>> + (define_insn "mips_recip1_s"
>> Wouldn't the new frdiv be more appropriate than frsqrt?
>
> These are actually very different instructions from the reciprocal
> divide instructions.  These are reciprocal divide step.  I would argue
> that frdiv is just as wrong as frsqrt, but either way, it is a pretty
> arbitrary choice at this point.  For SB-1, recip divide is 12 cycles,
> but recip divide step 1 is 4 cycles and recip divide step 2 is 8 cycles,
> so we actually need two new type attributes to get this right.  I didn't
> think this extra complication needed to be part of part of the initial
> vector patch, which is already too big and too complicated.

Disagree ;)  It's easy to add extra "type" attributes.  The sb1.md
change can be separate if you like...

>> ! 	info->fpr_p = (named && (type == 0 || FLOAT_TYPE_P (type)
>> ! 				 || TYPE_MODE (type) == V2SFmode));
>> are exactly what I'd like to avoid.  Please make the o32/o64/n32/n64
>> cases depend on TYPE where given.
>
> I find this ambiguous.  Exactly what are you trying to avoid here?  This
> is depending on the TYPE, specifically, it is depending on TYPE_MODE
> (type). If you are objecting to the use of a mode check, then I don't
> understand that either, because the new code also has mode checks.  In
> any case, this gets rewritten not only because of the mips ABI patch but
> also because of my tree.h patch to fix folding of vector floats. 
> FLOAT_TYPE_P now includes vector types, so the check above is now
> unnecessary.  Also, we have a new VECTOR_FLOAT_TYPE_P that I can use
> instead of the TYPE_MODE check.  That should address your concerns.

It does, thanks.

You say you don't understand why I don't like type TYPE_MODE check,
but these two conditions are definitely not the same:

    TYPE_MODE (type) == V2SFmode
    VECTOR_FLOAT_TYPE_P (type)

Or at least, the equivalents for plain floats aren't, and I assume
the same is true of vectors too.  We've already been bitten by code
that checked TYPE_MODE rather than TYPE_CODE, and consequently treated
things like "struct { float f; }" in the same way as "float f" when the
ABIs says they should be handled differently.

>> +   else if (letter == 'Y')
>> +     {
>> +       register int regnum;
>>...
>> +     }
>> For the love of God, please don't add new instances of "register" ;)
>> This is one of the cut-&-paste bits I was complaining about above.
>
> Well, the "register" was there in the original code...

Yes, but this is new code ;)

> I could pull out these 6 sections of code into a small subroutine. 
> Would that be OK.

Sure would, thanks.

> This will make the code a little slower because of extra overhead, but
> will eliminate the duplication.

"Premature optimisation is the root of all evil" ;)

Besides, gcc can now inline small static functions automatically, so I
refuse to believe that the performance is an issue here.

>> Do we really need the MIPS_BUILTIN_ enum?  Why not just use the mips_bdesc[]
>> index as the builtin number, use that to look up the CODE_FOR_*, and key off
>> the CODE_FOR_* instead?
>> 
>> If that isn't possible for some reason, how about putting the definitions
>> in a *.def file that can be read both here and in mips.h?  Then there will
>> be less chance of them getting out of sync.
>
> This is more code from MIPS, not from me.  It looks like the point of
> keeping them in the same order is to avoid the search loops that other
> ports use.  See for instance the rs6000_expand_builtin function which
> does a linear search of the bdesc table to find the right entry.  The
> mips port just does a table lookup.  If you really want to avoid the
> ordering problem, them I will add in the loops to do a linear search.  I
> think the .def file idea is unnecessary busy work.

You didn't seem to answer my main suggestion though, which is that the
MIPS_BUILTIN_s shouldn't be needed.  It's just an ordinary 0-based
enumeration.  For example, instead of:

  switch (fcode)
    {
    /* Two Operands.  */
    case MIPS_BUILTIN_PLL_PS:
    case MIPS_BUILTIN_PUL_PS:
    ...

why not just use:

  switch (mips_bdesc[fcode].code)
    {
    /* Two Operands.  */
    case CODE_FOR_pll_ps:
    case CODE_FOR_pul_ps:
    ...

(although, like I say, I'd rather see this particular use of "code"
depend on "ftype" instead.)

OK, it might mean a few checks against ARRAY_SIZE (mips_bdesc), but
it seems a lot neater than a big enumeration.

>> Rather than have such a humungous switch statement, I'd prefer to see
>> the expansion depend on ftype, even if that means splitting ftype up
>> a bit.
>
> Added to my list.  This probably requires adding lots of linear search
> loops such as is done in the rs6000 file.

I still don't see why you can't use:

  switch (mips_bdesc[fcode].ftype)

>>         if (TARGET_SINGLE_FLOAT)					\
>> ! 	builtin_define ("__mips_single_float");			\
>> !       else if (TARGET_PAIRED_SINGLE_FLOAT)			\
>> ! 	builtin_define ("__mips_paired_single_float");		\
>> No need for the "else".
>
> I added the else because it makes the code marginally faster.

"Premature optimisation is the root of all evil" ;)

> It isn't possible for both to be true, so there is no point in testing
> TARGET_PAIRED_SINGLE_FLOAT if we know that TARGET_SINGLE_FLOAT is true. 

Why not?  Like I said before when objecting to the TARGET_DOUBLE_FLOAT
checks, it's entirely conceivable that some future arch will only have
hardware support for single precision ops, but will be able to handle
paired single-precision values.

> But since you don't like it, it is gone.

Thanks.

>> + @itemx -mpaired-single
>> + @opindex mpaired-single
>> + Enable support for the paired single instructions.
>> 
>> Not very helpful as documentation goes ;)  Please at least say what
>> the user can expect wrt the vector extensions.  There'll also need
>> to be some documentation of the builtin functions themselves.
>
> There is a separate section of the gcc manual that documents the vector
> support.  This is in the extend.texi file.
>
> MIPS already has documentation on the builtin functions.  I think it is
> a reasonable convention to rely on the processor vendor to specify these
> things.  This is what we do with x86 builtin functions for instance.

Disagree.  We're talking gcc builtins here, and gcc-specific vector
extensions such as "attribute ((mode (v2sf)))".

I'm not expecting you to document what the MIPS .ps instructions do,
redocument GCC's target-independent vector support, or anything like that.
But I think it's reasonable to expect at least _some_ indication of what
(in gcc terms) "support for paired single instructions" means.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]