[PATCH] loading float member of parameter stored via int registers
Jiufu Guo
guojiufu@linux.ibm.com
Tue Jan 3 03:28:38 GMT 2023
Hi,
Andrew Pinski <pinskia@gmail.com> writes:
> On Thu, Dec 29, 2022 at 11:45 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>>
>> Hi!
>>
>> On Fri, Dec 30, 2022 at 10:22:31AM +0800, Jiufu Guo wrote:
>> > Considering the limitations of CSE, I try to find other places
>> > to handle this issue, and notice DSE can optimize below code:
>> > "[sfp:DI]=x:DI ; y:SI=[sfp:DI]" to "y:SI=x:DI#0".
>> >
>> > So, I drafted a patch to update DSE to handle DI->DF/SF.
>> > The patch updates "extract_low_bits" to get mode change
>> > with subreg.
>> >
>> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
>> > index b12b0e000c2..5e36331082c 100644
>> > --- a/gcc/expmed.cc
>> > +++ b/gcc/expmed.cc
>> > @@ -2439,7 +2439,10 @@ extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>> >
>> > if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>> > return NULL_RTX;
>> > - if (!targetm.modes_tieable_p (int_mode, mode))
>> > + if (!targetm.modes_tieable_p (int_mode, mode)
>> > + && !(known_le (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
>> > + && GET_MODE_CLASS (mode) == MODE_FLOAT
>> > + && GET_MODE_CLASS (src_mode) == MODE_INT))
>> > return NULL_RTX;
>> >
>> > src = gen_lowpart (src_int_mode, src);
>>
>> Ah! This simply shows rs6000_modes_tieable_p is decidedly non-optimal:
>> it does not allow tying a scalar float to anything else. No such thing
>> is required, or good apparently. I wonder why we have such restrictions
>> at all in rs6000; is it just unfortunate history, was it good at one
>> point in time?
>
> The documentation for TARGET_MODES_TIEABLE_P says the following:
> If TARGET_HARD_REGNO_MODE_OK (r, mode1) and TARGET_HARD_REGNO_MODE_OK
> (r, mode2) are always the same for any r, then TARGET_MODES_TIEABLE_P
> (mode1, mode2) should be true. If they differ for any r, you should
> define this hook to return false unless some other mechanism ensures
> the accessibility of the value in a narrower mode.
>
> even though rs6000_hard_regno_mode_ok_uncached's comment has the following:
> /* The float registers (except for VSX vector modes) can only hold floating
> modes and DImode. */
>
> TARGET_P8_VECTOR and TARGET_P9_VECTOR has special cased different modes now:
> if (TARGET_P8_VECTOR && (mode == SImode))
> return 1;
>
> if (TARGET_P9_VECTOR && (mode == QImode || mode == HImode))
> return 1;
> Which I suspect that means rs6000_modes_tieable_p should return true
> for SImode and SFmode if TARGET_P8_VECTOR is true. Likewise for
> TARGET_P9_VECTOR and SFmode and QImode/HImode too.
>
Thanks for your great comments!

modes_tieable_p is called from several places besides extract_low_bits,
so relaxing the restriction in this hook may benefit more passes.
We could extend modes_tieable_p to cover as many cases as possible.
A hacked patch for "float vs. int" is at the end of this mail.
Back to the issue in this PR, optimizing a float load of a value that
was stored from an int register: DSE works mostly within a single basic
block, so updating modes_tieable_p (or extract_low_bits) cannot handle
cases like:
double __attribute__ ((noipa)) foo_df (DF arg, int flag)
{
  if (flag == 2)
    return arg.a[3];
  return 0.0;
}
I'm thinking about a way to handle this case.
BR,
Jeff (Jiufu)
>
> Thanks,
> Andrew Pinski
>
>>
>>
>> Segher
(To be refined.)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..8088a608be6 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1959,6 +1959,17 @@ rs6000_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 static bool
 rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
+
+  if ((GET_MODE_CLASS (mode1) == MODE_FLOAT
+       && (GET_MODE_SIZE (mode2) == UNITS_PER_FP_WORD
+           || (TARGET_P8_VECTOR && (mode2 == SImode))
+           || (TARGET_P9_VECTOR && (mode2 == QImode || mode2 == HImode))))
+      || (GET_MODE_CLASS (mode2) == MODE_FLOAT
+          && (GET_MODE_SIZE (mode1) == UNITS_PER_FP_WORD
+              || (TARGET_P8_VECTOR && (mode1 == SImode))
+              || (TARGET_P9_VECTOR && (mode1 == QImode || mode1 == HImode)))))
+    return true;
+
   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
       || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
     return mode1 == mode2;