This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: SH optimized software floating point routines

From: Christian Bruel <christian dot bruel at st dot com>
To: Joern Rennecke <joern dot rennecke at embecosm dot com>
Cc: Kaz Kojima <kkojima at rr dot iij4u dot or dot jp>, "Naveen dot S at kpitcummins dot com" <Naveen dot S at kpitcummins dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, "Prafulla dot Thakare at kpitcummins dot com" <Prafulla dot Thakare at kpitcummins dot com>
Date: Fri, 23 Jul 2010 11:31:44 +0200
Subject: Re: SH optimized software floating point routines
References: <371569CBCFB2E745B891DBB88B2DFDDD19DC264F56@KCINPUNHJCMS01.kpit.com> <20100717092859.dkxjsdzg0okk8o4c-nzlynne@webmail.spamcop.net> <20100718205838.l36v4trp1c04gk88-nzlynne@webmail.spamcop.net> <20100720.223513.44767183.kkojima@rr.iij4u.or.jp> <4C46D911.4020508@st.com> <4C47F189.3050802@st.com> <4C484E64.70105@st.com> <20100722121431.e1bub50bk088o040-nzlynne@webmail.spamcop.net>

Joern Rennecke wrote:

> Quoting Christian Bruel <christian.bruel@st.com>:
>> Using the ieee-sf.S + this patch
>> OK
> Is this only a proof-of-concept, because you only change the ne[sd]f2 implementation?

I also changed the unordered comparison patterns (cmpunsf_i1, cmpundf_i1). But yes, the other functions that would need the same kind of check are unordsf2 and all the comparisons (gtsf2, gesf2, ...) for floats and doubles. I will only consider those after, and if, we all agree that this needs to be done instead of keeping the current QNaN-only restriction.

> And you go out of your way to only accept a restricted set of values.

This holds for the original optimized implementation as well; for example, I don't think that 0x7f800001 was caught. In fact, implementing the isnan check correctly, without a restricted set of values, makes the original discussion pointless, since the Q/S bit patterns are only a subset of all possible NaN encodings, which include any fraction != 0.
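To make the point about restricted value sets concrete, here is a minimal C sketch working on raw single-precision bit patterns. It is not the code from ieee-sf.S or from the patch; both function names are hypothetical, and the "restricted" test is only an assumed example of a check that looks at the top fraction bit (bit 22) alone, whichever of quiet or signaling that bit denotes on the target.

```c
#include <stdint.h>

/* Hypothetical restricted check (an assumption for illustration only):
   exponent all ones and the top fraction bit, bit 22, set.  */
int nan_top_fraction_bit_only(uint32_t bits)
{
    return (bits & 0x7fc00000u) == 0x7fc00000u;
}

/* Full check: exponent all ones and any nonzero fraction.  Shifting out
   the sign bit maps +-Inf to 0xff000000, so anything above that is NaN.  */
int nan_any_encoding(uint32_t bits)
{
    return (uint32_t)(bits << 1) > 0xff000000u;
}
```

With these definitions, 0x7f800001 is rejected by the restricted test but accepted by the full one, while +-Inf (0x7f800000, 0xff800000) is rejected by both.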

> Plus, the overuse of the arithmetic unit hurts SH4-100 / SH4-200 instruction pairing.

AFAICT you need only one cycle penalty, in the check_nan path:

GLOBAL(nesf2):
         /* If the raw values are unequal, the result is unequal, unless
            both values are +-zero.
            If the raw values are equal, the result is equal, unless
            the values are NaN.  */
         cmp/eq  r4,r5                   /* T = raw values identical */
         mov.l   LOCAL(inf2),r1
         bt/s    LOCAL(check_nan)
         mov     r4,r0                   /* delay slot */
         or      r5,r0
         rts
         add     r0,r0                   /* delay slot: zero only for a +-zero pair */
LOCAL(check_nan):
         add     r0,r0                   /* shift out the sign bit */
         cmp/hi  r1,r0                   /* T = above Inf, i.e. NaN */
         rts
         movt    r0                      /* delay slot: return T */
         .balign 4
LOCAL(inf2):
         .long 0xff000000
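For readers who do not have the SH instruction set at hand, the logic of the nesf2 sketch above can be modelled in C roughly as follows. This is only an illustrative model of the two paths; the function name nesf2_model is made up, and it takes the raw bit patterns rather than float arguments.

```c
#include <stdint.h>

/* C model of the nesf2 sketch: returns nonzero when a != b under IEEE
   single-precision rules, given the raw bit patterns of both operands.  */
uint32_t nesf2_model(uint32_t a, uint32_t b)
{
    if (a != b) {
        /* Different bits: unequal, unless both are +-0, in which case
           (a | b) << 1 drops the sign bits and yields 0.  */
        return (uint32_t)((a | b) << 1);
    }
    /* Identical bits (check_nan path): equal unless the value is NaN,
       i.e. unless the sign-stripped value is above that of Inf.  */
    return (uint32_t)(a << 1) > 0xff000000u;
}
```

The model returns 0 for +0 against -0 and for Inf against itself, and 1 for 0x7f800001 against itself.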

You could even save four bytes by putting the check_nan label into the delay slot, but I'm not sure if that'll discomfit any branch prediction mechanism.

Thanks a lot for this one. It should fix the original problem on the restricted set of values as well. The cmpund patterns fix should probably have similar checks.

Disclaimer: I've not tested this code.

For the DFmode case, what about NaNs denoted by the low word, e.g.
0x7ff00000 00000001?

If so, the DFmode code could become something like this:

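GLOBAL(nedf2):
         cmp/eq  DBL0L,DBL1L
         mov.l   LOCAL(inf2),r1
         bf      LOCAL(ne)               /* low words differ: unequal */
         cmp/eq  DBL0H,DBL1H
         bt/s    LOCAL(check_nan)
         mov     DBL0H,r0                /* delay slot */
         or      DBL1H,r0
         add     r0,r0                   /* shift out the sign bits */
         rts
         or      DBL0L,r0                /* delay slot: zero only for a +-zero pair */
LOCAL(check_nan):
         tst     DBL0L,DBL0L             /* T = low word is zero */
         add     r0,r0                   /* shift out the sign bit; T unchanged */
         subc    r1,r0                   /* T = borrow, set unless NaN */
         mov     #-1,r0
         rts
         negc    r0,r0                   /* delay slot: r0 = 1 - T */
LOCAL(ne):
         rts
         mov     #1,r0                   /* delay slot */
         .balign 4
LOCAL(inf2):
         .long 0xffe00000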
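The carry handling in the check_nan path of the DFmode sketch is easier to follow in C. The model below is an illustration only: nedf2_model is a made-up name, each double is passed as a high and a low 32-bit word, and the T bit is emulated with an ordinary variable.

```c
#include <stdint.h>

/* C model of the nedf2 sketch.  Returns nonzero when the operands
   (ah:al) and (bh:bl) compare unequal under IEEE double rules.  */
uint32_t nedf2_model(uint32_t ah, uint32_t al, uint32_t bh, uint32_t bl)
{
    if (al != bl)
        return 1;                          /* LOCAL(ne) path */
    if (ah != bh)
        /* Zero only when both operands are +-0.  */
        return (uint32_t)((ah | bh) << 1) | al;

    /* check_nan path: emulate tst / add / subc / negc.  */
    uint32_t t = (al == 0);                /* tst: T = low word is zero */
    uint64_t x = (uint32_t)(ah << 1);      /* add r0,r0: drop the sign bit */
    uint64_t rhs = (uint64_t)0xffe00000u + t;
    t = x < rhs;                           /* subc: T = borrow, set unless NaN */
    return 1u - t;                         /* mov #-1 / negc: 0 - (-1) - T */
}
```

Feeding the low-word NaN from the question above, 0x7ff00000 00000001, into both operands gives 1 (unequal), while Inf (0x7ff00000 00000000) against itself gives 0.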


> For an actual patch, you need to use the SL* macros from
> config/sh/lib1funcs.h because the SH1 does not have delayed branches.

OK, thanks

