This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [Patch-86512]: Subnormal float support in armv7(with -msoft-float) for intrinsics
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: Nicolas Pitre <nico at fluxnic dot net>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Umesh Kalappa <umesh dot kalappa0 at gmail dot com>
- Cc: Richard Earnshaw <Richard dot Earnshaw at arm dot com>, nd <nd at arm dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
- Date: Fri, 27 Jul 2018 13:32:26 +0000
- Subject: Re: [Patch-86512]: Subnormal float support in armv7(with -msoft-float) for intrinsics
- References: <CAGfacvSoQyq-9PGFmEntmH5QHSMc2HpYsoJTSf1eAMPgZZK_EQ@mail.gmail.com>,<nycvar.YSQ.7.76.1807261213390.2327@knanqh.ubzr>
Hi Nicolas,
I think your patch doesn't quite work as expected:
@@ -238,9 +238,10 @@ LSYM(Lad_a):
 	movs	ip, ip, lsl #1
 	adcs	xl, xl, xl
 	adc	xh, xh, xh
-	tst	xh, #0x00100000
-	sub	r4, r4, #1
-	bne	LSYM(Lad_e)
+	subs	r4, r4, #1
+	do_it	hs
+	tsths	xh, #0x00100000
+	bhi	LSYM(Lad_e)
If the exponent in r4 is zero, the subs clears the carry bit, so the tsths is not
executed and we fall through (the denormal will be normalized and then denormalized
again, but that's so rare it doesn't really matter).
However, if r4 is non-zero, the subs sets the carry and the tsths is executed. With a
rotated immediate like #0x00100000, tst clears the carry (it takes the shifter
carry-out, which is bit 31 of the immediate) and sets the Z flag based on bit 20.
Since bhi needs the carry set, we will now also always fall through rather than take
the branch, even if bit 20 is non-zero. This may still give the correct answer, but it
would add considerable extra overhead... I think using a cmp rather than tst would work.
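To make the flag reasoning above easier to check, here is a minimal Python sketch of the three sequences (the original tst/sub/bne, the patched subs/tsths/bhi, and the same sequence with cmp in place of tst). It is a simplified model, not the libgcc code: it tracks only the C and Z flags, treats registers as 32-bit unsigned values, and all function names are hypothetical.

```python
"""Simplified model of the ARM C and Z flags for the sequences in the patch.

Assumptions: only C and Z are tracked, registers are 32-bit unsigned values,
and every function name here is hypothetical (not taken from libgcc).
"""

MASK32 = 0xFFFFFFFF
BIT20 = 0x00100000


def subs(rn, imm):
    """SUBS: C is set when there is no borrow (rn >= imm), Z when the result is 0."""
    result = (rn - imm) & MASK32
    return result, {"C": rn >= imm, "Z": result == 0}


def tst_imm(rn, imm, flags):
    """TST with an immediate: Z comes from rn & imm, C from the shifter carry-out.

    Simplification: an immediate that fits in 8 bits is assumed to be encoded
    without rotation (C unchanged); otherwise C becomes bit 31 of the immediate.
    """
    carry = flags["C"] if imm <= 0xFF else bool(imm & 0x80000000)
    return {"C": carry, "Z": (rn & imm) == 0}


def cmp_imm(rn, imm):
    """CMP: C is set when rn >= imm (unsigned), Z when rn == imm."""
    return {"C": rn >= imm, "Z": rn == imm}


def original_branch_taken(xh):
    """tst xh, #0x00100000 ; sub r4, r4, #1 ; bne -> taken when bit 20 is set."""
    return (xh & BIT20) != 0


def patched_branch_taken(r4, xh, use_cmp=False):
    """subs r4, r4, #1 ; tsths (or cmphs) xh, #0x00100000 ; bhi."""
    r4, flags = subs(r4, 1)
    if flags["C"]:  # "hs" condition: run the test only when carry is set
        flags = cmp_imm(xh, BIT20) if use_cmp else tst_imm(xh, BIT20, flags)
    return flags["C"] and not flags["Z"]  # "hi" condition: C set and Z clear


if __name__ == "__main__":
    for r4 in (0, 5):
        for xh in (0x000FFFFF, 0x00100000, 0x00100001):
            print(hex(r4), hex(xh),
                  original_branch_taken(xh),
                  patched_branch_taken(r4, xh),
                  patched_branch_taken(r4, xh, use_cmp=True))
```

In this model the tst variant never takes the branch, while the cmp variant takes it for a non-zero r4 once xh exceeds 0x00100000. Since bhi is a strict comparison, xh exactly equal to 0x00100000 still falls through with cmp; whether that case matters depends on the surrounding code, which is not shown here.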
Wilco