This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
- From: Claudiu Zissulescu <Claudiu dot Zissulescu at synopsys dot com>
- To: Joern Wolfgang Rennecke <gnu at amylaar dot uk>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: "Francois dot Bedard at synopsys dot com" <Francois dot Bedard at synopsys dot com>, "jeremy dot bennett at embecosm dot com" <jeremy dot bennett at embecosm dot com>
- Date: Thu, 28 Apr 2016 14:11:48 +0000
- Subject: RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
- Authentication-results: sourceware.org; auth=none
- References: <1460990028-5718-1-git-send-email-claziss at synopsys dot com> <1460990028-5718-5-git-send-email-claziss at synopsys dot com> <5721F3A5 dot 6000404 at amylaar dot uk> <098ECE41A0A6114BB2A07F1EC238DE8966189EBB at de02wembxa dot internal dot synopsys dot com> <5721F6E2 dot 6060507 at amylaar dot uk>
Hi,
> Where exactly does the test go wrong?
The failing test is this one, from the test file included in the patch:
TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);
> Can you show a trace of __eqdf2 with register values?
Sure thing. This run is for ARC700, using the original implementation with the guarded FPX-handling code enabled:
[0x000002a2] 0xc000 K Z ld_s r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
[0x000002a4] 0xc101 K Z ld_s r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
[0x000002a6] 0xc202 K Z ld_s r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
[0x000002a8] 0xc303 K Z ld_s r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
[0x000002aa] 0x0aea0000 K Z bl 0x2e8 : (w0) r31 <= 0x000002ae *
[0x00000590] 0x091d00e1 K Z brne.d r1,r3,0x1c
[0x00000594] 0x2153050c K Z bmsk r12,r1,0x14 : (w0) r12 <= 0x000fffff *
[0x00000598] 0x200580be K Z or.f 0,r0,r2 *
[0x0000059c] 0x24cf1562 K N bset.ne r12,r12,0x15 : (w0) r12 <= 0x002fffff *
[0x000005a0] 0x2414904c K N add1.f r12,r12,r1 : (w0) r12 <= 0x000ffffd *
[0x000005a4] 0x7fe0 K C j_s.d [blink] *
[0x000005a6] 0x20cc8086 KD C cmp.cc r0,r2
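To make the trace easier to follow, the arithmetic on the equal-high-word path can be replayed in plain C. This is only a minimal sketch, assuming my reading of the ARC semantics is right: bmsk with operand 20 keeps bits 20..0, add1 computes b + (c << 1), the .f suffix updates the flags, and Z set on return means "equal". The struct and function names are made up for illustration and are not part of the patch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper type: state after replaying the equal-high-word path. */
struct eq_replay {
    uint32_t r12;   /* r12 after add1.f */
    bool carry;     /* C flag after add1.f */
    bool zero;      /* Z flag on return; assumed to mean "equal" */
};

/* Replay the path taken when DBL0H == DBL1H.
   fpx_guard selects whether the or.f / bset.ne pair is executed. */
static struct eq_replay
replay_eqdf2_same_high(uint32_t lo0, uint32_t hi0, uint32_t lo1, bool fpx_guard)
{
    struct eq_replay r;
    uint32_t r12 = hi0 & 0x1fffffu;        /* bmsk r12,DBL0H,20 */

    if (fpx_guard && (lo0 | lo1) != 0)     /* or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,21 */
        r12 |= UINT32_C(1) << 21;

    /* add1.f r12,r12,DBL0H: r12 + (DBL0H << 1), carry out of bit 31 */
    uint64_t sum = (uint64_t)r12 + (uint32_t)(hi0 << 1);
    r.r12 = (uint32_t)sum;
    r.carry = (sum >> 32) != 0;
    r.zero = (r.r12 == 0);

    if (!r.carry)                          /* cmp.cc DBL0L,DBL1L runs only if C is clear */
        r.zero = (lo0 == lo1);

    return r;
}
```

Calling replay_eqdf2_same_high (0xffffffff, 0x7fefffff, 0xffffffff, true) gives r12 = 0x000ffffd with the carry set, which matches the trace: the cmp.cc in the delay slot is skipped and Z stays clear, so __DBL_MAX__ compares as not equal to itself. With fpx_guard set to false, the carry stays clear and the low-word compare sets Z.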
For reference, the routine:
.global __eqdf2
.balign 4
HIDDEN_FUNC(__eqdf2)
/* Good performance as long as the difference in high word is
well predictable (as seen from the branch predictor). */
__eqdf2:
brne.d DBL0H,DBL1H,.Lhighdiff
bmsk r12,DBL0H,20
#ifndef __HS__
/* The next two instructions are required to recognize the FPX
NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
opposed to 0x7ff8_0000_0000_0000. */
or.f 0,DBL0L,DBL1L
bset.ne r12,r12,21
#endif /* __HS__ */
add1.f r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN. */
j_s.d [blink]
cmp.cc DBL0L,DBL1L
.balign 4
.Lhighdiff:
or r12,DBL0H,DBL1H
or.f 0,DBL0L,DBL1L
j_s.d [blink]
bmsk.eq.f r12,r12,30
ENDFUNC(__eqdf2)
All of these results were collected using nsimfree.
Please let me know if you need more info,
Claudiu