[PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.

Thu Apr 28 14:12:00 GMT 2016

Hi,

> Where exactly does the test go wrong?

The failing test is this one, from the test file included in the patch:
	TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);

> Can you show a trace of __eqdf2 with register values?

Sure. This is a run for ARC700, using the original implementation with the guarded FPX-handling code enabled:

[0x000002a2] 0xc000                 K Z    ld_s           r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
[0x000002a4] 0xc101                 K Z    ld_s           r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
[0x000002a6] 0xc202                 K Z    ld_s           r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
[0x000002a8] 0xc303                 K Z    ld_s           r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
[0x000002aa] 0x0aea0000             K Z    bl             0x2e8 : (w0) r31 <= 0x000002ae *
[0x00000590] 0x091d00e1             K Z    brne.d         r1,r3,0x1c
[0x00000594] 0x2153050c             K Z    bmsk           r12,r1,0x14 : (w0) r12 <= 0x000fffff *
[0x00000598] 0x200580be             K Z    or.f           0,r0,r2 *
[0x0000059c] 0x24cf1562             K  N   bset.ne        r12,r12,0x15 : (w0) r12 <= 0x002fffff *
[0x000005a0] 0x2414904c             K  N   add1.f         r12,r12,r1 : (w0) r12 <= 0x000ffffd *
[0x000005a4] 0x7fe0                 K   C  j_s.d          [blink] *
[0x000005a6] 0x20cc8086             KD  C  cmp.cc         r0,r2

For reference, the routine:

	.global __eqdf2
	.balign 4
	HIDDEN_FUNC(__eqdf2)
	/* Good performance as long as the difference in high word is
	   well predictable (as seen from the branch predictor).  */
__eqdf2:
	brne.d DBL0H,DBL1H,.Lhighdiff
	bmsk    r12,DBL0H,20
#ifndef __HS__
	/* The next two instructions are required to recognize the FPX
	NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
	opposed to 0x7ff8_0000_0000_0000.  */
	or.f    0,DBL0L,DBL1L
	bset.ne r12,r12,21
#endif /* __HS__ */
	add1.f	r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
	j_s.d	[blink]
	cmp.cc	DBL0L,DBL1L
	.balign 4
.Lhighdiff:
	or	r12,DBL0H,DBL1H
	or.f	0,DBL0L,DBL1L
	j_s.d	[blink]
	bmsk.eq.f r12,r12,30
	ENDFUNC(__eqdf2)
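
To make the trace easier to follow, here is a rough C model of the equal-high-word path of the routine up to add1.f. It is only a sketch and not part of the patch: the function names and the fpx_guard parameter are made up for illustration, and it assumes that add1.f computes r12 + (DBL0H << 1) and sets C on carry out of bit 31, which is what the register values in the trace suggest. With the guarded code enabled, the non-zero low words of __DBL_MAX__ set bit 21, the add carries, and cmp.cc is skipped, so the operands are treated as NaN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of __eqdf2 when both high words are equal.
   Returns the carry flag after add1.f: 1 means the routine treats
   the operands as NaN and does not execute cmp.cc on the low words. */
int eqdf2_nan_carry(uint32_t hi, uint32_t lo0, uint32_t lo1, int fpx_guard)
{
    /* bmsk r12,DBL0H,20: keep bits 0..20 of the high word. */
    uint32_t r12 = hi & 0x001fffffu;

    /* or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,21 (the guarded FPX code). */
    if (fpx_guard && (lo0 | lo1) != 0)
        r12 |= (uint32_t)1 << 21;

    /* add1.f r12,r12,DBL0H: r12 + (DBL0H << 1), carry out of bit 31. */
    uint64_t sum = (uint64_t)r12 + (uint32_t)(hi << 1);
    return (int)(sum >> 32);
}

/* Replays the operands from the trace: r1 = r3 = 0x7fefffff,
   r0 = r2 = 0xffffffff, i.e. __DBL_MAX__ compared with itself. */
void replay_trace(void)
{
    /* Guard enabled: carry set, matching the "C" flag in the trace. */
    assert(eqdf2_nan_carry(0x7fefffffu, 0xffffffffu, 0xffffffffu, 1) == 1);
    /* Guard disabled: no carry, so cmp.cc compares the low words. */
    assert(eqdf2_nan_carry(0x7fefffffu, 0xffffffffu, 0xffffffffu, 0) == 0);
}
```

In this model the FPX NaN pattern 0x7ff0_0000_8000_0000 is only detected with the guard enabled, while 0x7ff8_0000_0000_0000 carries either way and infinity (0x7ff0_0000_0000_0000) never does.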

All of these results were collected using nsimfree.

Please let me know if you need more info,
Claudiu