[Bug target/28764] [4.2 Regression] libjava build failure on sh4

Sat Dec 23 21:25:00 GMT 2006

------- Comment #18 from ubizjak at gmail dot com  2006-12-23 21:24 -------
(In reply to comment #12)

> As far as I can see, the i387 mode switching is already completely broken,
> because it treats the different modes of a single mode-switchable entity
> as separate entities.

NO, it is _NOT_ broken by any stretch of the imagination!

Perhaps some of it could be written more elegantly, but there is a reason why
we have separate entities for x87. Please note that at the mode switch point
we insert code that calculates the mode word and stores the result in memory.
This value is then used at the point where the mode is briefly switched, so
that the insn operates in the desired mode.
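
As a rough illustration of what "calculates mode word" means here, the sketch
below mirrors the movzwl/movb/movw sequence from the generated code in plain C.
It is not GCC source; the names x87_mode_word, X87_RC_DOWN and X87_RC_UP are
made up for this example. It assumes the usual x87 control word layout, where
bits 10-11 select the rounding mode.

```c
#include <stdint.h>

/* Hypothetical names; values follow the x87 rounding-control field
   (bits 10-11 of the control word). */
#define X87_RC_DOWN 0x0400u  /* round toward -inf, used for floor() */
#define X87_RC_UP   0x0800u  /* round toward +inf, used for ceil()  */

/* Mirrors "movzwl cw, %eax; movb $N, %ah; movw %ax, slot":
   keep the low byte of the saved control word and overwrite
   the whole high byte with the requested bits. */
uint16_t x87_mode_word(uint16_t saved_cw, uint16_t high_bits)
{
    return (uint16_t)((saved_cw & 0x00FFu) | high_bits);
}
```

With a saved control word of 0x037F, this yields one stored word per rounding
mode, and each of them can later be loaded with a single fldcw.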

So, consider this testcase:
--cut here--
#include <math.h>

double test(double *a, int x)
{
  int i;
  double z = 0.0;

  for (i = 0; i < x; i++)
    z += floor(a[i]) + ceil(a[i]);

  return z;
}
--cut here--

With the current approach, this gets compiled (-O2 -ffast-math) into:

        fnstcw  -6(%ebp)
        xorl    %edx, %edx
        fldz
        movzwl  -6(%ebp), %eax
        movb    $4, %ah
        movw    %ax, -10(%ebp)
        movzwl  -6(%ebp), %eax
        movb    $8, %ah
        movw    %ax, -8(%ebp)
        .p2align 4,,7
.L5:
        fldl    (%ebx,%edx,8)
        addl    $1, %edx
        fld     %st(0)
        cmpl    %ecx, %edx
        fldcw   -8(%ebp)
        frndint
        fldcw   -6(%ebp)
        fxch    %st(1)
        fldcw   -10(%ebp)
        frndint
        fldcw   -6(%ebp)
        faddp   %st, %st(1)
        faddp   %st, %st(1)
        jne     .L5

Note that the mode word calculation is hoisted out of the loop.

I actually considered an idea that implemented the proposed
"several-modes-for-one-entity" approach for x87. However, modes are switched
_inside_ the loop, so with this approach the mode calculation code also stays
inside:

.L5:
        fldl    (%ebx,%edx,8)
        addl    $1, %edx
        fnstcw  -6(%ebp)
        fld     %st(0)
        cmpl    %ecx, %edx
        movzwl  -6(%ebp), %eax
        movb    $8, %ah
        movw    %ax, -8(%ebp)
        movzwl  -6(%ebp), %eax
        fldcw   -8(%ebp)
        frndint
        fldcw   -6(%ebp)
        fxch    %st(1)
        movb    $4, %ah
        movw    %ax, -10(%ebp)
        fldcw   -10(%ebp)
        frndint
        fldcw   -6(%ebp)
        faddp   %st, %st(1)
        faddp   %st, %st(1)
        jne     .L5
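
To put numbers on the difference between the two listings, here is a small
counting model in C. It is only a sketch: struct x87_cost and both functions
are hypothetical, and the per-iteration counts are read directly off the
listings above (two mode word calculations, four fldcw insns).

```c
#include <stddef.h>

/* Hypothetical counters standing in for executed instruction groups. */
struct x87_cost {
    size_t mode_word_computations; /* movzwl/movb/movw sequences */
    size_t mode_switches;          /* fldcw insns */
};

/* Separate entities: both mode words are computed once, before the loop. */
struct x87_cost cost_separate_entities(size_t iterations)
{
    struct x87_cost c = {0, 0};
    c.mode_word_computations = 2;       /* floor word and ceil word */
    for (size_t i = 0; i < iterations; i++)
        c.mode_switches += 4;           /* ceil, restore, floor, restore */
    return c;
}

/* One entity with several modes: the words are recomputed per iteration. */
struct x87_cost cost_single_entity(size_t iterations)
{
    struct x87_cost c = {0, 0};
    for (size_t i = 0; i < iterations; i++) {
        c.mode_word_computations += 2;  /* calculation stays in the loop */
        c.mode_switches += 4;
    }
    return c;
}
```

The number of fldcw insns is the same in both cases; only the mode word
calculation scales with the trip count in the second one.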

Now, which code is preferred?

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28764