61423 – Incorrect conversion from unsigned int to floating point

Bug 61423 - Incorrect conversion from unsigned int to floating point

Summary: Incorrect conversion from unsigned int to floating point

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	unknown

Importance:	P3 normal
Target Milestone:	4.8.4
Assignee:	Uroš Bizjak

URL:
Keywords:	wrong-code

Depends on:	61446
Blocks:
	Show dependency tree / graph

Reported:	2014-06-05 15:52 UTC by lvqcl.mail
Modified:	2014-06-18 20:03 UTC (History)
CC List:	4 users (show)

See Also:
Host:
Target:	i?86--
Build:
Known to work:
Known to fail:
Last reconfirmed:	2014-06-06 00:00:00

Attachments
test program (333 bytes, text/plain) 2014-06-05 15:54 UTC, lvqcl.mail	Details
Proposed patch (931 bytes, patch) 2014-06-06 15:30 UTC, Uroš Bizjak	Details \| Diff
Testcase that fails in REE pass (29.14 KB, application/x-bzip) 2014-06-06 15:37 UTC, Uroš Bizjak	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description lvqcl.mail 2014-06-05 15:52:04 UTC

The attached program works incorrectly when compiled for i686 with '-O3 -msse2' or '-O1 -ftree-vectorize -msse2'.

GCC compiles the function

#define N 1024
static unsigned int A[N];
double func (void)
{
	unsigned int sum = 0;
	unsigned i;
	double t;

	for (i = 0; i < N; i++) sum += A[i];

	t = sum; /* uint32 -> double */
	return t;
}

into the following:

01:	pxor	%xmm0, %xmm0
02:	movl	$_A, %eax
03: L2:
04:	paddd	(%eax), %xmm0
05:	addl	$16, %eax
06:	cmpl	$_A+4096, %eax
07:	jne	L2
08:	movdqa	%xmm0, %xmm1
09:	subl	$28, %esp
10:	psrldq	$8, %xmm1
11:	paddd	%xmm1, %xmm0
12:	movdqa	%xmm0, %xmm1
13:	psrldq	$4, %xmm1
14:	paddd	%xmm1, %xmm0
15:	movq	%xmm0, 8(%esp)
16:	fildq	8(%esp)
17:	addl	$28, %esp
18:	ret

After the line 07: xmm0 contains four partial sums.
After the line 14: lower 4 bytes of xmm0 contain the total sum, the rest 12 bytes contain garbage.
Lines 15 and 16: *eight* bytes from xmm0 are stored in memory and then loaded into an FPU register.

According to the message from Robert Kausch ( http://lists.xiph.org/pipermail/flac-dev/2014-June/004723.html ) this bug exists since GCC 4.4.

Comment 1 lvqcl.mail 2014-06-05 15:54:16 UTC

Created attachment 32897 [details]
test program

Comment 2 Richard Biener 2014-06-06 08:07:37 UTC

Confirmed.  Works with -mfpmath=sse.  Initial RTL looks ok to me:

(insn 21 20 22 (set (reg:SI 99 [ stmp_sum_5.8 ])
        (vec_select:SI (reg:V4SI 98 [ vect_sum_5.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) -1
     (nil))

(insn 22 21 23 (parallel [
            (set (reg:DF 97 [ t ])
                (unsigned_float:DF (reg:SI 99 [ stmp_sum_5.8 ])))
            (clobber (mem/c:DI (plus:SI (reg/f:SI 78 virtual-stack-vars)
                        (const_int -8 [0xfffffffffffffff8])) [0  S8 A64]))
            (clobber (scratch:SI))
        ]) t.c:18 -1
     (nil))

(insn 23 22 24 (set (reg:DF 94 [ <retval> ])
        (reg:DF 97 [ t ])) t.c:19 -1
     (nil))

So it must be a bogus *floatunssidf2_1 pattern

(insn 22 21 28 4 (parallel [
            (set (reg:DF 8 st [orig:97 t ] [97])
                (unsigned_float:DF (reg:SI 21 xmm0 [orig:99 stmp_sum_5.8 ] [99])))
            (clobber (mem/c:DI (plus:SI (reg/f:SI 7 sp)
                        (const_int 8 [0x8])) [0  S8 A64]))
            (clobber (scratch:SI))
        ]) t.c:18 211 {*floatunssidf2_1}
     (nil))

split to

(insn 39 38 40 4 (set (mem/c:DI (plus:SI (reg/f:SI 7 sp)
                (const_int 8 [0x8])) [0  S8 A64])
        (reg:DI 21 xmm0 [orig:99 stmp_sum_5.8 ] [99])) t.c:18 89 {*movdi_internal}   
     (nil))
(insn 40 39 28 4 (set (reg:DF 8 st [orig:97 t ] [97])
        (float:DF (mem/c:DI (plus:SI (reg/f:SI 7 sp)
                    (const_int 8 [0x8])) [0  S8 A64]))) t.c:18 206 {*floatdidf2_i387}
     (nil))

note the use of a DImode memory but the missing zero-extend of xmm0:DI.

Comment 3 Uroš Bizjak 2014-06-06 10:28:04 UTC

Looking into it.

Comment 4 Uroš Bizjak 2014-06-06 15:30:40 UTC

Created attachment 32901 [details]
Proposed patch

This patch fies invalid code gen, but unfortunately uncovers a problem in REE pass and fails bootstrap when configured "--with-arch=core-avx-i --with-cpu=core-avx-i".

Comment 5 Uroš Bizjak 2014-06-06 15:37:49 UTC

Created attachment 32903 [details]
Testcase that fails in REE pass

This testcase fails with patched gcc in REE pass:

cc1 -O2 -m32 -march=corei7 libgcc2.i

/home/uros/gcc-svn/trunk/libgcc/libgcc2.c: In function ‘__fixunssfdi’:
/home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1492:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 54 11 47 2 (set (reg:DI 0 ax)
        (reg:DI 21 xmm0)) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1444 89 {*movdi_internal}
     (expr_list:REG_UNUSED (reg:DI 0 ax)
        (nil)))
/home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1492:1: internal compiler error: in copyprop_hardreg_forward_1, at regcprop.c:776
...

_split2 pass produces correct sequence:

(insn 46 11 47 2 (set (reg:DI 21 xmm0 [118])
        (zero_extend:DI (reg/v:SI 0 ax [orig:85 hi ] [85]))) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1452 133 {*zero_extendsidi2}
     (nil))
(insn 47 46 48 2 (set (mem/c:DI (reg/f:SI 7 sp) [0  S8 A64])
        (reg:DI 21 xmm0 [118])) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1452 89 {*movdi_internal}
     (nil))
(insn 48 47 13 2 (set (reg:DF 9 st(1) [orig:101 D.6895 ] [101])
        (float:DF (mem/c:DI (reg/f:SI 7 sp) [0  S8 A64]))) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1452 206 {*floatdidf2_i387}
     (nil))

and _ree pass converts this to:

(insn 11 45 54 2 (set (reg:DI 21 xmm0)
        (zero_extend:DI (reg:SI 2 cx [99]))) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1444 133 {*zero_extendsidi2}
     (nil))
(insn 54 11 47 2 (set (reg:DI 0 ax)
        (reg:DI 21 xmm0)) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1444 -1
     (nil))
(insn 47 54 48 2 (set (mem/c:DI (reg/f:SI 7 sp) [0  S8 A64])
        (reg:DI 21 xmm0 [118])) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1452 89 {*movdi_internal}
     (nil))
(insn 48 47 13 2 (set (reg:DF 9 st(1) [orig:101 D.6895 ] [101])
        (float:DF (mem/c:DI (reg/f:SI 7 sp) [0  S8 A64]))) /home/uros/gcc-svn/trunk/libgcc/libgcc2.c:1452 206 {*floatdidf2_i387}
     (nil))


Please note ivalid (insn 54).

Comment 6 Uroš Bizjak 2014-06-06 15:42:46 UTC

Jeff, can you please look at what goes wrong in REE pass?

Comment 7 uros 2014-06-06 17:45:42 UTC

Author: uros
Date: Fri Jun  6 17:45:10 2014
New Revision: 211321

URL: http://gcc.gnu.org/viewcvs?rev=211321&root=gcc&view=rev
Log:
	PR target/61423
	* config/i386/i386.md (*floatunssi<mode>2_i387_with_xmm): New
	define_insn_and_split pattern, merged from *floatunssi<mode>2_1
	and corresponding splitters.  Zero extend general register
	or memory input operand to XMM temporary.  Enable for
	TARGET_SSE2 and TARGET_INTER_UNIT_MOVES_TO_VEC only.
	(floatunssi<mode>2): Update expander predicate.

testsuite/ChangeLog:

	PR target/61423
	* gcc.target/i386/pr61423.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr61423.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md
    trunk/gcc/testsuite/ChangeLog

Comment 8 Uroš Bizjak 2014-06-08 08:21:36 UTC

(In reply to Uroš Bizjak from comment #5)
> Created attachment 32903 [details]
> Testcase that fails in REE pass
> 
> This testcase fails with patched gcc in REE pass:

This is now PR61446.

Comment 9 Jakub Jelinek 2014-06-11 16:17:34 UTC

Shouldn't this patch be reverted until REE is fixed?

Comment 10 Jeffrey A. Law 2014-06-11 21:27:09 UTC

Either way will cause a failure.  This is #1 on my hit list after the regex problem that's affecting mysql.

Comment 11 Jeffrey A. Law 2014-06-12 21:20:35 UTC

I expect the REE bug to be fixed tomorrow AM (my time).  It's a trivial issue.

Comment 12 uros 2014-06-17 05:01:24 UTC

Author: uros
Date: Tue Jun 17 05:00:52 2014
New Revision: 211723

URL: https://gcc.gnu.org/viewcvs?rev=211723&root=gcc&view=rev
Log:
	Backport from mainline
	2014-06-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/61423
	* config/i386/i386.md (*floatunssi<mode>2_i387_with_xmm): New
	define_insn_and_split pattern, merged from *floatunssi<mode>2_1
	and corresponding splitters.  Zero extend general register
	or memory input operand to XMM temporary.  Enable for
	TARGET_SSE2 and TARGET_INTER_UNIT_MOVES_TO_VEC only.
	(floatunssi<mode>2): Update expander predicate.

testsuite/ChangeLog:

	Backport from mainline
	2014-06-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/61423
	* gcc.target/i386/pr61423.c: New test.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr61423.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/config/i386/i386.md
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog

Comment 13 uros 2014-06-18 20:02:08 UTC

Author: uros
Date: Wed Jun 18 20:01:37 2014
New Revision: 211803

URL: https://gcc.gnu.org/viewcvs?rev=211803&root=gcc&view=rev
Log:
	Backport from mainline
	2014-06-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/61423
	* config/i386/i386.md (*floatunssi<mode>2_i387_with_xmm): New
	define_insn_and_split pattern, merged from *floatunssi<mode>2_1
	and corresponding splitters.  Zero extend general register
	or memory input operand to XMM temporary.  Enable for
	TARGET_SSE2 and TARGET_INTER_UNIT_MOVES_TO_VEC only.
	(floatunssi<mode>2): Update expander predicate.

testsuite/ChangeLog:

	Backport from mainline
	2014-06-13  Ilya Enkovich  <ilya.enkovich@intel.com>

	PR rtl-optimization/61094
	PR rtl-optimization/61446
	* gcc.target/i386/pr61446.c : New.

	Backport from mainline
	2014-06-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/61423
	* gcc.target/i386/pr61423.c: New test.


Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr61423.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr61446.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/config/i386/i386.md
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog

Comment 14 Uroš Bizjak 2014-06-18 20:03:32 UTC

Fixed in all release branches.