28826 – return (vector float) { a, a, b, b } generates unwanted MMX insns

Summary: return (vector float) { a, a, b, b } generates unwanted MMX insns

Status:	RESOLVED DUPLICATE of bug 28825

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target
Version:	4.2.0

Importance:	P3 minor
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2006-08-23 21:24 UTC by Stuart Hastings
Modified:	2006-08-23 21:28 UTC
CC List:	1 user

See Also:
Host:
Target:	i786--
Build:
Known to work:
Known to fail:
Last reconfirmed:

Description Stuart Hastings 2006-08-23 21:24:44 UTC

+++ This bug was initially created as a clone of Bug #24073 +++

Take the following example:
#define vector __attribute__((vector_size(16)))

float a; float b;
vector float f(void) { return (vector float){ a, b, 0.0, 0.0}; }
---
Currently we get:
        subl    $12, %esp
        movss   _b, %xmm0
        movss   _a, %xmm1
        unpcklps        %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        xorl    %eax, %eax
        xorl    %edx, %edx
        movl    %eax, (%esp)
        movl    %edx, 4(%esp)
        xorps   %xmm1, %xmm1
        movlhps %xmm1, %xmm0
        addl    $12, %esp

------
We should be able to produce:
movss _b, %xmm0
movss _a, %xmm1
shufps $60, %xmm1, %xmm0   /* [0, 3, 3, 0]: _a, 0, 0, _b */
shufps $201, %xmm0, %xmm0  /* [3, 0, 2, 1]: _a, _b, 0, 0 */

This is from Nathan Begeman.
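
To make the two shufps immediates easier to check, here is a minimal sketch in plain C that emulates the lane selection of shufps and replays the suggested sequence. The function names shufps_emulated and suggested_sequence are hypothetical, and the arrays only stand in for the XMM registers; this is an illustration, not compiler output.

```c
#include <string.h>

/* Emulates `shufps $imm, src, dst` (AT&T operand order).
   Each 2-bit field of imm picks a source lane: result lanes 0-1 are
   taken from dst, result lanes 2-3 are taken from src. */
void shufps_emulated(float dst[4], const float src[4], unsigned imm)
{
    float out[4];
    out[0] = dst[imm & 3];
    out[1] = dst[(imm >> 2) & 3];
    out[2] = src[(imm >> 4) & 3];
    out[3] = src[(imm >> 6) & 3];
    memcpy(dst, out, sizeof out);   /* safe even when src and dst alias */
}

/* Replays the four suggested instructions on emulated registers. */
void suggested_sequence(float a, float b, float result[4])
{
    float xmm0[4] = { b, 0.0f, 0.0f, 0.0f };  /* movss _b, %xmm0 */
    float xmm1[4] = { a, 0.0f, 0.0f, 0.0f };  /* movss _a, %xmm1 */

    shufps_emulated(xmm0, xmm1, 60);    /* fields [0, 3, 3, 0] */
    shufps_emulated(xmm0, xmm0, 201);   /* fields [3, 0, 2, 1] */

    memcpy(result, xmm0, sizeof xmm0);
}
```

Under this emulation the final register holds 0, 0, _b, _a in lanes 0 through 3, which agrees with the comments on the shufps lines if they are read from lane 3 down to lane 0.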
================================================================
------- Comment #4 From Uros Bizjak 2005-09-27 11:41 -------

I think that the following example wins the contest:

vector float f(void) { return (vector float){ a, a, b, b}; }

gcc -O2 -msse -fomit-frame-pointer

	subl	$28, %esp
	movss	a, %xmm0
	movss	%xmm0, 4(%esp)
	movss	b, %xmm0
	movd	4(%esp), %mm0
	punpckldq	%mm0, %mm0
	movss	%xmm0, 4(%esp)
	movq	%mm0, 16(%esp)
	movd	4(%esp), %mm0
	punpckldq	%mm0, %mm0
	movq	%mm0, 8(%esp)
	movlps	16(%esp), %xmm1
	movhps	8(%esp), %xmm1
	addl	$28, %esp
	movaps	%xmm1, %xmm0
	ret

Note the usage of MMX registers.
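
The one-line example depends on the vector macro and the globals a and b from the original description. A self-contained file for reproducing the output might look like the sketch below; the file name testcase.c in the comment is an assumption.

```c
/* Possible invocation, using the flags quoted above:
   gcc -O2 -msse -fomit-frame-pointer -S testcase.c */
#define vector __attribute__((vector_size(16)))

float a;
float b;

/* Builds the vector { a, a, b, b } with a GCC vector-extension
   compound literal; element 0 is the first initializer. */
vector float f(void)
{
    return (vector float){ a, a, b, b };
}
```

Inspecting the generated assembly for %mm register names is one way to see whether MMX instructions are still being emitted.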


------- Comment #5 From Andrew Pinski 2005-09-27 14:33 -------

(In reply to comment #4)
> I think that the following example wins the contest:
> 
> vector float f(void) { return (vector float){ a, a, b, b}; }

This is a different bug.  The issue with the above is that the mmx_ok check in
ix86_expand_vector_init_duplicate is wrong.
Currently, we have
      if (!mmx_ok && !TARGET_SSE)
but if I change it to:
      if (!mmx_ok)
we get:
        movss   _a, %xmm0
        movss   _b, %xmm1
        unpcklps        %xmm0, %xmm0
        unpcklps        %xmm1, %xmm1
        movlhps %xmm1, %xmm0
Which looks OK to me.  That testcase should be filed as a separate bug, as it is obviously wrong.
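
For comparison, the XMM-only sequence above can be written by hand with SSE intrinsics. This is a sketch only: the helper name build_aabb is hypothetical, it assumes an x86 target with SSE enabled, and the compiler is free to choose different instructions than the ones named in the comments.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: builds { a, a, b, b } without touching MMX registers
   and stores the four lanes to out[0..3]. */
void build_aabb(float a, float b, float out[4])
{
    __m128 x = _mm_load_ss(&a);   /* like movss:    { a, 0, 0, 0 } */
    __m128 y = _mm_load_ss(&b);   /* like movss:    { b, 0, 0, 0 } */
    x = _mm_unpacklo_ps(x, x);    /* like unpcklps: { a, a, 0, 0 } */
    y = _mm_unpacklo_ps(y, y);    /* like unpcklps: { b, b, 0, 0 } */
    x = _mm_movelh_ps(x, y);      /* like movlhps:  { a, a, b, b } */
    _mm_storeu_ps(out, x);        /* unaligned store of all four lanes */
}
```

Each intrinsic corresponds to one instruction of the five-instruction sequence, so the lane comments can be checked step by step against it.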

=====================================================================
Cloned from 24073 to track the MMX insn issue; the original 24073 problem is a performance issue.

Comment 1 Andrew Pinski 2006-08-23 21:28:03 UTC


*** This bug has been marked as a duplicate of 28825 ***