Bug 24076 - (vector char){x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x} code gen is not that good
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: target
Version: 4.1.0
Importance: P2 minor
Target Milestone: 4.2.0
Assigned To: Not yet assigned to anyone
Keywords: missed-optimization, ssemmx
Depends on:
Blocks:
Reported: 2005-09-27 04:45 UTC by Andrew Pinski
Modified: 2006-04-17 18:39 UTC (History)

See Also:
Host:
Target: i786-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-09-27 05:33:09


Description Andrew Pinski 2005-09-27 04:45:54 UTC
Take the following code:
#define vector __attribute__((__vector_size__(16)))
vector char bar (char x) {
return (vector char){
    x, x, x, x, x, x, x, x,
    x, x, x, x, x, x, x, x
  };
}
----
We currently produce crappy code:
        pushl   %edx
        movzbw  8(%esp), %ax
        movl    %eax, %edx
        sall    $8, %edx
        orl     %eax, %edx
        movzwl  %dx, %edx
        movl    %edx, %eax
        sall    $16, %eax
        orl     %edx, %eax
        movl    %eax, (%esp)
        movd    (%esp), %xmm1
        pshufd  $0, %xmm1, %xmm0
        popl    %eax
        ret
----
The issue appears to be that we expand this poorly:
(insn 12 11 13 (set (reg:V16QI 62)
        (const_vector:V16QI [
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
            ])) -1 (nil)
    (nil))

(insn 13 12 14 (parallel [
            (set (reg:HI 63)
                (zero_extend:HI (reg/v:QI 59 [ x ])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 14 13 15 (parallel [
            (set (reg:HI 64)
                (ashift:HI (reg:HI 63)
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))
(insn 16 15 17 (set (reg:SI 66) 
        (zero_extend:SI (reg:HI 64))) -1 (nil)
    (nil))

(insn 17 16 18 (parallel [
            (set (reg:SI 67)
                (ashift:SI (reg:SI 66)
                    (const_int 16 [0x10])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 18 17 19 (parallel [
            (set (reg:SI 67)
                (ior:SI (reg:SI 67)
                    (reg:SI 66)))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 19 18 20 (set (reg:V4SI 68)
        (vec_duplicate:V4SI (reg:SI 67))) -1 (nil)
    (nil))

(insn 20 19 21 (set (reg:V8HI 65)
        (subreg:V8HI (reg:V4SI 68) 0)) -1 (nil)
    (nil))

(insn 21 20 22 (set (reg:V16QI 62)
        (subreg:V16QI (reg:V8HI 65) 0)) -1 (nil)
    (nil))
Comment 1 Andrew Pinski 2005-09-27 04:47:04 UTC
A better way to optimize this was shown in
http://gcc.gnu.org/ml/gcc-patches/2005-09/msg00546.html
Comment 2 Andrew Pinski 2005-09-27 04:52:26 UTC
From the looks of it, (vector short){ x, x, x, x, x, x, x, x } has the same issue too.
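
For reference, the analogous test case for the short variant might look like the sketch below. The typedef name v8hi and the function name bar_short are made up for illustration and do not come from the report; the code only assumes GCC's generic vector extension.

```c
/* Hypothetical name: an 8 x 16-bit vector type built with GCC's
   vector_size attribute (16 bytes total). */
typedef short v8hi __attribute__((__vector_size__(16)));

/* Splat a single short into all eight lanes, mirroring the
   (vector char) case from the description. */
v8hi bar_short(short x)
{
    return (v8hi){ x, x, x, x, x, x, x, x };
}
```

Compiling this with -O2 -msse2 and inspecting the assembly is one way to check whether the widening still happens in integer registers.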
Comment 3 Andrew Pinski 2005-09-27 05:06:17 UTC
This is an issue in ix86_expand_vector_init.
Comment 4 Andrew Pinski 2005-09-27 05:33:08 UTC
The poor code arises because ix86_expand_vector_init_duplicate widens the value in GPRs first and only 
then moves it into the vector register.

Mainly:
      /* Replicate the value once into the next wider mode and recurse.  */


I am working on at least V16QImode right now.
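
As a rough scalar model of that widening, the sketch below follows the shift/or sequence visible in the assembly from the description. The function name replicate_byte_in_gpr is hypothetical; this is an illustration of the arithmetic, not code from i386.c.

```c
#include <stdint.h>

/* Replicate one byte across a 32-bit integer the way the generated code
   does: widen to 16 bits, then to 32 bits, OR-ing in a shifted copy at
   each step. The result is what movd/pshufd then broadcast. */
static uint32_t replicate_byte_in_gpr(uint8_t x)
{
    uint32_t v = x;   /* movzbw: zero-extend the byte */
    v |= v << 8;      /* sall $8 + orl: two copies in the low 16 bits */
    v &= 0xFFFFu;     /* movzwl: keep only the low 16 bits */
    v |= v << 16;     /* sall $16 + orl: four copies in 32 bits */
    return v;
}
```

Every step here runs in integer registers, which is why the vector unit is only involved at the very end.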
Comment 5 Andrew Pinski 2005-12-21 19:49:58 UTC
Not working on this any more.
Comment 6 Roger Sayle 2006-04-16 21:47:03 UTC
Subject: Bug 24076

Author: sayle
Date: Sun Apr 16 21:46:59 2006
New Revision: 112990

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112990
Log:
2006-04-15  Roger Sayle  <roger@eyesopen.com>
	    Andrew Pinski  <pinskia@gcc.gnu.org>
	    Dale Johannesen  <dalej@apple.com>

	PR target/24076
	* config/i386/i386.c (ix86_expand_vector_init_duplicate): Add
	special case code to implement V8HImode and V16QImode with SSE2.

	* gcc.target/i386/vecinit-3.c: New testcase.
	* gcc.target/i386/vecinit-4.c: Likewise.
	* gcc.target/i386/sse-18.c: Likewise.
	* gcc.target/i386/sse-19.c: Likewise.


Added:
    trunk/gcc/testsuite/gcc.target/i386/sse-18.c
    trunk/gcc/testsuite/gcc.target/i386/sse-19.c
    trunk/gcc/testsuite/gcc.target/i386/vecinit-3.c
    trunk/gcc/testsuite/gcc.target/i386/vecinit-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
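
To illustrate what an SSE2 special case for V16QImode can look like, here is a minimal sketch using intrinsics from <emmintrin.h>. It is an assumption about the general shape (move the byte into a vector register once, then duplicate with unpack and shuffle instructions), not a transcription of the committed change, and the function name splat_byte_sse2 is hypothetical.

```c
#include <emmintrin.h>

/* Broadcast one byte to all 16 lanes, doing the duplication in the
   vector register instead of in GPRs. */
static __m128i splat_byte_sse2(char x)
{
    /* movd: place the zero-extended byte in the low 32-bit element. */
    __m128i v = _mm_cvtsi32_si128((unsigned char)x);
    /* punpcklbw: interleave with itself, so bytes 0 and 1 both hold x. */
    v = _mm_unpacklo_epi8(v, v);
    /* punpcklwd: interleave words, so the low 32 bits hold four copies. */
    v = _mm_unpacklo_epi16(v, v);
    /* pshufd $0: replicate the low 32-bit element into all four. */
    return _mm_shuffle_epi32(v, 0);
}
```

The V8HImode case would follow the same pattern without the byte-level unpack.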

Comment 7 Andrew Pinski 2006-04-17 18:39:25 UTC
Fixed.