This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [RFH] subreg of a vector without going through memory

From: Marc Glisse <marc dot glisse at inria dot fr>
To: gcc-patches at gcc dot gnu dot org
Date: Sun, 4 Nov 2012 11:42:47 +0100 (CET)
Subject: Re: [RFH] subreg of a vector without going through memory
References: <alpine.DEB.2.02.1211040959520.5576@stedding.saclay.inria.fr>

On Sun, 4 Nov 2012, Marc Glisse wrote:

Hello,

Trying to make some progress on PR 53101, I wrote the attached patch.
It might be completely wrong for big endian (I don't know), and it is
also missing a check that the subreg is not paradoxical.

* simplify-rtx.c (simplify_subreg): For vectors, create a VEC_SELECT.

However, when I compile this code on x86_64:

typedef double v4 __attribute__((vector_size(32)));
typedef double v2 __attribute__((vector_size(16)));
v2 f(v4 x){
 return *(v2*)&x;
}
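
For comparison, the same low half can be spelled element by element with GCC's vector subscripting, which never takes the address of x. This is only a sketch: the name low_half is made up for illustration, and it assumes a little-endian target such as x86_64, where the cast above reads elements 0 and 1.

```c
typedef double v4 __attribute__((vector_size(32)));
typedef double v2 __attribute__((vector_size(16)));

/* Hypothetical helper, not part of the patch: build the result from
   elements 0 and 1 of x using vector subscripting (a GCC extension).  */
v2 low_half(v4 x)
{
    v2 r = { x[0], x[1] };
    return r;
}
```

Calling low_half on {1.0, 2.0, 3.0, 4.0} should give a vector holding the first two elements, matching the indices 0 and 1 of a vec_select of the low part.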

I see in the *.combine dump:

[...]
Trying 6 -> 7:
Successfully matched this instruction:
(set (reg:V2DF 60 [ <retval> ])
   (vec_select:V2DF (reg/v:V4DF 61 [ x ])
       (parallel [
               (const_int 0 [0])
               (const_int 1 [0x1])
           ])))
rejecting combination of insns 6 and 7
original costs 4 + 16 = 20
replacement cost 32
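
The last three lines of the dump can be read as a simple comparison. Below is a minimal model of that check, assuming combine rejects a replacement whenever its cost exceeds the summed cost of the insns it replaces. The function name is hypothetical and this is not GCC's actual code.

```c
#include <stdbool.h>

/* Hypothetical model of the decision printed in the dump: the
   replacement is kept only if it costs no more than the two
   original insns together.  */
static bool combination_is_profitable(int cost_a, int cost_b,
                                      int replacement_cost)
{
    int original_cost = cost_a + cost_b;
    return replacement_cost <= original_cost;
}
```

With the numbers above (4 + 16 = 20 against 32) the model rejects the combination, which matches the "rejecting combination of insns 6 and 7" line.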

This cost comes from the x86 target:

    case VEC_SELECT:
    case VEC_CONCAT:
    case VEC_MERGE:
    case VEC_DUPLICATE:
      /* ??? Assume all of these vector manipulation patterns are
         recognizable.  In which case they all pretty much have the
         same cost.  */
      *total = cost->fabs;
      return true;

If the canonical form of a subvector is vec_select, I guess the cost needs updating; if it is subreg, should the target learn how to handle it properly?
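
To make the first option concrete, here is a rough, standalone sketch of what a finer-grained cost could look like: a vec_select that picks elements 0..n-1 (the low part of the register) is priced like a plain move, and anything else keeps the flat cost. All names and parameters here are hypothetical. This is a model of the idea, not the x86 backend's rtx_costs code, and whether a move is the right price for the low part is itself an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: true when the selector is exactly 0, 1, ..., n-1,
   i.e. the vec_select extracts the low part of the source vector.  */
static bool selects_low_part(const int *sel, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (sel[i] != (int) i)
            return false;
    return true;
}

/* Hypothetical cost model: a low-part extraction costs a move,
   any other selection keeps the flat vector-manipulation cost.  */
static int vec_select_cost(const int *sel, size_t n,
                           int move_cost, int flat_cost)
{
    return selects_low_part(sel, n) ? move_cost : flat_cost;
}
```

With a selector of {0, 1}, a move cost of 4 and a flat cost of 32, this model would return 4, which would no longer exceed the original cost of 20 in the dump.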

[...]
(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v:V4DF 61 [ x ])
       (reg:V4DF 21 xmm0 [ x ])) v.cc:3 1123 {*movv4df_internal}
    (expr_list:REG_DEAD (reg:V4DF 21 xmm0 [ x ])
       (nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:OI 63 [ x ])
       (subreg:OI (reg/v:V4DF 61 [ x ]) 0)) v.cc:4 60 {*movoi_internal_avx}
    (expr_list:REG_DEAD (reg/v:V4DF 61 [ x ])
       (nil)))
(insn 7 6 11 2 (set (reg:V2DF 60 [ <retval> ])
       (subreg:V2DF (reg:OI 63 [ x ]) 0)) v.cc:4 1124 {*movv2df_internal}
    (expr_list:REG_DEAD (reg:OI 63 [ x ])
       (nil)))
(insn 11 7 14 2 (set (reg/i:V2DF 21 xmm0)
       (reg:V2DF 60 [ <retval> ])) v.cc:5 1124 {*movv2df_internal}
    (expr_list:REG_DEAD (reg:V2DF 60 [ <retval> ])
       (nil)))
(insn 14 11 0 2 (use (reg/i:V2DF 21 xmm0)) v.cc:5 -1
    (nil))

I am surprised by the high replacement cost, which prevents the change. Is my approach wrong? Is there an issue with the evaluation of costs?

The approach was suggested by Richard B:
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00197.html


--
Marc Glisse
