Bug 34435 - [4.3 Regression] SSE2 intrinsics - emmintrin with optimisations off and type conversion error
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.3.0
Importance: P1 normal
Target Milestone: 4.3.0
Assignee: Uroš Bizjak
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch, rejects-valid
Depends on:
Blocks:
 
Reported: 2007-12-11 17:35 UTC by lo
Modified: 2007-12-13 18:20 UTC
CC List: 4 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-12-12 21:39:57


Description lo 2007-12-11 17:35:37 UTC
Using snapshot: gcc-4.3-20071109

Problem code follows:
///////
#include <emmintrin.h>

class Vec {
    __m128i vec;
public:
    Vec(int mm) {
        vec = _mm_set1_epi16(mm);
    }
    operator __m128i() const {
        return vec;
    }
};

int main() {
    _mm_shuffle_epi32(Vec(5), _MM_SHUFFLE(3,3,3,3));  // error
}
///////

This compiles fine with e.g. -O2, but with optimisations off, gcc reports "error: can't convert value to a vector".

This seems to be because a macro version of _mm_shuffle_epi32 is used when optimisations are off, and the type conversion from Vec to __m128i can't happen for the first parameter.
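
A possible call-site workaround, sketched under the assumption of an x86 target with SSE2 enabled: convert to __m128i explicitly before the value reaches the intrinsic, so it no longer matters whether the header provides a macro or an inline function. The helper names broadcast_high_lane and low_lane are made up for illustration and are not part of emmintrin.h.

```cpp
#include <emmintrin.h>

// Same wrapper class as in the report above.
class Vec {
    __m128i vec;
public:
    Vec(int mm) { vec = _mm_set1_epi16(static_cast<short>(mm)); }
    operator __m128i() const { return vec; }
};

// Hypothetical helper: broadcast the highest 32-bit lane to all four lanes.
inline __m128i broadcast_high_lane(const Vec& v) {
    // Naming the target type runs Vec::operator __m128i() before the
    // intrinsic (macro or inline function) ever sees the value.
    return _mm_shuffle_epi32(static_cast<__m128i>(v), _MM_SHUFFLE(3, 3, 3, 3));
}

// Hypothetical helper for inspecting a result: low 32 bits of the vector.
inline int low_lane(__m128i x) {
    return _mm_cvtsi128_si32(x);
}
```

With Vec(5), every 16-bit lane holds 5, so each 32-bit lane of the shuffled result holds two 16-bit copies of 5. This only sidesteps the problem in user code; the macro itself still rejects a bare Vec(5) argument.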
Comment 1 Richard Biener 2007-12-12 09:55:10 UTC
Without optimization main() preprocesses to

int main() {
    ((__m128i)__builtin_ia32_pshufd ((__v4si)Vec(5), (((3) << 6) | ((3) << 4) | ((3) << 2) | (3))));
}


with optimization we get instead

int main() {
    _mm_shuffle_epi32(Vec(5), (((3) << 6) | ((3) << 4) | ((3) << 2) | (3)));
}

because with optimization we use an inline function instead of a macro.

Uros - it was you who changed that (and I see it may not be too easy to
make all parties happy here).  My suggestion is to force the same "argument"
type by doing

#define _mm_shuffle_epi32(__A, __B) \
  ((__m128i)__builtin_ia32_pshufd ((__v4si)(__m128i)__A, (int)__B))

instead of

#define _mm_shuffle_epi32(__A, __B) \
  ((__m128i)__builtin_ia32_pshufd ((__v4si)__A, __B))

to match the prototype of _mm_shuffle_epi32 which reads

static __inline __m128i __attribute__((__always_inline__, __artificial__))
_mm_shuffle_epi32 (__m128i __A, const int __mask)
{ 
  return (__m128i)__builtin_ia32_pshufd ((__v4si)__A, __mask);
}

and adjust all !__OPTIMIZE__ macro variants this way.  (at least this makes
this testcase work properly)
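
A minimal sketch of the idea above, kept independent of the real headers so it builds on any GCC or Clang target. Everything here is a stand-in: m128i_t and v4si_t are built with the vector_size extension, fake_pshufd replaces the real builtin, and my_shuffle, shuffle_fn, SHUFFLE_MACRO and Wrapped are hypothetical names. Unlike the macro as quoted, the arguments are parenthesised, which is my own addition. The __OPTIMIZE__ switch reflects the usual reason for having a macro variant at all: the real builtin wants an immediate operand, which an inline function cannot guarantee without optimisation.

```cpp
// Stand-in vector types (GCC/Clang vector_size extension, an assumption).
typedef long long m128i_t __attribute__((vector_size(16)));
typedef int v4si_t __attribute__((vector_size(16)));

// Hypothetical replacement for the builtin: lane i of the result is the
// source lane selected by bits [2i+1:2i] of the mask.
inline v4si_t fake_pshufd(v4si_t v, int mask) {
    v4si_t r = {0, 0, 0, 0};
    for (int i = 0; i < 4; ++i)
        r[i] = v[(mask >> (2 * i)) & 3];
    return r;
}

// Inline-function variant: the parameter type triggers any user-defined
// conversion to m128i_t at the call.
inline m128i_t shuffle_fn(m128i_t a, const int mask) {
    return (m128i_t)fake_pshufd((v4si_t)a, mask);
}

// Macro variant with the extra casts: (m128i_t)(A) first converts a class
// object through its conversion operator, then (v4si_t) reinterprets the
// resulting vector. A direct (v4si_t)(A) on a class object has no such path.
#define SHUFFLE_MACRO(A, N) \
    ((m128i_t)fake_pshufd((v4si_t)(m128i_t)(A), (int)(N)))

// Pick the variant the same way the header does.
#ifdef __OPTIMIZE__
#define my_shuffle(A, N) shuffle_fn((A), (N))
#else
#define my_shuffle(A, N) SHUFFLE_MACRO(A, N)
#endif

// Wrapper class analogous to Vec in the testcase.
class Wrapped {
    m128i_t v;
public:
    Wrapped(int a, int b, int c, int d) {
        v4si_t t = {a, b, c, d};
        v = (m128i_t)t;
    }
    operator m128i_t() const { return v; }
};

// Hypothetical helper: read 32-bit lane i of a result.
inline int lane(m128i_t x, int i) {
    v4si_t t = (v4si_t)x;
    return t[i];
}
```

With these definitions, my_shuffle(Wrapped(10, 20, 30, 40), 0xFF) accepts the class object both with and without optimisation, because both variants now funnel the first argument through the same m128i_t type.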
Comment 2 Uroš Bizjak 2007-12-12 21:39:56 UTC
(In reply to comment #1)
> My suggestion is to force the same "argument"
> type by doing
> 
> #define _mm_shuffle_epi32(__A, __B) \
>   ((__m128i)__builtin_ia32_pshufd ((__v4si)(__m128i)__A, (int)__B))
>
> and adjust all !__OPTIMIZE__ macro variants this way.  (at least this makes
> this testcase work properly)

Thanks for the suggestion. The patch at http://gcc.gnu.org/ml/gcc-patches/2007-12/msg00560.html implements the suggested approach.
Comment 3 uros 2007-12-13 18:19:52 UTC
Subject: Bug 34435

Author: uros
Date: Thu Dec 13 18:19:38 2007
New Revision: 130904

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130904
Log:
        PR target/34435
        * config/i386/emmintrin.h (_mm_shuffle_pd, _mm_extract_epi16,
        _mm_insert_epi16, _mm_shufflehi_epi16, _mm_shufflelo_epi16,
        _mm_shuffle_epi32): Cast non-constant input values to either __m64,
        __m128, __m128i or __m128d in a macro version of the intrinsic.
        Cast constant input values to int.
        * config/i386/ammintrin.h (_mm_extracti_si64, _mm_inserti_si64): Ditto.
        * config/i386/bmmintrin.h (_mm_roti_epi8, _mm_roti_epi16,
        _mm_roti_epi32, _mm_roti_epi64): Ditto.
        * config/i386/smmintrin.h (_mm_blend_epi16, _mm_blend_ps, _mm_blend_pd,
        _mm_dp_ps, _mm_dp_pd, _mm_insert_ps, _mm_extract_ps, _mm_insert_epi8,
        _mm_insert_epi32, _mm_insert_epi64, _mm_extract_epi8, _mm_extract_epi32,
        _mm_extract_epi64, _mm_mpsadbw_epu8, _mm_cmpistrm, _mm_cmpistri,
        _mm_cmpestrm, _mm_cmpestri, _mm_cmpistra, _mm_cmpistrc, _mm_cmpistro,
        _mm_cmpistrs, _mm_cmpistrz, _mm_cmpestra, _mm_cmpestrc, _mm_cmpestro,
        _mm_cmpestrs, _mm_cmpestrz): Ditto.
        * config/i386/tmmintrin.h (_mm_alignr_epi8, _mm_alignr_pi8): Ditto.
        * config/i386/xmmintrin.h (_mm_shuffle_ps, _mm_extract_pi16, _m_pextrw,
        _mm_insert_pi16, _m_pinsrw, _mm_shuffle_pi16, _m_pshufw): Ditto.
        * config/i386/mmintrin-common.h (_mm_round_pd, _mm_round_sd,
        _mm_round_ps, _mm_round_ss): Ditto.

testsuite/ChangeLog:

        PR target/34435
        * g++.dg/other/pr34435.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/other/pr34435.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/ammintrin.h
    trunk/gcc/config/i386/bmmintrin.h
    trunk/gcc/config/i386/emmintrin.h
    trunk/gcc/config/i386/mmintrin-common.h
    trunk/gcc/config/i386/smmintrin.h
    trunk/gcc/config/i386/tmmintrin.h
    trunk/gcc/config/i386/xmmintrin.h
    trunk/gcc/testsuite/ChangeLog

Comment 4 Uroš Bizjak 2007-12-13 18:20:37 UTC
Fixed.