This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

[Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd

From: "ubizjak at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Thu, 17 Sep 2015 13:13:04 +0000
Subject: [Bug target/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
Auto-submitted: auto-generated
References: <bug-67609-4 at http dot gcc dot gnu dot org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
The doc says:

          When used as an lvalue, 'subreg' is a word-based accessor.
          Storing to a 'subreg' modifies all the words of REG that
          overlap the 'subreg', but it leaves the other words of REG
          alone.

          When storing to a normal 'subreg' that is smaller than a word,
          the other bits of the referenced word are usually left in an
          undefined state.  This laxity makes it easier to generate
          efficient code for such instructions.  To represent an
          instruction that preserves all the bits outside of those in
          the 'subreg', use 'strict_low_part' or 'zero_extract' around
          the 'subreg'.
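
The two store rules quoted above can be pictured with a small toy model. This is only a sketch for illustration, not GCC code: the type toy_reg and both helper functions are hypothetical names, and the model assumes a 64-bit word, so a 128-bit register is two words.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not GCC code: a 128-bit register seen as two 64-bit words. */
typedef struct {
    uint64_t word[2];
} toy_reg;

/* Word-sized 'subreg' store: overwrites the one word it overlaps and
   leaves the other word alone. */
static void toy_store_word(toy_reg *r, int idx, uint64_t value)
{
    assert(idx == 0 || idx == 1);
    r->word[idx] = value;
}

/* Sub-word store with 'strict_low_part' semantics: replaces the low
   32 bits of the word and preserves the high 32 bits.  A plain sub-word
   'subreg' store would give no such guarantee for the high bits. */
static void toy_store_strict_low32(toy_reg *r, int idx, uint32_t value)
{
    assert(idx == 0 || idx == 1);
    r->word[idx] = (r->word[idx] & ~UINT64_C(0xFFFFFFFF)) | (uint64_t)value;
}
```

The model deliberately has no function for a plain sub-word store, because the documentation leaves the remaining bits of the word undefined in that case and there is no single behaviour to show.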

However, we expand the assignment to v[0] with:

;; v[0] = b_4(D);

(insn 7 6 0 (set (subreg:DF (reg/v:TI 90 [ v ]) 0)
        (reg/v:DF 88 [ b ])) pr67609.c:8 -1
     (nil))

According to the above explanation, a strict_low_part should be used here.
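
For reference, the source-level behaviour that any expansion of such an assignment has to preserve can be sketched in portable C. This is an assumed illustration, not the test case attached to the bug: the function name replace_low_double is hypothetical, and the SSE2 intrinsics from the bug title are intentionally left out so the sketch compiles on any target.

```c
#include <string.h>

/* Hypothetical helper: copy a pair of doubles into a local array,
   replace element 0, and copy the pair back out. */
static void replace_low_double(const double in[2], double b, double out[2])
{
    double v[2];

    memcpy(v, in, sizeof v);   /* v now holds both input elements */
    v[0] = b;                  /* same shape as the assignment quoted above */
    memcpy(out, v, sizeof v);  /* v[1] must still equal in[1] here */
}
```

Whatever RTL is chosen for the store to v[0], the value of v[1] written by the first copy has to survive it.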

I think this is a middle-end problem, not a target problem.

References:
- [Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd
  - From: bisqwit at iki dot fi