This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #23 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 17 Feb 2016, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671
> 
> --- Comment #22 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Created attachment 37722
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37722&action=edit
> gcc6-pr69671.patch
> 
> Actually, on a closer look, I believe the only problem is the patterns that
> use a vector_move_operand "0C" inside of vec_select with only constants as the
> parallel's operands.  fwprop is able to propagate constants into
> instructions (thus undoing the CSE effect), but it doesn't do anything on
> these, because it also simplifies them, so instead of the expected, say,
>                 (vec_select:V4QI (const_vector:V16QI [
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                             (const_int 0 [0])
>                         ])
>                     (parallel [
>                             (const_int 0 [0])
>                             (const_int 1 [0x1])
>                             (const_int 2 [0x2])
>                             (const_int 3 [0x3])
>                         ]))
> we get in there simplified:
>                 (const_vector:V4QI [
>                         (const_int 0 [0])
>                         (const_int 0 [0])
>                         (const_int 0 [0])
>                         (const_int 0 [0])
>                     ])
> So, by adding extra patterns for that simplification, fwprop is able to do its
> job even if CSE did a better job.

Of course, then I wonder why we didn't simplify this in the first place
when generating RTL and need to wait for fwprop ...

But yes, sounds like the easiest way to go forward.
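
For readers following along, the folding described in the quoted comment can be sketched as a toy model. This is not GCC code: vectors are plain Python lists, and the names vec_select and fold_vec_select are hypothetical helpers chosen for illustration only. It shows why a vec_select of the first four lanes of an all-zero V16QI constant collapses to a four-element zero constant, which is the form the insn patterns would then need to match.

```python
# Toy model only, not GCC's RTL simplifier. Vectors are Python lists.

def vec_select(vector, lanes):
    """Pick the elements of `vector` at the indices listed in `lanes`."""
    return [vector[i] for i in lanes]


def fold_vec_select(vector, lanes):
    """Fold a vec_select whose source vector and lane indices are all
    constants into a single, smaller constant vector."""
    return ("const_vector", vec_select(vector, lanes))


# Analogue of (const_vector:V16QI [0 x 16]) with (parallel [0 1 2 3]).
zero_v16qi = [0] * 16
folded = fold_vec_select(zero_v16qi, [0, 1, 2, 3])
```

Under these assumptions, folded holds a four-element all-zero constant, mirroring the (const_vector:V4QI ...) shown in the quoted dump.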
