This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Emit vperm2[if]128 $0x12/$0x20 as vinsert[if]128 $0/$1
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 8 Nov 2011 11:29:32 +0100
- Subject: Re: [PATCH] Emit vperm2[if]128 $0x12/$0x20 as vinsert[if]128 $0/$1
- References: <20111107212043.GN27375@tyan-ft48-01.lab.bos.redhat.com>
On Mon, Nov 7, 2011 at 10:20 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> I think it is at least more readable and perhaps for some CPUs could
> be faster (for SandyBridge it is the same speed) if we emit a more
> specialized insn over a more generic one.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> In the attachment is my first attempt to do this, in the expander,
> unfortunately that turned out to be pessimizing - seems like IRA or
> reload has issues with the subregs and on
No, this is by design. Please see the comment in
ix86_cannot_change_mode_class, which explains why we prohibit all
nonparadoxical subregs that change size for the SSE/MMX classes.
> 2011-11-07  Jakub Jelinek  <jakub@redhat.com>
>
>        * config/i386/sse.md (*avx_vperm2f128<mode>3_nozero): Emit mask
>        0x12 and 0x20 as vinsert[fi]128 instead of vperm2[fi]128.
OK.
Thanks,
Uros.
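
For readers following the thread, here is a minimal sketch (not part of the patch) of why masks 0x12 and 0x20 of vperm2[if]128 can be emitted as vinsert[if]128 $0 and $1. It is a plain C model of the lane selection, not GCC code and not intrinsics. The names lane128, reg256, model_vperm2x128, model_vinsertx128 and masks_match_inserts are hypothetical and exist only for this illustration. The immediate encoding assumed here is the documented one: imm[1:0] selects the low 128-bit lane of the result, imm[5:4] selects the high lane, and imm[3] / imm[7] zero the respective lane.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model types: a 256-bit register as two 128-bit lanes. */
typedef struct { uint64_t w[2]; } lane128;
typedef struct { lane128 lane[2]; } reg256;  /* lane[0] = low, lane[1] = high */

/* 2-bit selector as used by vperm2[if]128:
   0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high. */
lane128 select_lane(reg256 src1, reg256 src2, unsigned sel)
{
    switch (sel & 3) {
    case 0:  return src1.lane[0];
    case 1:  return src1.lane[1];
    case 2:  return src2.lane[0];
    default: return src2.lane[1];
    }
}

/* Model of vperm2[if]128: imm[1:0] picks the low result lane,
   imm[5:4] picks the high result lane, imm[3] and imm[7] zero them. */
reg256 model_vperm2x128(reg256 src1, reg256 src2, unsigned imm)
{
    lane128 zero = {{0, 0}};
    reg256 dst;
    dst.lane[0] = (imm & 0x08) ? zero : select_lane(src1, src2, imm);
    dst.lane[1] = (imm & 0x80) ? zero : select_lane(src1, src2, imm >> 4);
    return dst;
}

/* Model of vinsert[if]128: copy src1, then overwrite lane imm[0]
   with the 128-bit source operand. */
reg256 model_vinsertx128(reg256 src1, lane128 src2, unsigned imm)
{
    reg256 dst = src1;
    dst.lane[imm & 1] = src2;
    return dst;
}

/* Bytewise comparison; the model structs contain no padding. */
int same(reg256 a, reg256 b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Returns 1 if, in this model, mask 0x12 matches insert $0 and
   mask 0x20 matches insert $1 (using the low lane of src2). */
int masks_match_inserts(void)
{
    reg256 a = { { { {0x11, 0x12} }, { {0x13, 0x14} } } };
    reg256 b = { { { {0x21, 0x22} }, { {0x23, 0x24} } } };

    return same(model_vperm2x128(a, b, 0x12), model_vinsertx128(a, b.lane[0], 0))
        && same(model_vperm2x128(a, b, 0x20), model_vinsertx128(a, b.lane[0], 1));
}
```

Calling masks_match_inserts() from a small main() is enough to exercise the model. Both masks read only the low 128 bits of the second source, which is why the narrower insert form suffices in those two cases.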