This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] For -Os change movabsq $(imm32 << shift), %rX[xip] to movl $imm2, %eX[xip]; shl $shift, %rX[xip] (PR target/82339)
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Alexander Monakov <amonakov at ispras dot ru>, "H.J. Lu" <hjl dot tools at gmail dot com>, Peter Cordes <peter at cordes dot ca>
- Date: Thu, 28 Sep 2017 09:13:23 +0200
- Subject: Re: [PATCH] For -Os change movabsq $(imm32 << shift), %rX[xip] to movl $imm2, %eX[xip]; shl $shift, %rX[xip] (PR target/82339)
- References: <20170927133617.GX1701@tucnak>
On Wed, Sep 27, 2017 at 3:36 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> A movl followed by a shlq by a constant count is 1 byte shorter
> than movabsq, so this patch attempts to use the former form
> unless the flags register is live.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Performance-wise, I'm not really sure which form is a win (on an i7-5960X,
> movl + shlq seems significantly faster on the testcase in the PR, but e.g.
> on an i7-2600 it is the same), so I'm not doing anything for speed yet.
>
> 2017-09-27 Jakub Jelinek <jakub@redhat.com>
>
> PR target/82339
> * config/i386/i386.md (*movdi_internal peephole2): New -Os peephole
> for movabsq $(i32 << shift), r64.
LGTM, but I also have no idea about the performance impact ...
Uros.
> --- gcc/config/i386/i386.md.jj 2017-09-21 09:26:42.000000000 +0200
> +++ gcc/config/i386/i386.md 2017-09-27 10:24:01.520673889 +0200
> @@ -2379,6 +2379,28 @@ (define_split
> gen_lowpart (SImode, operands[1]));
> })
>
> +;; movabsq $0x0012345678000000, %rax is longer
> +;; than movl $0x12345678, %eax; shlq $24, %rax.
> +(define_peephole2
> + [(set (match_operand:DI 0 "register_operand")
> + (match_operand:DI 1 "const_int_operand"))]
> + "TARGET_64BIT
> + && optimize_insn_for_size_p ()
> + && LEGACY_INT_REG_P (operands[0])
> + && !x86_64_immediate_operand (operands[1], DImode)
> + && !x86_64_zext_immediate_operand (operands[1], DImode)
> + && !((UINTVAL (operands[1]) >> ctz_hwi (UINTVAL (operands[1])))
> + & ~(HOST_WIDE_INT) 0xffffffff)
> + && peep2_regno_dead_p (0, FLAGS_REG)"
> + [(set (match_dup 0) (match_dup 1))
> + (parallel [(set (match_dup 0) (ashift:DI (match_dup 0) (match_dup 2)))
> + (clobber (reg:CC FLAGS_REG))])]
> +{
> + int shift = ctz_hwi (UINTVAL (operands[1]));
> + operands[1] = gen_int_mode (UINTVAL (operands[1]) >> shift, DImode);
> + operands[2] = gen_int_mode (shift, QImode);
> +})
> +
> (define_insn "*movsi_internal"
> [(set (match_operand:SI 0 "nonimmediate_operand"
> "=r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m ,?r ,?*Yi,*k,*k ,*rm")
>
> Jakub