[PATCH 2/2] Enable elimination of zext/sext

Richard Biener richard.guenther@gmail.com
Wed Aug 27 10:07:00 GMT 2014


On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
>> 2014-08-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> * calls.c (precompute_arguments): Check
>> promoted_for_signed_and_unsigned_p and set the promoted mode.
>> (promoted_for_signed_and_unsigned_p): New function.
>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p
>> and set the promoted mode.
>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition.
>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
>
> This patch regresses:
>
> Running target unix
> FAIL: libgomp.fortran/simd7.f90   -O2  execution test
> FAIL: libgomp.fortran/simd7.f90   -Os  execution test
>
> on alphaev6-linux-gnu.
>
> The problem can be illustrated with attached testcase with a
> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in
> missing SImode extension after DImode shift of SImode subregs for this
> part:
>
> --cut here--
>   # test.23_12 = PHI <0(37), 1(36)>
>   _242 = ivtmp.181_73 + 2147483645;
>   _240 = _242 * 2;
>   _63 = (integer(kind=4)) _240;
>   if (ubound.6_99 <= 2)
>     goto <bb 39>;
>   else
>     goto <bb 40>;
> ;;    succ:       39
> ;;                40
>
> ;;   basic block 39, loop depth 1
> ;;    pred:       38
>   pretmp_337 = test.23_12 | l_76;
>   goto <bb 45>;
> ;;    succ:       45
>
> ;;   basic block 40, loop depth 1
> ;;    pred:       38
>   _11 = *c_208[0];
>   if (_11 != _63)
>     goto <bb 45>;
>   else
>     goto <bb 42>;
> --cut here--
>
> this expands to:
>
> (code_label 592 591 593 35 "" [0 uses])
>
> (note 593 592 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _63 = (integer(kind=4)) _240;
>
> (insn 594 593 595 (set (reg:SI 538)
>         (const_int 1073741824 [0x40000000])) -1
>      (nil))
>
> (insn 595 594 596 (set (reg:SI 539)
>         (plus:SI (reg:SI 538)
>             (const_int 1073741824 [0x40000000]))) -1
>      (nil))
>
> (insn 596 595 597 (set (reg:SI 537)
>         (plus:SI (reg:SI 539)
>             (const_int -3 [0xfffffffffffffffd]))) -1
>      (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd])
>         (nil)))
>
> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ])
>         (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0)
>             (reg:SI 537))) -1
>      (nil))
>
> (insn 598 597 599 (set (reg:DI 540)
>         (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0)
>             (const_int 1 [0x1]))) -1
>      (nil))
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (reg:DI 540)) -1
>      (nil))
>
> ...
>
> (note 610 609 0 NOTE_INSN_BASIC_BLOCK)
>
> ;; _11 = *c_208[0];
>
> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ])
>         (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4
> A128]))) simd7.f90:12 -1
>      (nil))
>
> ;; if (_11 != _63)
>
> (insn 612 611 613 40 (set (reg:DI 545)
>         (eq:DI (reg:DI 120 [ D.1694 ])
>             (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1
>      (nil))
>
> (jump_insn 613 612 616 40 (set (pc)
>         (if_then_else (eq (reg:DI 545)
>                 (const_int 0 [0]))
>             (label_ref 0)
>             (pc))) simd7.f90:12 -1
>      (int_list:REG_BR_PROB 450 (nil)))
>
> which results in following asm:
>
> $L35:
>     addl $25,$7,$2     # 597    addsi3/1    [length = 4]
>     addq $2,$2,$2     # 598    ashldi3/1    [length = 4]     <------ here
>     bne $24,$L145     # 601    *bcc_normal    [length = 4]
>     lda $4,4($20)     # 627    *adddi_internal/2    [length = 4]
>     ldl $8,0($20)     # 611    *extendsidi2_1/2    [length = 4]
>     lda $3,3($31)     # 74    *movdi/2    [length = 4]
>     cmpeq $8,$2,$2     # 612    *setcc_internal    [length = 4]  <-- compare
>     bne $2,$L40     # 613    *bcc_normal    [length = 4]
>     br $31,$L88     # 2403    jump    [length = 4]
>     .align 4
> ...
>
> Tracking the values with the debugger shows wrong calculation:
>
>    0x000000012000108c <+1788>:  addl    t10,t12,t1
>    0x0000000120001090 <+1792>:  addq    t1,t1,t1
>    ...
>    0x00000001200010a4 <+1812>:  cmpeq   t6,t1,t1
>    0x00000001200010a8 <+1816>:  bne     t1,0x1200010c0 <foo_+1840>
>
> (gdb) si
> 0x000000012000108c      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t10 t12
> t10            0x7      7
> t12            0x7ffffffd       2147483645
>
> (gdb) si
> 0x0000000120001090      17          l = l .or. any (b /= 7 + i)
> (gdb) i r t1
> t1             0xffffffff80000004       -2147483644
>
> (gdb) si
> 18          l = l .or. any (c /= 8 + 2 * i)
> (gdb) i r t1
> t1             0xffffffff00000008       -4294967288
>
> At this point, the calculation should zero-extend SImode value to full
> DImode, since compare operates on DImode values. The problematic insn
> is (insn 599), which is now a DImode assignment instead of
> zero-extend, due to:
>
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt)
>    GET_MODE (target), temp, unsignedp);
>    }
>
> - convert_move (SUBREG_REG (target), temp, unsignedp);
> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
> +    && (GET_CODE (temp) == SUBREG)
> +    && (GET_MODE (target) == GET_MODE (temp))
> +    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
> +  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
> + else
> +  convert_move (SUBREG_REG (target), temp, unsignedp);
>        }
>      else if (nontemporal && emit_storent_insn (target, temp))
>        ;
>
> When compiling this code, we have:
>
> lhs = _63
> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0)
> temp = (subreg:SI (reg:DI 540) 0)
>
> So, the code assumes that it is possible to copy (reg:DI 540) directly
> to (reg:DI 154). However, this is not the case, since we still have
> garbage in the top 32bits.
>
> Reverting the part above fixes the runtime failure, since (insn 599) is now:
>
> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ])
>         (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1
>      (nil))
>
> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*.

Yeah, that makes sense.

Richard.

> Uros.



More information about the Gcc-patches mailing list