[PATCH] Allow fwprop to undo vectorization harm (PR68961)
Uros Bizjak
ubizjak@gmail.com
Tue Jul 12 22:08:00 GMT 2016
On Sun, Jul 10, 2016 at 10:12 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Jul 6, 2016 at 3:18 PM, Richard Biener <rguenther@suse.de> wrote:
>
>>> > 2016-07-04 Richard Biener <rguenther@suse.de>
>>> >
>>> > PR rtl-optimization/68961
>>> > * fwprop.c (propagate_rtx): Allow SUBREGs of VEC_CONCAT and CONCAT
>>> > to simplify to a non-constant.
>>> >
>>> > * gcc.target/i386/pr68961.c: New testcase.
>>>
>>> Thanks, LGTM.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, it causes
>>
>> FAIL: gcc.target/i386/sse2-load-multi.c scan-assembler-times movup 2
>>
>> as the peephole created for that testcase no longer applies as fwprop
>> does
>>
>> In insn 10, replacing
>> (vec_concat:V2DF (vec_select:DF (reg:V2DF 91)
>> (parallel [
>> (const_int 0 [0])
>> ]))
>> (mem:DF (reg/f:DI 95) [0 S8 A128]))
>> with (vec_concat:V2DF (reg:DF 93 [ MEM[(const double *)&a + 8B] ])
>> (mem:DF (reg/f:DI 95) [0 S8 A128]))
>> Changed insn 10
>>
>> resulting in
>>
>> movsd a+8(%rip), %xmm0
>> movhpd a+16(%rip), %xmm0
>>
>> again rather than movupd.
>>
>> Uros, there is probably a missing peephole for the new form - can you
>> fix this as a followup or should I hold on this patch for a bit longer?
>
> No, please proceed with the patch, I'll fix this fallout with a
> followup patch in a couple of days.

Fixed with attached patch.

2016-07-13 Uros Bizjak <ubizjak@gmail.com>

PR rtl-optimization/68961
* config/i386/sse.md (movsd/movhpd to movupd peephole2s): Add new
peephole variant. Use sse_reg_operand predicates.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
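
For readers following the thread, the equivalence that the merged peephole relies on can be sketched in portable C. This is only a model, not GCC code: v2df_model and the model_* helpers are hypothetical names standing in for an SSE register and for the movsd, movhpd and movupd loads. The sketch assumes the two DF memory operands are adjacent in memory, which is presumably the condition that ix86_operands_ok_for_move_multiple guards.

```c
#include <string.h>

/* Hypothetical stand-in for a V2DF register: low and high double. */
typedef struct { double lo, hi; } v2df_model;

/* Models movsd mem, %xmm: load the low element, zero the high one. */
static v2df_model model_movsd(const double *p)
{
    v2df_model r = { *p, 0.0 };
    return r;
}

/* Models movhpd mem, %xmm: keep the low element, load the high one. */
static v2df_model model_movhpd(v2df_model r, const double *p)
{
    r.hi = *p;
    return r;
}

/* Models movupd mem, %xmm: one unaligned load of two adjacent doubles. */
static v2df_model model_movupd(const double *p)
{
    v2df_model r;
    memcpy(&r.lo, p, sizeof(double));
    memcpy(&r.hi, p + 1, sizeof(double));
    return r;
}
```

When the second address is the first plus 8 bytes, model_movhpd (model_movsd (p), p + 1) and model_movupd (p) yield the same value, which is why the two-instruction sequence can be replaced by a single load on TARGET_SSE_UNALIGNED_LOAD_OPTIMAL targets.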
-------------- next part --------------
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md (revision 238258)
+++ config/i386/sse.md (working copy)
@@ -1169,10 +1169,10 @@
;; Merge movsd/movhpd to movupd for TARGET_SSE_UNALIGNED_LOAD_OPTIMAL targets.
(define_peephole2
- [(set (match_operand:V2DF 0 "register_operand")
+ [(set (match_operand:V2DF 0 "sse_reg_operand")
(vec_concat:V2DF (match_operand:DF 1 "memory_operand")
(match_operand:DF 4 "const0_operand")))
- (set (match_operand:V2DF 2 "register_operand")
+ (set (match_operand:V2DF 2 "sse_reg_operand")
(vec_concat:V2DF (vec_select:DF (match_dup 2)
(parallel [(const_int 0)]))
(match_operand:DF 3 "memory_operand")))]
@@ -1181,13 +1181,25 @@
[(set (match_dup 2) (match_dup 4))]
"operands[4] = adjust_address (operands[1], V2DFmode, 0);")
+(define_peephole2
+ [(set (match_operand:DF 0 "sse_reg_operand")
+ (match_operand:DF 1 "memory_operand"))
+ (set (match_operand:V2DF 2 "sse_reg_operand")
+ (vec_concat:V2DF (match_operand:DF 4 "sse_reg_operand")
+ (match_operand:DF 3 "memory_operand")))]
+ "TARGET_SSE2 && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+ && REGNO (operands[4]) == REGNO (operands[2])
+ && ix86_operands_ok_for_move_multiple (operands, true, DFmode)"
+ [(set (match_dup 2) (match_dup 4))]
+ "operands[4] = adjust_address (operands[1], V2DFmode, 0);")
+
;; Merge movlpd/movhpd to movupd for TARGET_SSE_UNALIGNED_STORE_OPTIMAL targets.
(define_peephole2
[(set (match_operand:DF 0 "memory_operand")
- (vec_select:DF (match_operand:V2DF 1 "register_operand")
+ (vec_select:DF (match_operand:V2DF 1 "sse_reg_operand")
(parallel [(const_int 0)])))
(set (match_operand:DF 2 "memory_operand")
- (vec_select:DF (match_operand:V2DF 3 "register_operand")
+ (vec_select:DF (match_operand:V2DF 3 "sse_reg_operand")
(parallel [(const_int 1)])))]
"TARGET_SSE2 && TARGET_SSE_UNALIGNED_STORE_OPTIMAL
&& ix86_operands_ok_for_move_multiple (operands, false, DFmode)"