This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [GCC RFC]A new and simple pass merging paired load store instructions


On Sat, May 17, 2014 at 12:32 AM, Jeff Law <law@redhat.com> wrote:
> On 05/16/14 04:07, Bin.Cheng wrote:
>
>> Yes, I think this one does have a good reason.  The target independent
>> pass just makes sure that two consecutive memory access instructions
>> are free of data dependencies on each other, then feeds them to a
>> back-end hook.  It's the back-end's responsibility to generate the
>> correct instruction.
>
> But given these two memory access insns, there's only a couple ways they're
> likely to combine into a single insn.  We could just as easily have the
> target independent code construct a new insn then try to recognize it.  If
> it's not recognized, then try the other way.
>
> Or is it the case that we're doing something beyond upsizing the mode?
>
>
>
>>   It's not about modifying an existing insn and then recognizing it;
>> sometimes it's about creating a new instruction.  For example, we can
>> generate a simple move insn in ARM mode, but have to generate a
>> parallel instruction in Thumb mode.  The target independent part has
>> no idea how to generate the expected insn.  Moreover, the back-end may
>> check some special conditions too.
>
> But can't you go through movXX to generate either the simple insn on the ARM
> or the PARALLEL on the thumb?
>
Yes, I think it's more than upsizing the mode.  Another example comes
from one of the candidate x86 peephole patches at
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html

The patch wants to do the transformation below, which I think is very
target dependent.

+(define_peephole2
+  [(set (match_operand:DF 0 "register_operand")
+       (match_operand:DF 1 "memory_operand"))
+   (set (match_operand:V2DF 2 "register_operand")
+       (vec_concat:V2DF (match_dup 0)
+        (match_operand:DF 3 "memory_operand")))]
+  "TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+   && REGNO (operands[0]) == REGNO (operands[2])
+   && adjacent_mem_locations (operands[1], operands[3])"
+  [(set (match_dup 2)
+       (unspec:V2DF [(match_dup 4)] UNSPEC_LOADU))]
+
+;; merge movsd/movhpd to movupd when TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+;; is true.
+(define_peephole2
+  [(set (match_operand:DF 0 "memory_operand")
+        (vec_select:DF (match_operand:V2DF 1 "register_operand")
+                      (parallel [(const_int 0)])))
+   (set (match_operand:DF 2 "memory_operand")
+        (vec_select:DF (match_dup 1)
+                       (parallel [(const_int 1)])))]
+  "TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+   && adjacent_mem_locations (operands[0], operands[2])"
+  [(set (match_dup 3)
+        (unspec:V2DF [(match_dup 1)] UNSPEC_STOREU))]
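
For illustration only, the adjacency condition both peepholes rely on can
be sketched in plain C.  This is not the adjacent_mem_locations helper
from the patch; struct mem_ref and is_adjacent_pair are hypothetical
names, and a memory operand is assumed to be simply a base register plus
a constant offset.  The remaining conditions (the
TARGET_SSE_UNALIGNED_*_OPTIMAL tuning flags and the unspec replacement)
have no such generic form, and that part is what I mean by target
dependent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a memory operand:
   base register number, constant byte offset, access size in bytes.  */
struct mem_ref {
    int base_reg;
    int64_t offset;
    int64_t size;
};

/* Return true if B starts exactly where A ends, using the same base
   register and the same access size.  Two such DF (8-byte) accesses
   could be covered by one V2DF (16-byte) access.  */
static bool is_adjacent_pair(struct mem_ref a, struct mem_ref b)
{
    assert(a.size > 0 && b.size > 0);
    return a.base_reg == b.base_reg
        && a.size == b.size
        && b.offset == a.offset + a.size;
}
```

A generic pass could apply a check of this kind to any pair of
data-independent accesses, but whether the merged access is profitable
and which insn represents it is still left to the back-end.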

Thanks,
bin


-- 
Best Regards.

