This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [GCC RFC]A new and simple pass merging paired load store instructions
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: Steven Bosscher <stevenb dot gcc at gmail dot com>, "bin.cheng" <bin dot cheng at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 19 May 2014 14:38:27 +0800
- Subject: Re: [GCC RFC]A new and simple pass merging paired load store instructions
- References: <004d01cf700e$ef1e30e0$cd5a92a0$ at arm dot com> <CABu31nMY6zapHfhr5x4BjZ3kvFuEKuhagBfx2cYbD4bbSwybTg at mail dot gmail dot com> <CAHFci2_PoNZVA15gKGDPet73UeEsay3-Ez6qKDu=E7PUVxgeiA at mail dot gmail dot com> <53763DAA dot 1030104 at redhat dot com>
On Sat, May 17, 2014 at 12:32 AM, Jeff Law <law@redhat.com> wrote:
> On 05/16/14 04:07, Bin.Cheng wrote:
>
>> Yes, I think this one does have a good reason. The target independent
>> pass just makes sure that two consecutive memory access instructions
>> are free of data-dependency with each other, then feeds it to back-end
>> hook. It's back-end's responsibility to generate correct instruction.
>
> But given these two memory access insns, there's only a couple ways they're
> likely to combine into a single insn. We could just as easily have the
> target independent code construct a new insn then try to recognize it. If
> it's not recognized, then try the other way.
>
> Or is it the case that we're doing something beyond upsizing the mode?
>
>
>
>> It's not about modifying an existing insn then recognize it, it's
>> about creating new instruction sometimes. For example, we can
>> generate a simple move insn in Arm mode, while have to generate a
>> parallel instruction in Thumb mode. Target independent part has no
>> idea how to generate an expected insn. Moreover, back-end may check
>> some special conditions too.
>
> But can't you go through movXX to generate either the simple insn on the ARM
> or the PARALLEL on the thumb?
>
Yes, I think it's more than upsizing the mode. Another example comes
from a candidate x86 peephole patch at
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html
The patch wants to do the transformation below, which I think is very
target-dependent.
+(define_peephole2
+  [(set (match_operand:DF 0 "register_operand")
+        (match_operand:DF 1 "memory_operand"))
+   (set (match_operand:V2DF 2 "register_operand")
+        (vec_concat:V2DF (match_dup 0)
+                         (match_operand:DF 3 "memory_operand")))]
+  "TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+   && REGNO (operands[0]) == REGNO (operands[2])
+   && adjacent_mem_locations (operands[1], operands[3])"
+  [(set (match_dup 2)
+        (unspec:V2DF [(match_dup 4)] UNSPEC_LOADU))]
+
+;; merge movsd/movhpd to movupd when TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+;; is true.
+(define_peephole2
+  [(set (match_operand:DF 0 "memory_operand")
+        (vec_select:DF (match_operand:V2DF 1 "register_operand")
+                       (parallel [(const_int 0)])))
+   (set (match_operand:DF 2 "memory_operand")
+        (vec_select:DF (match_dup 1)
+                       (parallel [(const_int 1)])))]
+  "TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+   && adjacent_mem_locations (operands[0], operands[2])"
+  [(set (match_dup 3)
+        (unspec:V2DF [(match_dup 1)] UNSPEC_STOREU))]
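
To make the split between the two halves concrete, here is a minimal
standalone C sketch. It is not GCC code and not the proposed patch:
struct mem_access, accesses_adjacent, accesses_independent,
try_merge_pair and example_target_hook are all hypothetical names, and
the alignment rule in the example hook is an invented condition rather
than any real target's constraint. The idea it shows is only that the
generic part checks adjacency and data dependency, and the back-end
hook has the final say on whether (and how) a merged insn is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one memory access insn.
   This is NOT GCC's rtx representation.  */
struct mem_access {
  int base_reg;   /* register holding the base address */
  long offset;    /* constant byte offset from the base */
  int size;       /* access width in bytes */
  int data_reg;   /* register loaded into, or stored from */
  bool is_load;   /* true for a load, false for a store */
};

/* Target-independent: same kind, same base, same width, and one
   access starts exactly where the other ends (either order).  */
static bool
accesses_adjacent (const struct mem_access *a, const struct mem_access *b)
{
  if (a->is_load != b->is_load
      || a->base_reg != b->base_reg
      || a->size != b->size)
    return false;
  return a->offset + a->size == b->offset
         || b->offset + b->size == a->offset;
}

/* Target-independent: the first access must not change anything the
   second one depends on.  In this simplified model only a load can
   do that, by overwriting the second access's base or data register.  */
static bool
accesses_independent (const struct mem_access *a, const struct mem_access *b)
{
  if (a->is_load)
    return a->data_reg != b->base_reg && a->data_reg != b->data_reg;
  return true;
}

/* Back-end hook type: returns true if the target can emit one insn
   for the pair.  A real hook would also build that insn.  */
typedef bool (*merge_hook_fn) (const struct mem_access *,
                               const struct mem_access *);

/* Generic driver: run the generic checks, then defer to the hook.  */
static bool
try_merge_pair (const struct mem_access *a, const struct mem_access *b,
                merge_hook_fn hook)
{
  assert (a != NULL && b != NULL);
  if (!accesses_adjacent (a, b) || !accesses_independent (a, b))
    return false;
  return hook != NULL && hook (a, b);
}

/* Example hook with an invented target condition: the lower address
   of the pair must be aligned to the combined access width.  */
static bool
example_target_hook (const struct mem_access *a, const struct mem_access *b)
{
  long low = a->offset < b->offset ? a->offset : b->offset;
  return low % (2L * a->size) == 0;
}
```

In this sketch a target that needs extra conditions, or a PARALLEL
rather than a plain move, would express that inside its own hook; the
generic driver stays unchanged.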
Thanks,
bin
--
Best Regards.