This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: question about DSE
- From: Alex Turjan <aturjan at yahoo dot com>
- To: Michael Matz <matz at suse dot de>
- Cc: gcc at gcc dot gnu dot org
- Date: Wed, 9 Sep 2009 09:59:41 -0700 (PDT)
- Subject: Re: question about DSE
Hi Michael,
> My assumption would be these two split loads of HImode are
> generated by your backend from a given SImode MEM.
Indeed, your assumption is right. Below is a mulsi3 expander in which I generate insns of mode HI. operands[1] gets spilled: in the producer BB as a single SI store, while in the consumer BB as two separate HI loads (see a_hi and a_lo).
(define_expand "mulsi3"
[(match_operand:SI 0 "general_register_operand" "")
(match_operand:SI 1 "general_register_operand" "")
(match_operand:SI 2 "general_register_operand" "")]
""
"{
rtx buff = gen_reg_rtx(SImode);
rtx a_lo = gen_rtx_SUBREG(HImode, operands[1], 0);
rtx a_hi = gen_rtx_SUBREG(HImode, operands[1], 2);
rtx b_lo = gen_rtx_SUBREG(HImode, operands[2], 0);
rtx b_hi = gen_rtx_SUBREG(HImode, operands[2], 2);
rtx r_hi = gen_rtx_SUBREG(HImode, buff, 2);
emit_insn(gen_umulhisi3(buff, a_lo, b_lo));
emit_insn(gen_machi3(r_hi, a_hi, b_lo, r_hi));
emit_insn(gen_machi3(r_hi, a_lo, b_hi, r_hi));
emit_move_insn(operands[0], buff);
DONE;
}")
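For reference, the arithmetic this expander is meant to implement can be sketched in plain C. This is only an illustration: the function name mulsi3_via_hi is hypothetical, and it assumes that machi3 is a 16-bit multiply-accumulate (r = r + x * y, truncated to 16 bits) and that subreg byte 0 selects the low half. Neither assumption is stated in the pattern itself.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the port): a 32-bit multiply built
   from 16-bit pieces, mirroring the insn sequence of the expander. */
uint32_t mulsi3_via_hi(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu);   /* subreg:HI at byte 0 (assumed low half) */
    uint16_t a_hi = (uint16_t)(a >> 16);       /* subreg:HI at byte 2 (assumed high half) */
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu);
    uint16_t b_hi = (uint16_t)(b >> 16);

    /* umulhisi3: full 16x16 -> 32 unsigned product of the low halves. */
    uint32_t buff = (uint32_t)a_lo * (uint32_t)b_lo;

    /* r_hi is the high half of buff. */
    uint16_t r_hi = (uint16_t)(buff >> 16);

    /* machi3 (assumed semantics): r_hi += x * y, keeping only 16 bits.
       The casts to uint32_t avoid signed overflow from integer promotion. */
    r_hi = (uint16_t)(r_hi + (uint32_t)a_hi * (uint32_t)b_lo);
    r_hi = (uint16_t)(r_hi + (uint32_t)a_lo * (uint32_t)b_hi);

    /* Recombine: low half comes from buff, high half from r_hi. */
    return ((uint32_t)r_hi << 16) | (buff & 0xFFFFu);
}
```

The a_hi * b_hi term is omitted because it only contributes to bits 32 and above, which an SImode result discards.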
> If so, you need
> to make sure to copy the MEM_ALIAS_SET, at least for spill slots (better
> for everything) into the newly generated HImode mems. For spill slots
> it's not enough to set it to zero.
I get your point, but as the SI->HI generation takes place in the expand, it doesn't help to copy the MEM_ALIAS_SET because the operands are pseudo regs.
However, to get a correct implementation I did the following. Instead of doing the split in the expand (as shown above), I made use of the following define_insn_and_split:
(define_expand "mulsi3"
[(parallel
[(set (match_operand:SI 0 "register_operand" "")
(mult:SI (match_operand:SI 1 "register_operand" "")
(match_operand:SI 2 "nonmemory_operand" "")))
(clobber (match_operand:SI 3 "register_operand" ""))
]
)
]
""
"{
operands[3] = gen_reg_rtx(SImode);
}")
(define_insn_and_split "*mulsi3"
[(parallel[(set (match_operand:SI 0 "register_operand" "=d,d")
(mult:SI (match_operand:SI 1 "register_operand" "d,d")
(match_operand:SI 2 "nonmemory_operand" "d,I")))
(clobber (match_operand:SI 3 "register_operand" "=d,d"))
])]
""
"#"
"reload_completed"
[(clobber (const_int 0))]
"{
rtx a_lo = gen_rtx_SUBREG(HImode, operands[1], 0);
rtx a_hi = gen_rtx_SUBREG(HImode, operands[1], 2);
rtx b_lo = gen_rtx_SUBREG(HImode, operands[2], 0);
rtx b_hi = gen_rtx_SUBREG(HImode, operands[2], 2);
rtx r_hi = gen_rtx_SUBREG(HImode, operands[3], 2);
emit_insn(gen_umulhisi3(operands[3], a_lo, b_lo));
emit_insn(gen_machi3(r_hi, a_hi, b_lo, r_hi));
emit_insn(gen_machi3(r_hi, a_lo, b_hi, r_hi));
emit_move_insn(operands[0], operands[3]);
DONE;
}")
By using this define_insn_and_split with the split condition "reload_completed",
I ensure that register allocation takes place on the operands of the "mulsi3" instruction as defined by the define_expand construct. In this way, instead of the two separate HI loads (from my previous mail), I get only one SI load, which aliases with the SI store. As a consequence, the SI store is no longer removed.
1. What do you think about this implementation using define_insn_and_split?
2. Is it true that in define_expand constructs I should avoid introducing subregs?
thanks,
Alex