This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Mode change for bswap pattern expansion
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Paulo Matos <pmatos at broadcom dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Mon, 27 Jan 2014 16:05:48 +0000
- Subject: Re: Mode change for bswap pattern expansion
- Authentication-results: sourceware.org; auth=none
- References: <19EB96622A777C4AB91610E763265F463F12AC at SJEXCHMB14 dot corp dot ad dot broadcom dot com>
Paulo Matos <pmatos@broadcom.com> writes:
> On a vector processor we can do a bswapsi with two instructions, by first rotating half-words (16 bits) by 8 and then rotating full words by 16.
> However, this means expanding:
> (set (match_operand:SI 0 "register_operand" "")
>      (bswap:SI (match_operand:SI 1 "register_operand" "")))
>
> to:
> (set (match_dup:V2HI 0)
>      (rotate:V2HI (match_dup:V2HI 1)
>                   (const_int 8)))
> (set (match_dup:SI 0)
>      (rotate:SI (match_dup:SI 0)
>                 (const_int 16)))
>
> This is obviously not correct, because match_dup cannot set the mode. The point I am trying to make is that I can't find a good way to deal with the mode changes. I don't think GCC is too happy if I change the modes of the same operand from one instruction to the other right? The only other way is to emit paradoxical subregs. So something along these lines:
> (set (subreg:V2HI (match_dup 0) 0)
>      (rotate:V2HI (subreg:V2HI (match_dup 1) 0)
>                   (const_int 8)))
> (set (match_dup 0)
>      (rotate:SI (match_dup 0)
>                 (const_int 16)))
It's usually better not to hard-code the subregs in the pattern.
Instead you could use C code to create the subregs, e.g.:
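  [(set (match_dup 3)
        (rotate:V2HI (match_dup 2)
                     (const_int 8)))
   (set (match_dup 0)
        (rotate:SI (match_dup 4)
                   (const_int 16)))]
  ""
{
  /* View the SImode input as V2HImode.  */
  operands[2] = gen_lowpart (V2HImode, operands[1]);
  /* Fresh V2HImode pseudo for the result of the halfword rotate.  */
  operands[3] = gen_reg_rtx (V2HImode);
  /* View that intermediate result as SImode for the word rotate.  */
  operands[4] = gen_lowpart (SImode, operands[3]);
}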
so that any hard registers are handled correctly. Alternatively, it might
be easier to write the expansion in C using a series of
emit_insn (gen_* (...)) calls instead.
BTW, a paradoxical subreg is one whose outer mode is strictly larger
than the inner mode. V2HI and SI are the same size, so the subregs in
your example are ordinary ones, not paradoxical.
MIPS uses essentially the same sequence, except that it has a special
instruction to do the first rotate (WSBH), rather than it being an instance
of a general vector rotate. For MIPS we just model it as an unspec SImode
operation. Maybe that would be easier here too.
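As a sanity check on the two-step sequence discussed in this thread, here
is a minimal sketch in plain C (not GCC port code). The function names are
hypothetical and chosen only for illustration. It assumes that rotating a
16-bit half by 8 swaps its two bytes, and that rotating the 32-bit word by
16 swaps its two halves:

```c
#include <assert.h>
#include <stdint.h>

/* Step 1: rotate each 16-bit half of the word by 8 bits, which swaps the
   two bytes inside each half (the V2HI rotate in the expansion).  */
static uint32_t
rotate_halfwords_by_8 (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

/* Step 2: rotate the full 32-bit word by 16 bits, which swaps the two
   halves (the SI rotate in the expansion).  */
static uint32_t
rotate_word_by_16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

/* Hypothetical helper: both steps combined.  */
static uint32_t
bswap32_two_step (uint32_t x)
{
  return rotate_word_by_16 (rotate_halfwords_by_8 (x));
}

/* Reference byte swap, written out byte by byte for comparison.  */
static uint32_t
bswap32_reference (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}
```

For example, 0x11223344 becomes 0x22114433 after the halfword rotate and
0x44332211 after the word rotate, which matches the reference byte swap.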
Thanks,
Richard