[PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]
Richard Biener
rguenther@suse.de
Mon Jul 12 08:57:14 GMT 2021
On Mon, 12 Jul 2021, guojiufu wrote:
> On 2021-07-12 14:20, Richard Biener wrote:
> > On Fri, 9 Jul 2021, Segher Boessenkool wrote:
> >
> >> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
> >> > I wonder if there's a way to query the target what modes the doloop
> >> > pattern can handle (not being too familiar with the doloop code).
> >>
> >> You can look what modes are allowed for operand 0 of doloop_end,
> >> perhaps? Although that is a define_expand, not a define_insn, so it is
> >> hard to introspect.
> >>
> >> > Why do you need to do any checks besides the new type being able to
> >> > represent all IV values? The original doloop IV will never wrap
> >> > (OTOH if niter is U*_MAX then we compute niter + 1 which will become
> >> > zero ... I suppose the doloop might still do the correct thing here
> >> > but it also still will with a IV with larger type).
>
> The issue comes from U*_MAX (original short MAX), as you said: on which
> niter + 1 becomes zero. And because the step for doloop is -1; then, on
> larger type 'zero - 1' will be a very large number on larger type
> (e.g. 0xff...ff); but on the original short type 'zero - 1' is a small value
> (e.g. "0xff").
But for the larger type the small type MAX + 1 fits and does not yield
zero so it should still work exactly as before, no? Of course you
have to compute the + 1 in the larger type.
> >>
> >> doloop_valid_p guarantees it is simple and doesn't wrap.
> >>
> >> > I'd have expected sth like
> >> >
> >> > ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED
> >> > (ntype));
> >> >
> >> > thus the decision made using a mode - which is also why I wonder
> >> > if there's a way to query the target for this. As you say,
> >> > it _may_ be fast, so better check (somehow).
>
>
> I was also thinking of using hooks like type_for_size/type_for_mode.
> /* Use type in word size may fast. */
> if (TYPE_PRECISION (ntype) < BITS_PER_WORD
> && Wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE
> (ntype))))
> {
> ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
> base = fold_convert (ntype, base);
> }
>
> As you pointed out, this does not query the mode from targets.
> As Segher pointed out "doloop_end" checks unsupported mode, while it seems
> not easy to use it in tree-ssa-loop-ivopts.c.
> For implementations of doloop_end, tartgets like rs6000/aarch64/ia64 requires
> Pmode/DImode; while there are other targets that work on other 'mode' (e.g.
> SI).
>
>
> In doloop_optimize, there is code:
>
> ```
> mode = desc->mode;
> .....
> doloop_reg = gen_reg_rtx (mode);
> rtx_insn *doloop_seq = targetm.gen_doloop_end (doloop_reg,
> start_label);
>
> word_mode_size = GET_MODE_PRECISION (word_mode);
> word_mode_max = (HOST_WIDE_INT_1U << (word_mode_size - 1) << 1) - 1;
> if (! doloop_seq
> && mode != word_mode
> /* Before trying mode different from the one in that # of
> iterations is
> computed, we must be sure that the number of iterations fits into
> the new mode. */
> && (word_mode_size >= GET_MODE_PRECISION (mode)
> || wi::leu_p (iterations_max, word_mode_max)))
> {
> if (word_mode_size > GET_MODE_PRECISION (mode))
> count = simplify_gen_unary (ZERO_EXTEND, word_mode, count, mode);
> else
> count = lowpart_subreg (word_mode, count, mode);
> PUT_MODE (doloop_reg, word_mode);
> doloop_seq = targetm.gen_doloop_end (doloop_reg, start_label);
> }
> if (! doloop_seq)
> {
> if (dump_file)
> fprintf (dump_file,
> "Doloop: Target unwilling to use doloop pattern!\n");
> return false;
> }
> ```
> The above code first tries the mode of niter_desc by call
> targetm.gen_doloop_end
> to see if the target can generate doloop insns, if fail, then try to use
> 'word_mode' against gen_doloop_end.
>
>
> >>
> >> Almost all targets just use Pmode, but there is no such guarantee I
> >> think, and esp. some targets that do not have machine insns for this
> >> (but want to generate different code for this anyway) can do pretty much
> >> anything.
> >>
> >> Maybe using just Pmode here is good enough though?
> >
> > I think Pmode is a particularly bad choice and I'd prefer word_mode
> > if we go for any hardcoded mode. s390x for example seems to handle
> > both SImode and DImode (but names the helper gen_doloop_si64
> > for SImode?!). But indeed it looks like somehow querying doloop_end
> > is going to be difficult since the expander doesn't have any mode,
> > so we'd have to actually try emit RTL here.
>
> Instead of using hardcode mode, maybe we could add a hook for targets to
> return
> the preferred mode.
That's a possiblity of course. Like the following which just shows the
default implementation then (pass in current mode, return a more preferred
mode or the mode itself)
enum machine_mode
prefered_doloop_mode (enum machine_mode mode)
{
return mode;
}
>
> Thanks for those valuable comments!
>
> Jiufu Guo
>
>
>
> >
> > Richard.
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
More information about the Gcc-patches
mailing list