This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Intrinsic functions for SPARC VIS instructions.
Eric Botcazou <ebotcazou@libertysurf.fr> writes:
> > This patch adds a bunch of intrinsics functions for harder to express
> > instructions. This has been tested on sparc-linux and sparc64-linux with
> > no new regressions. All the new tests pass.
>
> +
> +/* Create builtin functions for VIS 1 instructions. */
> +
>
> I think Sun uses VIS 1.0 throughout the documentation.
Thanks.
>
> + tree ptr_ftype_ptr_ptr = build_function_type_list (ptr_type_node,
> + ptr_type_node,
> + ptr_type_node, 0);
> [...]
>
> + def_builtin ("__builtin_vis_alignaddr", CODE_FOR_alignaddrsi_vis,
> + ptr_ftype_ptr_ptr);
>
> The VIS 2.1 manual advertises the following prototype:
> void *vis_alignaddr(void *addr, int offset);
>
> Do you have another source?
I've been using the UltraSPARC IIi users manual. The uses Sun expects
are addr + offset, so I guess we should use that as well.
>
> + /* Data aligning. */
> + def_builtin ("__builtin_vis_faligndatav4hi", CODE_FOR_faligndatav4hi_vis,
> + v4hi_ftype_v4hi_v4hi);
> + def_builtin ("__builtin_vis_faligndatav8qi", CODE_FOR_faligndatav8qi_vis,
> + v8qi_ftype_v8qi_v8qi);
> + def_builtin ("__builtin_vis_faligndatav2si", CODE_FOR_faligndatav2si_vis,
> + v2si_ftype_v2si_v2si);
>
> I think you should define __builtin_vis_faligndatadi too.
Ok.
>
> + mode[arg_count] = tmode;
> + op[arg_count] = target;
> +
> + if (icode == CODE_FOR_alignaddrsi_vis && Pmode == DImode)
> + icode = CODE_FOR_alignaddrdi_vis;
> +
> + if (target == 0
> + || GET_MODE (target) != tmode
> + || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> + op[arg_count] = gen_reg_rtx (tmode);
>
> Please use an 'else' so that op[arg_count] is not written twice.
Ok.
> I see you don't generate the move if the target is not NULL but not used as
> op[0]. I presume it's the correct behaviour for TARGET_EXPAND_BUILTIN?
>
>
> + for (; arglist; arglist = TREE_CHAIN (arglist))
> + {
> + tree arg = TREE_VALUE (arglist);
>
> I personally don't like very much 'for' loops without initializer. Please
> move the initialization of 'arglist' to here.
I've fixed this for the next patch.
>
> + gcc_assert (arg_count < 4);
>
> + switch (arg_count)
> + {
> [...]
> + default:
> + gcc_unreachable ();
> + }
>
> I think this is redundant. I would remove the assert.
Except the assert ensures we don't write over random memory. That
doesn't really matter since we would abort anyway.
>
> +
> +;; Hard to generate VIS instructions. We have builtins for these
>
> Missing period and newline (it's a multi-insn comment).
Ok.
> +;; It may be possible to describe this operation as (1 indexed):
> +;; (vec_select (vec_duplicate (vec_duplicate (vec_concat 1 2)))
> +;; 1,5,10,14,19,23,28,32)
> +;; However (vec_merge:V8QI [(V4QI) (V4QI)] (10101010 = 170) doesn't work!
> +(define_insn "fpmerge_vis"
> + [(set (match_operand:V8QI 0 "register_operand" "=e")
> + (unspec:V8QI [(match_operand:V4QI 1 "register_operand" "f")
> + (match_operand:V4QI 2 "register_operand" "f")]
> + UNSPEC_FPMERGE))]
> + "TARGET_VIS"
> + "fpmerge\t%1, %2, %0"
> + [(set_attr "type" "fga")
> + (set_attr "fptype" "double")])
>
> It could be nice to write the modes in the (vec_select (...)) expression. And
> I don't understand the 'However': is there a relationship between the two
> remarks? If no, I think that 'Note that' would be better.
There is no relationship. I over use the word 'however'.
> I presume vec_merge wants the same mode for all its operands?
Yes. I've updated the comment.
> +;; Only one of the following two insns can be a multiply.
> +(define_insn "fmul8x16au_vis"
> + [(set (match_operand:V4HI 0 "register_operand" "=e")
> + (mult:V4HI (match_operand:V4QI 1 "register_operand" "f")
> + (match_operand:V2HI 2 "register_operand" "f")))]
> + "TARGET_VIS"
> + "fmul8x16au\t%1, %2, %0"
> + [(set_attr "type" "fpmul")
> + (set_attr "fptype" "double")])
> +
> +(define_insn "fmul8x16al_vis"
> + [(set (match_operand:V4HI 0 "register_operand" "=e")
> + (unspec:V4HI [(match_operand:V4QI 1 "register_operand" "f")
> + (match_operand:V2HI 2 "register_operand" "f")]
> + UNSPEC_MUL16AL))]
> + "TARGET_VIS"
> + "fmul8x16al\t%1, %2, %0"
> + [(set_attr "type" "fpmul")
> + (set_attr "fptype" "double")])
>
> How did you chose which one deserves to be modelled as a multiply? And does
> it make sense to model it as a multiply in the first place? Same for the
> other 2 pairs.
The first one. From the testing I've done I've noticed that multiplying by
0 returns zero, but multiplying by non-zero, e.g. 1, generates a fmul8?x16?
instruction, which is what we want.
>
> +;; This probably isn't exactly safe for DF mode.
> +;; In fact, it's not meant for anything other than V8QI. However, since a
> +;; short* or int* will be aligned along 2 byte and 4 byte boundaries anyway,
> +;; this will work for V4HI and V2SI as well.
> +(define_insn "faligndata<V64:mode>_vis"
> + [(set (match_operand:V64 0 "register_operand" "=e")
> + (unspec:V64 [(match_operand:V64 1 "register_operand" "e")
> + (match_operand:V64 2 "register_operand" "e")]
> + UNSPEC_ALIGNDATA))]
> + "TARGET_VIS"
> + "faligndata\t%1, %2, %0"
> + [(set_attr "type" "fga")
> + (set_attr "fptype" "double")])
>
> I think you should use V64I here, this would eliminate the DFmode problem.
> And I don't really understand the second part of the comment: why do you
> think the alignment of submode should be taken into account ?
The thing is faligndata uses part of the gsr to align select which parts of
%1 and %2 are put into %0. So if someone where to do something crazy like
align the address of a pointer that is 3 bytes off an 8 byte boundary then
load two V4HI vectors and try to align those two vectors, the result would
not be what was expected.
>
> +(define_mode_macro P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
> +
> +(define_insn "alignaddr<P:mode>_vis"
> + [(set (match_operand:P 0 "register_operand" "=r")
> + (unspec:P [(match_operand:P 1 "reg_or_0_operand" "%rJ")
> + (match_operand:P 2 "register_operand" "r")]
> + UNSPEC_ALIGNADDR))]
> + "TARGET_VIS"
> + "alignaddr\t%1, %2, %0")
>
> I think the insn is fully symmetric wrt to its 2 operands so you should write
> a fully symmetric pattern here. You also need to use '%r' for operands that
> might be zero (see sparc.c:print_operand).
Are you saying both operands should be:
(match_operand:P ? "reg_or_0_operand" "%r")
> It occured to me that we could use alignaddr/faligndata to implement
> byte-aligned load operations in 64-bit mode (see pages 70 and 71 in the VIS
> manual). It turns out that this would exactly match RTH's unaligned load
> patch http://gcc.gnu.org/ml/gcc-patches/2004-11/msg00444.html when the
> bitfield is byte-aligned (probably modulo signedness).
Probably, we can also use them to fixed the vectorizer fails. I don't know
how to do that yet though.
>
> Btw, is it mandatory to have non *-prefixed patterns to have the corresponding
> CODE_FOR_* symbols? If no, I think all patterns should be prefixed with '*'.
Yes, using non-* patterns makes the expander much simpler.
> --
> Eric Botcazou
>
--
Thanks,
Jim
http://www.student.cs.uwaterloo.ca/~ja2morri/
http://phython.blogspot.com
http://open.nit.ca/wiki/?page=jim