This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] Intrinsic functions for SPARC VIS instructions.



Eric Botcazou <ebotcazou@libertysurf.fr> writes:

> >   This patch adds a bunch of intrinsics functions for harder to express
> > instructions.  This has been tested on sparc-linux and sparc64-linux with
> > no new regressions.  All the new tests pass.
> 
> +
> +/* Create builtin functions for VIS 1 instructions.  */
> +
> 
> I think Sun uses VIS 1.0 throughout the documentation.

 Thanks.

> 
> +  tree ptr_ftype_ptr_ptr = build_function_type_list (ptr_type_node,
> +						     ptr_type_node,
> +						     ptr_type_node, 0);
> [...]
> 
> +  def_builtin ("__builtin_vis_alignaddr", CODE_FOR_alignaddrsi_vis,
> +	       ptr_ftype_ptr_ptr);
> 
> The VIS 2.1 manual advertises the following prototype:
>   void *vis_alignaddr(void *addr, int offset);
> 
> Do you have another source?

 I've been using the UltraSPARC IIi users manual.  The uses Sun expects
are addr + offset, so I guess we should use that as well.

> 
> +  /* Data aligning.  */
> +  def_builtin ("__builtin_vis_faligndatav4hi", CODE_FOR_faligndatav4hi_vis,
> +	       v4hi_ftype_v4hi_v4hi);
> +  def_builtin ("__builtin_vis_faligndatav8qi", CODE_FOR_faligndatav8qi_vis,
> +	       v8qi_ftype_v8qi_v8qi);
> +  def_builtin ("__builtin_vis_faligndatav2si", CODE_FOR_faligndatav2si_vis,
> +	       v2si_ftype_v2si_v2si);
> 
> I think you should define __builtin_vis_faligndatadi too.

 Ok.

> 
> +  mode[arg_count] = tmode;
> +  op[arg_count] = target;
> +
> +  if (icode == CODE_FOR_alignaddrsi_vis && Pmode == DImode)
> +    icode = CODE_FOR_alignaddrdi_vis;
> +
> +  if (target == 0
> +      || GET_MODE (target) != tmode
> +      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> +    op[arg_count] = gen_reg_rtx (tmode);
> 
> Please use an 'else' so that op[arg_count] is not written twice.

 Ok.

> I see you don't generate the move if the target is not NULL but not used as 
> op[0].  I presume it's the correct behaviour for TARGET_EXPAND_BUILTIN?
> 
> 
> +  for (; arglist; arglist = TREE_CHAIN (arglist))
> +    {
> +      tree arg =  TREE_VALUE (arglist);
> 
> I personally don't like very much 'for' loops without initializer.  Please 
> move the initialization of 'arglist' to here.

 I've fixed this for the next patch.

> 
> +      gcc_assert (arg_count < 4);
> 
> +  switch (arg_count)
> +    {
> [...]
> +    default:
> +      gcc_unreachable ();
> +    }
> 
> I think this is redundant.  I would remove the assert.

 Except the assert ensures we don't write over random memory.  That
doesn't really matter since we would abort anyway.

> 
> +
> +;; Hard to generate VIS instructions.  We have builtins for these
> 
> Missing period and newline (it's a multi-insn comment).

 Ok.

> +;; It may be possible to describe this operation as (1 indexed):
> +;; (vec_select (vec_duplicate (vec_duplicate (vec_concat 1 2)))
> +;;  1,5,10,14,19,23,28,32)
> +;; However (vec_merge:V8QI [(V4QI) (V4QI)] (10101010 = 170) doesn't work!
> +(define_insn "fpmerge_vis"
> +  [(set (match_operand:V8QI 0 "register_operand" "=e")
> +        (unspec:V8QI [(match_operand:V4QI 1 "register_operand" "f")
> +                      (match_operand:V4QI 2 "register_operand" "f")]
> +         UNSPEC_FPMERGE))]
> + "TARGET_VIS"
> + "fpmerge\t%1, %2, %0"
> + [(set_attr "type" "fga")
> +  (set_attr "fptype" "double")])
> 
> It could be nice to write the modes in the (vec_select (...)) expression.  And 
> I don't understand the 'However': is there a relationship between the two 
> remarks?  If no, I think that 'Note that' would be better.

 There is no relationship.  I over use the word 'however'.

> I presume vec_merge wants the same mode for all its operands?

 Yes.  I've updated the comment.

> +;; Only one of the following two insns can be a multiply.
> +(define_insn "fmul8x16au_vis"
> +  [(set (match_operand:V4HI 0 "register_operand" "=e")
> +        (mult:V4HI (match_operand:V4QI 1 "register_operand" "f")
> +                   (match_operand:V2HI 2 "register_operand" "f")))]
> +  "TARGET_VIS"
> +  "fmul8x16au\t%1, %2, %0"
> +  [(set_attr "type" "fpmul")
> +   (set_attr "fptype" "double")])
> +
> +(define_insn "fmul8x16al_vis"
> +  [(set (match_operand:V4HI 0 "register_operand" "=e")
> +        (unspec:V4HI [(match_operand:V4QI 1 "register_operand" "f")
> +                      (match_operand:V2HI 2 "register_operand" "f")]
> +         UNSPEC_MUL16AL))]
> +  "TARGET_VIS"
> +  "fmul8x16al\t%1, %2, %0"
> +  [(set_attr "type" "fpmul")
> +   (set_attr "fptype" "double")])
> 
> How did you chose which one deserves to be modelled as a multiply?  And does 
> it make sense to model it as a multiply in the first place?  Same for the 
> other 2 pairs.

 The first one.  From the testing I've done I've noticed that multiplying by
0 returns zero, but multiplying by non-zero, e.g. 1, generates a fmul8?x16?
instruction, which is what we want.

> 
> +;; This probably isn't exactly safe for DF mode.
> +;; In fact, it's not meant for anything other than V8QI.  However, since a
> +;; short* or int* will be aligned along 2 byte and 4 byte boundaries anyway,
> +;; this will work for V4HI and V2SI as well.
> +(define_insn "faligndata<V64:mode>_vis"
> +  [(set (match_operand:V64 0 "register_operand" "=e")
> +        (unspec:V64 [(match_operand:V64 1 "register_operand" "e")
> +                     (match_operand:V64 2 "register_operand" "e")]
> +         UNSPEC_ALIGNDATA))]
> +  "TARGET_VIS"
> +  "faligndata\t%1, %2, %0"
> +  [(set_attr "type" "fga")
> +   (set_attr "fptype" "double")])
> 
> I think you should use V64I here, this would eliminate the DFmode problem.  
> And I don't really understand the second part of the comment: why do you 
> think the alignment of submode should be taken into account ?

 The thing is faligndata uses part of the gsr to align select which parts of
%1 and %2 are put into %0.  So if someone where to do something crazy like
align the address of a pointer that is 3 bytes off an 8 byte boundary then 
load two V4HI vectors and try to align those two vectors, the result would
not be what was expected.
 
> 
> +(define_mode_macro P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
> +
> +(define_insn "alignaddr<P:mode>_vis"
> +  [(set (match_operand:P 0 "register_operand" "=r")
> +        (unspec:P [(match_operand:P 1 "reg_or_0_operand" "%rJ")
> +                   (match_operand:P 2 "register_operand" "r")]
> +         UNSPEC_ALIGNADDR))]
> +  "TARGET_VIS"
> +  "alignaddr\t%1, %2, %0")
> 
> I think the insn is fully symmetric wrt to its 2 operands so you should write 
> a fully symmetric pattern here.  You also need to use '%r' for operands that 
> might be zero (see sparc.c:print_operand).

 Are you saying both operands should be:
(match_operand:P ? "reg_or_0_operand" "%r")

> It occured to me that we could use alignaddr/faligndata to implement 
> byte-aligned load operations in 64-bit mode (see pages 70 and 71 in the VIS 
> manual).  It turns out that this would exactly match RTH's unaligned load 
> patch http://gcc.gnu.org/ml/gcc-patches/2004-11/msg00444.html when the 
> bitfield is byte-aligned (probably modulo signedness).

 Probably, we can also use them to fixed the vectorizer fails.  I don't know
how to do that yet though.

> 
> Btw, is it mandatory to have non *-prefixed patterns to have the corresponding 
> CODE_FOR_* symbols?  If no, I think all patterns should be prefixed with '*'.

 Yes, using non-* patterns makes the expander much simpler.
 
> -- 
> Eric Botcazou
> 

-- 
Thanks,
Jim

http://www.student.cs.uwaterloo.ca/~ja2morri/
http://phython.blogspot.com
http://open.nit.ca/wiki/?page=jim


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]