This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] lno branch merge -- vectorizer patch #4


> I was thinking that the DECL for the builtin might be placed in

....where?

> (2) Force the target to use AND-style addressing, like Alpha.  So
>
>            op0 = expand_expr (exp1, NULL_RTX, VOIDmode, EXPAND_SUM);
>            op0 = memory_address (mode, op0);
>
>            if (code == ALIGN_INDIRECT_REF)
>              {
>                int align = TYPE_ALIGN (type);
>                op0 = gen_rtx_AND (Pmode, op0, GEN_INT (-align));
>                op0 = memory_address (mode, op0);
>              }

so with this approach the vectorizer would assume that ALIGN_INDIRECT_REF
is always supported, and the expander will always generate the AND
sequence. We may need to add something later in the compiler to get rid of
that sequence if it's not needed (in combine?)

> What I think makes most sense is to add a target hook,
>
>            targetm.vectorize.misaligned_mem_ok (mode)
>
> which says that, STRICT_ALIGNMENT or not, the normal "mov<mode>"
> pattern is prepared to look at MEM_ALIGN of any operand that is
> a memory, and fix things up if it doesn't work.
>
> So, for instance, i386 MMX move patterns don't have to change at
> all, since the hardware handles it.  But the SSE move patters get
> adjusted to emit the couple of instructions needed.
>
> The default version of this hook could be just !STRICT_ALIGNMENT.

so in expand_expr_real_1, MISALIGNED_INDIRECT_REF would be treated exactly
like a regular INDIRECT_REF except we would also have:
             if (code == MISALIGNED_INDIRECT_REF
                 && !targetm.vectorize.misaligned_mem_ok)
               abort ();
right?


thanks,

dorit




                                                                                                                                      
                      Richard Henderson                                                                                               
                      <rth@redhat.com>          To:       Dorit Naishlos/Haifa/IBM@IBMIL                                              
                      Sent by:                  cc:       gcc-patches@gcc.gnu.org, mark@codesourcery.com, Ayal Zaks/Haifa/IBM@IBMIL   
                      gcc-patches-owner@        Subject:  Re: [patch] lno branch merge -- vectorizer patch #4                         
                      gcc.gnu.org                                                                                                     
                                                                                                                                      
                                                                                                                                      
                      16/09/2004 06:21                                                                                                
                                                                                                                                      




On Wed, Sep 15, 2004 at 12:06:14PM +0300, Dorit Naishlos wrote:
> + DEF_PRIMITIVE_TYPE (BT_CHAR_VECTOR, (*build_vector_type)
(char_type_node, 16))

Hardcoding 16?  I think not.  I don't think you want any changes
to this file at all, in fact.

> +   /* Argument 0 is an address.  */
> +   if (!validate_arglist (arglist, POINTER_TYPE, 0))
> +     return const0_rtx;

Returning const0_rtx is success.  NULL is failure.

> + #ifdef HAVE_build_vector_mask_for_load
> + #ifdef HAVE_build_cc_mask_for_load

Anyway, I specifically said that the builtin should be target
specific, so that it would have the proper target specific
types associated with it.  Which immediately takes care of the
BT_CHAR_VECTOR problem mentioned above.  All this does mean
that you can't define it at all at the generic level.

I was thinking that the DECL for the builtin might be placed in

> --- 1548,1557 ----
>
>     /* We can set the alignment from the type if we are making an object,
>        this is an INDIRECT_REF, or if TYPE_ALIGN_OK.  */
> !   if (objectp || TREE_CODE (t) == INDIRECT_REF
> !       || TREE_CODE (t) == MISALIGNED_INDIRECT_REF
> !       || TREE_CODE (t) == ALIGN_INDIRECT_REF
> !       || TYPE_ALIGN_OK (type))
>       align = MAX (align, TYPE_ALIGN (type));

Except that you can't for MISALIGNED_INDIRECT_REF, since that's
explicitly *not* aligned.  The alignment of the memory should be
taken from that stored in MISALIGNED_INDIRECT_REF.

For ALIGN_INDIRECT_REF, we need to force EXPR and OFFSET to NULL,
since we don't know exactly what we're overlapping.  This sort
of thing is currently cleared explicitly in the Alpha backend
when we generate AND memory addresses.

> *************** expand_expr_real_1 (tree exp, rtx target
> *** 6755,6760 ****
> --- 6762,6771 ----
>
>            op0 = expand_expr (exp1, NULL_RTX, VOIDmode, EXPAND_SUM);
>            op0 = memory_address (mode, op0);
> +
> +          if (code == ALIGN_INDIRECT_REF)
> +            op0 = expand_addr_floor_op (mode, op0);

expand_addr_floor_op is broken.  It's written to assume loads only.
I don't think the optabs model fits with this code at all.  Given
how expand_expr works, and how targets will use this, I think that
we have two options:

 (1) A target hook

             op0 = targetm.vectorize.expand_addr_floor (op0)

     which manipulates the address of the mem.

 (2) Force the target to use AND-style addressing, like Alpha.  So

             op0 = expand_expr (exp1, NULL_RTX, VOIDmode, EXPAND_SUM);
             op0 = memory_address (mode, op0);

             if (code == ALIGN_INDIRECT_REF)
               {
                 int align = TYPE_ALIGN (type);
                 op0 = gen_rtx_AND (Pmode, op0, GEN_INT (-align));
                 op0 = memory_address (mode, op0);
               }

Actually, I think that (2) makes the most sense.  Otherwise you
have to come up with some other representation that will mean
exactly the same thing.

> +          if (code == MISALIGNED_INDIRECT_REF)
> +            {
> +              op1 = expand_expr (TREE_OPERAND (exp, 1), NULL_RTX,
VOIDmode, 0);
> +              temp = expand_addr_misaligned_op (mode, temp, op1);
> +            }

Similarly expand_addr_misaligned_op.  Except that here we don't
have a good way to fix things up in a generic way.  If we were
to use optabs, we'd only be able to do something useful by
doing a complete replacement for a move pattern.

What I think makes most sense is to add a target hook,

             targetm.vectorize.misaligned_mem_ok (mode)

which says that, STRICT_ALIGNMENT or not, the normal "mov<mode>"
pattern is prepared to look at MEM_ALIGN of any operand that is
a memory, and fix things up if it doesn't work.

So, for instance, i386 MMX move patterns don't have to change at
all, since the hardware handles it.  But the SSE move patters get
adjusted to emit the couple of instructions needed.

The default version of this hook could be just !STRICT_ALIGNMENT.

> +     case REALIGN_LOAD_EXPR:
> +       {
> +         tree oprnd0 = TREE_OPERAND (exp, 0);
> +         tree oprnd1 = TREE_OPERAND (exp, 1);
> +         tree oprnd2 = TREE_OPERAND (exp, 2);
> +         rtx op2;
> +
> +         this_optab = optab_for_tree_code (code, type);
> +         if (this_optab->handlers[(int) mode].insn_code ==
CODE_FOR_nothing)
> +           abort ();

You're asserting this in optabs.c as well.  Which is the more
correct place.

> !   if (TREE_CODE (t) == INDIRECT_REF
> !       || TREE_CODE (t) == MISALIGNED_INDIRECT_REF
> !       /* Note: doesn't apply to ALIGN_INDIRECT_REF */)

Note that the comment goes before the if, not in the middle.

> + rtx
> + expand_realign_op (enum machine_mode mode, optab realign_optab,
> +                    rtx op0, rtx op1, rtx op2, rtx target, int
unsignedp)

Is there anything realign specific here?  Perhaps better named
as expand_ternary_op?

> +  ... or -1 if the alignment is unknown.  */
> + DEFTREECODE (MISALIGNED_INDIRECT_REF, "misaligned_indirect_ref", 'r',
2)

Zero is a better unknown.

> + /* Extract elements from two input vectors Operand 0 and Operand 1 of
> size VS,
> +    according to an offset OFF as follows: the last VS - OFF elements of

> +    vector Operand0 are concatenated to the first OFF elements of vector
> +    Operand1.  If OFF==0, then the returned vector is Opernad1.
> +    OP2 may take different forms; It can be an address, in which case
> and its
> +    low log2(VS)-1 bits define the offset, or it can be mask of some
> form.  */

Hum, this documentation could really use improvement.  Even if limited
to "the mask is target dependent, generated by xyz".


r~




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]