This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst


On Jun 1, 2018, at 10:11 AM, Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
> 
> On Fri, 2018-06-01 at 08:53 +0200, Richard Biener wrote:
>> On Thu, May 31, 2018 at 9:59 PM Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
>>> 
>>> Hi,
>>>  Add support for gimple folding for unaligned vector loads and stores.
>>> testcases posted separately in this thread.
>>> 
>>> Regtest completed across variety of systems, P6,P7,P8,P9.
>>> 
>>> OK for trunk?
>>> Thanks,
>>> -Will
>>> 
>>> [gcc]
>>> 
>>> 2018-05-31 Will Schmidt <will_schmidt@vnet.ibm.com>
>>> 
>>>        * config/rs6000/rs6000.c: (rs6000_builtin_valid_without_lhs) Add vec_xst
>>>        variants to the list.  (rs6000_gimple_fold_builtin) Add support for
>>>        folding unaligned vector loads and stores.
>>> 
>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>> index d62abdf..54b7de2 100644
>>> --- a/gcc/config/rs6000/rs6000.c
>>> +++ b/gcc/config/rs6000/rs6000.c
>>> @@ -15360,10 +15360,16 @@ rs6000_builtin_valid_without_lhs (enum rs6000_builtins fn_code)
>>>     case ALTIVEC_BUILTIN_STVX_V8HI:
>>>     case ALTIVEC_BUILTIN_STVX_V4SI:
>>>     case ALTIVEC_BUILTIN_STVX_V4SF:
>>>     case ALTIVEC_BUILTIN_STVX_V2DI:
>>>     case ALTIVEC_BUILTIN_STVX_V2DF:
>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>>       return true;
>>>     default:
>>>       return false;
>>>     }
>>> }
>>> @@ -15869,10 +15875,77 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>>        gimple_set_location (g, loc);
>>>        gsi_replace (gsi, g, true);
>>>        return true;
>>>       }
>>> 
>>> +    /* unaligned Vector loads.  */
>>> +    case VSX_BUILTIN_LXVW4X_V16QI:
>>> +    case VSX_BUILTIN_LXVW4X_V8HI:
>>> +    case VSX_BUILTIN_LXVW4X_V4SF:
>>> +    case VSX_BUILTIN_LXVW4X_V4SI:
>>> +    case VSX_BUILTIN_LXVD2X_V2DF:
>>> +    case VSX_BUILTIN_LXVD2X_V2DI:
>>> +      {
>>> +        arg0 = gimple_call_arg (stmt, 0);  // offset
>>> +        arg1 = gimple_call_arg (stmt, 1);  // address
>>> +        lhs = gimple_call_lhs (stmt);
>>> +        location_t loc = gimple_location (stmt);
>>> +        /* Since arg1 may be cast to a different type, just use ptr_type_node
>>> +           here instead of trying to enforce TBAA on pointer types.  */
>>> +        tree arg1_type = ptr_type_node;
>>> +        tree lhs_type = TREE_TYPE (lhs);
>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
>>> +           the tree using the value from arg0.  The resulting type will match
>>> +           the type of arg1.  */
>>> +        gimple_seq stmts = NULL;
>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg0);
>>> +        tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
>>> +                                      arg1_type, arg1, temp_offset);
>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>> +        /* Use the build2 helper to set up the mem_ref.  The MEM_REF could also
>>> +           take an offset, but since we've already incorporated the offset
>>> +           above, here we just pass in a zero.  */
>>> +        gimple *g;
>>> +        g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type, temp_addr,
>>> +                                               build_int_cst (arg1_type, 0)));
>> 
>> So in GIMPLE the type of the MEM_REF specifies the alignment so my question
>> is what type does the lhs usually have here?  I'd simply guess V4SF, etc.?  In
> 
> yes.  (double-checking).  my reference for the intrinsic signatures
> shows the lhs is a vector of type.  The rhs can be either *type or
> *vector of type. 
> 
> vector double vec_vsx_ld (int, const vector double *);
> vector double vec_vsx_ld (int, const double *);
> With similar/same for the assorted other types.
> 
> These are also on my list as 'unaligned' vector loads.  I'm not certain
> if that adds a twist to how I should answer the below.. 
> 
> Bill?

'unaligned' means not necessarily aligned on a vector boundary.
They are guaranteed to be aligned on an element boundary.
> 
>> this case you are missing a
>>  tree ltype = build_aligned_type (lhs_type, desired-alignment);
>> 
>> and use that ltype for building the MEM_REF.  I suppose in this case the known
>> alignment is either BITS_PER_UNIT or element alignment (thus
>> TYPE_ALIGN (TREE_TYPE (lhs_type)))?
> 
> I'd think element alignment.  but no longer certain.  :-)

Yep, element alignment.

Thanks,
Bill
> 
>> Or is the type of the load the element types?
> 
> 
> So, In any case..  I'll build up / modify some tests to look at data
> being loaded, and see if I can see alignment issues here.
> 
> Thanks,
> -Will 
> 
> 
> 
>> Richard.
>> 
>>> +        gimple_set_location (g, loc);
>>> +        gsi_replace (gsi, g, true);
>>> +        return true;
>>> +      }
>>> +
>>> +    /* unaligned Vector stores.  */
>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>> +      {
>>> +        arg0 = gimple_call_arg (stmt, 0); /* Value to be stored.  */
>>> +        arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
>>> +        tree arg2 = gimple_call_arg (stmt, 2); /* Store-to address.  */
>>> +        location_t loc = gimple_location (stmt);
>>> +        tree arg0_type = TREE_TYPE (arg0);
>>> +        /* Use ptr_type_node (no TBAA) for the arg2_type.  */
>>> +        tree arg2_type = ptr_type_node;
>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
>>> +           the tree using the value from arg0.  The resulting type will match
>>> +           the type of arg2.  */
>>> +        gimple_seq stmts = NULL;
>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg1);
>>> +        tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
>>> +                                      arg2_type, arg2, temp_offset);
>>> +        /* Mask off any lower bits from the address.  */
>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>> +        gimple *g;
>>> +        g = gimple_build_assign (build2 (MEM_REF, arg0_type, temp_addr,
>>> +                                          build_int_cst (arg2_type, 0)), arg0);
>>> +        gimple_set_location (g, loc);
>>> +        gsi_replace (gsi, g, true);
>>> +        return true;
>>> +      }
>>> +
>>>     /* Vector Fused multiply-add (fma).  */
>>>     case ALTIVEC_BUILTIN_VMADDFP:
>>>     case VSX_BUILTIN_XVMADDDP:
>>>     case ALTIVEC_BUILTIN_VMLADDUHM:
>>>       {


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]