This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Vector Comparison patch


On Thu, Aug 25, 2011 at 3:15 PM, Artem Shinkarov
<artyom.shinkaroff@gmail.com> wrote:
> On Thu, Aug 25, 2011 at 2:00 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Thu, Aug 25, 2011 at 2:45 PM, Artem Shinkarov
>> <artyom.shinkaroff@gmail.com> wrote:
>>> On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>>> On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov
>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>> On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov
>>>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>>>> Here is a cleaned-up patch without the hook. Mostly it works in a way
>>>>>>> we discussed.
>>>>>>>
>>>>>>> So I think it is a right time to do something about vcond patterns,
>>>>>>> which would allow me to get rid of conversions that I need to put all
>>>>>>> over the code.
>>>>>>>
>>>>>>> Also at the moment the patch breaks lto frontend with a simple example:
>>>>>>> #define vector(elcount, type) ?\
>>>>>>> __attribute__((vector_size((elcount)*sizeof(type)))) type
>>>>>>>
>>>>>>> int main (int argc, char *argv[]) {
>>>>>>> ? ?vector (4, float) f0;
>>>>>>> ? ?vector (4, float) f1;
>>>>>>>
>>>>>>> ? ?f0 = ?f1 != f0
>>>>>>> ? ? ? ? ?? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};
>>>>>>>
>>>>>>> ? ?return (int)f0[argc];
>>>>>>> }
>>>>>>>
>>>>>>> test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244
>>>>>>>
>>>>>>> I looked into the file, the conversion function is defined as
>>>>>>> gcc_unreachable (). I am not very familiar with lto, so I don't really
>>>>>>> know what is the right way to treat the conversions.
>>>>>>>
>>>>>>> And I seriously need help with backend patterns.
>>>>>>
>>>>>> On the patch.
>>>>>>
>>>>>> The documentation needs review by a native english speaker, but here
>>>>>> are some factual comments:
>>>>>>
>>>>>> +In C vector comparison is supported within standard comparison operators:
>>>>>>
>>>>>> it should read 'In GNU C' here and everywhere else as this is a GNU
>>>>>> extension.
>>>>>>
>>>>>> ?The result of the
>>>>>> +comparison is a signed integer-type vector where the size of each
>>>>>> +element must be the same as the size of compared vectors element.
>>>>>>
>>>>>> The result type of the comparison is determined by the C frontend,
>>>>>> it isn't under control of the user. ?What you are implying here is
>>>>>> restrictions on vector assignments, which are documented elsewhere.
>>>>>> I'd just say
>>>>>>
>>>>>> 'The result of the comparison is a vector of the same width and number
>>>>>> of elements as the comparison operands with a signed integral element
>>>>>> type.'
>>>>>>
>>>>>> +In addition to the vector comparison C supports conditional expressions
>>>>>>
>>>>>> See above.
>>>>>>
>>>>>> +For the convenience condition in the vector conditional can be just a
>>>>>> +vector of signed integer type.
>>>>>>
>>>>>> 'of integer type.' ?I don't see a reason to disallow unsigned integers,
>>>>>> they can be equally well compared against zero.
>>>>>
>>>>> I'll have a final go on the documentation, it is untouched from the old patches.
>>>>>
>>>>>> Index: gcc/targhooks.h
>>>>>> ===================================================================
>>>>>> --- gcc/targhooks.h ? ? (revision 177665)
>>>>>> +++ gcc/targhooks.h ? ? (working copy)
>>>>>> @@ -86,6 +86,7 @@ extern int default_builtin_vectorization
>>>>>> ?extern tree default_builtin_reciprocal (unsigned int, bool, bool);
>>>>>>
>>>>>> ?extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
>>>>>> +
>>>>>> ?extern bool
>>>>>> ?default_builtin_support_vector_misalignment (enum machine_mode mode,
>>>>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? const_tree,
>>>>>>
>>>>>> spurious whitespace change.
>>>>>
>>>>> Yes, thanks.
>>>>>
>>>>>> Index: gcc/optabs.c
>>>>>> ===================================================================
>>>>>> --- gcc/optabs.c ? ? ? ?(revision 177665)
>>>>>> +++ gcc/optabs.c ? ? ? ?(working copy)
>>>>>> @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type
>>>>>> ...
>>>>>> + ?else
>>>>>> + ? ?{
>>>>>> + ? ? ?rtx rtx_op0;
>>>>>> + ? ? ?rtx vec;
>>>>>> +
>>>>>> + ? ? ?rtx_op0 = expand_normal (op0);
>>>>>> + ? ? ?comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX);
>>>>>> + ? ? ?vec = CONST0_RTX (mode);
>>>>>> +
>>>>>> + ? ? ?create_output_operand (&ops[0], target, mode);
>>>>>> + ? ? ?create_input_operand (&ops[1], rtx_op1, mode);
>>>>>> + ? ? ?create_input_operand (&ops[2], rtx_op2, mode);
>>>>>> + ? ? ?create_input_operand (&ops[3], comparison, mode);
>>>>>> + ? ? ?create_input_operand (&ops[4], rtx_op0, mode);
>>>>>> + ? ? ?create_input_operand (&ops[5], vec, mode);
>>>>>>
>>>>>> this still builds the fake(?) != comparison, but as you said you need help
>>>>>> with the .md part if we want to use a machine specific pattern for this
>>>>>> case (which we eventually want, for the sake of using XOP vcond).
>>>>>
>>>>> Yes, I am waiting for it. This is the only way at the moment to make
>>>>> sure that in
>>>>> m = a > b;
>>>>> r = m ? c : d;
>>>>>
>>>>> m in the vcond is not transformed into the m != 0.
>>>>>
>>>>>> Index: gcc/target.h
>>>>>> ===================================================================
>>>>>> --- gcc/target.h ? ? ? ?(revision 177665)
>>>>>> +++ gcc/target.h ? ? ? ?(working copy)
>>>>>> @@ -51,6 +51,7 @@
>>>>>> ?#define GCC_TARGET_H
>>>>>>
>>>>>> ?#include "insn-modes.h"
>>>>>> +#include "gimple.h"
>>>>>>
>>>>>> ?#ifdef ENABLE_CHECKING
>>>>>>
>>>>>> spurious change.
>>>>>
>>>>> Old stuff, fixed.
>>>>>
>>>>>> @@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr
>>>>>> ? ? ?floating-point, we can only do some of these simplifications.) ?*/
>>>>>> ? if (operand_equal_p (arg0, arg1, 0))
>>>>>> ? ? {
>>>>>> + ? ? ?tree arg0_type = TREE_TYPE (arg0);
>>>>>> +
>>>>>> ? ? ? switch (code)
>>>>>> ? ? ? ?{
>>>>>> ? ? ? ?case EQ_EXPR:
>>>>>> - ? ? ? ? if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
>>>>>> - ? ? ? ? ? ? || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))))
>>>>>> + ? ? ? ? if (! FLOAT_TYPE_P (arg0_type)
>>>>>> + ? ? ? ? ? ? || ! HONOR_NANS (TYPE_MODE (arg0_type)))
>>>>>> ...
>>>>>
>>>>> Ok.
>>>>>
>>>>>>
>>>>>> Likewise.
>>>>>>
>>>>>> @@ -8440,6 +8440,37 @@ expand_expr_real_2 (sepops ops, rtx targ
>>>>>> ? ? case UNGE_EXPR:
>>>>>> ? ? case UNEQ_EXPR:
>>>>>> ? ? case LTGT_EXPR:
>>>>>> + ? ? ?if (TREE_CODE (ops->type) == VECTOR_TYPE)
>>>>>> + ? ? ? {
>>>>>> + ? ? ? ? enum tree_code code = ops->code;
>>>>>> + ? ? ? ? tree arg0 = ops->op0;
>>>>>> + ? ? ? ? tree arg1 = ops->op1;
>>>>>>
>>>>>> move this code to do_store_flag (we really store a flag value). ?It should
>>>>>> also simply do what expand_vec_cond_expr does, probably simply
>>>>>> calling that with the {-1,...} {0,...} extra args should work.
>>>>>
>>>>> I started to do that, but the code in do_store_flag is completely
>>>>> different from what I am doing, and it looks confusing. I just call
>>>>> expand_vec_cond_expr and that is it. I can write a separate function,
>>>>> but the code is quite small.
>>>>
>>>> Hm? ?I see in your patch
>>>>
>>>> Index: gcc/expr.c
>>>> ===================================================================
>>>> --- gcc/expr.c ?(revision 177665)
>>>> +++ gcc/expr.c ?(working copy)
>>>> @@ -8440,6 +8440,37 @@ expand_expr_real_2 (sepops ops, rtx targ
>>>> ? ? case UNGE_EXPR:
>>>> ? ? case UNEQ_EXPR:
>>>> ? ? case LTGT_EXPR:
>>>> + ? ? ?if (TREE_CODE (ops->type) == VECTOR_TYPE)
>>>> + ? ? ? {
>>>> + ? ? ? ? enum tree_code code = ops->code;
>>>> + ? ? ? ? tree arg0 = ops->op0;
>>>> + ? ? ? ? tree arg1 = ops->op1;
>>>> + ? ? ? ? tree arg_type = TREE_TYPE (arg0);
>>>> + ? ? ? ? tree el_type = TREE_TYPE (arg_type);
>>>> + ? ? ? ? tree t, ifexp, if_true, if_false;
>>>> +
>>>> + ? ? ? ? el_type = lang_hooks.types.type_for_size (TYPE_PRECISION
>>>> (el_type), 0);
>>>> +
>>>> +
>>>> + ? ? ? ? ifexp = build2 (code, type, arg0, arg1);
>>>> + ? ? ? ? if_true = build_vector_from_val (type, build_int_cst (el_type, -1));
>>>> + ? ? ? ? if_false = build_vector_from_val (type, build_int_cst (el_type, 0));
>>>> +
>>>> + ? ? ? ? if (arg_type != type)
>>>> + ? ? ? ? ? {
>>>> + ? ? ? ? ? ? if_true = convert (arg_type, if_true);
>>>> + ? ? ? ? ? ? if_false = convert (arg_type, if_false);
>>>> + ? ? ? ? ? ? t = build3 (VEC_COND_EXPR, arg_type, ifexp, if_true, if_false);
>>>> + ? ? ? ? ? ? t = convert (type, t);
>>>> + ? ? ? ? ? }
>>>> + ? ? ? ? else
>>>> + ? ? ? ? ? t = build3 (VEC_COND_EXPR, type, ifexp, if_true, if_false);
>>>> +
>>>> + ? ? ? ? return expand_expr (t,
>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? modifier != EXPAND_STACK_PARM ? target :
>>>> NULL_RTX,
>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? tmode != VOIDmode ? tmode : mode,
>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? modifier);
>>>> + ? ? ? }
>>>>
>>>> that's not exactly "calling expand_vec_cond_expr".
>>>
>>> Well, actually it is. Keep in mind that clean backend would imply
>>> removing the conversions. But I'll make a function.
>>
>> Why does
>>
>> ?return expand_vec_cond_expr (build2 (ops->code, type, ops->op0, ops->op1),
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? build_vector_from_val
>> (type, build_int_cst (el_type, -1)),
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? build_vector_from_val
>> (type, build_int_cst (el_type, 0)));
>>
>> not work? ?If you push the conversions to expand_vec_cond_expr
>> by doing them on RTL you simplify things here and remove the requirement
>> from doing them in the C frontend for VEC_COND_EXPR as well.
>
> It does not work because vcond <a > b, c, d> requires a,b,c,d to have
> the same type. Now here we are dealing only with comparisons, so in
> case of floats we have vcond < f0 > f1, {-1,...}, {0,...}> which we
> have to transform into
> (vsi)(vcond< f0 >f1, (vsf){-1,...}, (vsf){0,...}>).
>
> Ok, so is it ok to do make this conversion here for the real types?
>
>>>>>>
>>>>>> As for the still required conversions, you should be able to delay those
>>>>>> from the C frontend (and here) to expand_vec_cond_expr by, after
>>>>>> expanding op1 and op2, wrapping a subreg around it with a proper mode
>>>>>> (using convert_mode (GET_MODE (comparison), rtx_op1)) should work),
>>>>>> and then convert the result back to the original mode.
>>>>>>
>>>>>> I'll leave the C frontend pieces of the patch for review by Joseph, but
>>>>>
>>>>> Conversions are there until we fix the backend. When backend will be
>>>>> able to digest f0 > f1 ? int0 : int1, all the conversions will go
>>>>> away.
>>>>>
>>>>>> +static tree
>>>>>> +fold_build_vec_cond_expr (tree ifexp, tree op1, tree op2)
>>>>>>
>>>>>> is missing a function comment.
>>>>>
>>>>> fixed.
>>>>>
>>>>>> +static tree
>>>>>> +do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
>>>>>> + ? ? ? ? tree bitpos, tree bitsize, enum tree_code code)
>>>>>> +{
>>>>>> + ?tree cond;
>>>>>> + ?tree comp_type;
>>>>>> +
>>>>>> + ?a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
>>>>>> + ?b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
>>>>>> +
>>>>>> + ?comp_type = lang_hooks.types.type_for_size (TYPE_PRECISION (inner_type), 0);
>>>>>> +
>>>>>>
>>>>>> Use
>>>>>>
>>>>>> ?comp_type = build_nonstandard_integer_type (TYPE_PRECISION (inner_type), 0);
>>>>>>
>>>>>> instead. ?But I think you don't want to use TYPE_PRECISION on
>>>>>> FP types. ?Instead you want a signed integer type of the same (mode)
>>>>>> size as the vector element type, thus
>>>>>>
>>>>>> ?comp_type = build_nonstandard_integer_type (GET_MODE_BITSIZE
>>>>>> (TYPE_MODE (inner_type)), 0);
>>>>>
>>>>> Hm, I thought that at this stage we don't wan to know anything about
>>>>> modes. I mean here I am really building the same integer type as the
>>>>> operands of the comparison have. But I can use MODE_BITSIZE as well, I
>>>>> don't think that it could happen that the size of the mode is
>>>>> different from the size of the type. Or could it?
>>>>
>>>> The comparison could be on floating-point types where TYPE_PRECISION
>>>> can be, for example, 80 for x87 doubles. ?You want an integer type
>>>> of the same width, so yes, GET_MODE_BITSIZE is the correct thing
>>>> to use here.
>>>
>>> Ok.
>>>
>>>>>> + ?cond = gimplify_build2 (gsi, code, comp_type, a, b);
>>>>>>
>>>>>> the result type of a comparison is boolean_type_node, not comp_type.
>>>>>>
>>>>>> + ?cond = gimplify_build2 (gsi, code, comp_type, a, b);
>>>>>> + ?return gimplify_build3 (gsi, COND_EXPR, comp_type, cond,
>>>>>> + ? ? ? ? ? ? ? ? ? ?build_int_cst (comp_type, -1),
>>>>>> + ? ? ? ? ? ? ? ? ? ?build_int_cst (comp_type, 0));
>>>>>>
>>>>>> writing this as
>>>>>>
>>>>>> + ?return gimplify_build3 (gsi, COND_EXPR, comp_type,
>>>>>> ? ? ? ? ? ? ? ? ? ? fold_build2 (code, boolean_type_node, a, b),
>>>>>> + ? ? ? ? ? ? ? ? ? ?build_int_cst (comp_type, -1),
>>>>>> + ? ? ? ? ? ? ? ? ? ?build_int_cst (comp_type, 0));
>>>>>>
>>>>>> will get the gimplifier a better chance at simplifcation.
>>>>>>
>>>>>> + ?if (! expand_vec_cond_expr_p (type, TYPE_MODE (type)))
>>>>>>
>>>>>> I think we are expecting the scalar type and the vector mode here
>>>>>> from looking at the single existing caller. ?It probably doesn't make
>>>>>> a difference (we only check TYPE_UNSIGNED of it, which should
>>>>>> also work for vector types), but let's be consistent. ?Thus,
>>>>>
>>>>> Ok.
>>>>>
>>>>>> ? ?if (! expand_vec_cond_expr_p (TREE_TYPE (type), TYPE_MODE (type)))
>>>>>>
>>>>>> + ?if (! expand_vec_cond_expr_p (type, TYPE_MODE (type)))
>>>>>> + ? ?t = expand_vector_piecewise (gsi, do_compare, type,
>>>>>> + ? ? ? ? ? ? ? ? ? ?TREE_TYPE (TREE_TYPE (op0)), op0, op1, code);
>>>>>> + ?else
>>>>>> + ? ?t = gimplify_build2 ?(gsi, code, type, op0, op1);
>>>>>>
>>>>>> the else case looks odd. ?Why re-build a stmt that already exists?
>>>>>> Simply return NULL_TREE instead?
>>>>>
>>>>> I can adjust. The reason it is written that way is that
>>>>> expand_vector_operations_1 is using the result of the function to
>>>>> update rhs.
>>>>
>>>> Ok, so it should check whether there was any lowering done then.
>>>>
>>>>>> +static tree
>>>>>> +expand_vec_cond_expr_piecewise (gimple_stmt_iterator *gsi, tree exp)
>>>>>> +{
>>>>>> ...
>>>>>> + ? ? ?/* Expand vector condition inside of VEC_COND_EXPR. ?*/
>>>>>> + ? ? ?if (! expand_vec_cond_expr_p (TREE_TYPE (cond),
>>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? TYPE_MODE (TREE_TYPE (cond))))
>>>>>> + ? ? ? {
>>>>>> ...
>>>>>> + ? ? ? ? new_rhs = expand_vector_piecewise (gsi, do_compare,
>>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?TREE_TYPE (cond),
>>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?TREE_TYPE (TREE_TYPE (op1)),
>>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?op0, op1, TREE_CODE (cond));
>>>>>>
>>>>>> I'm not sure it is beneficial to expand a < b ? v0 : v1 to
>>>>>>
>>>>>> tem = { a[0] < b[0] ? -1 : 0, ... }
>>>>>> v0 & tem | v1 & ~tem;
>>>>>>
>>>>>> instead of
>>>>>>
>>>>>> { a[0] < b[0] ? v0[0] : v1[0], ... }
>>>>>>
>>>>>> even if the bitwise operations could be carried out using vectors.
>>>>>> It's definitely beneficial to do the first if the CPU can create the
>>>>>> bitmask.
>>>>>>
>>>>>
>>>>> o_O
>>>>>
>>>>> I thought you always wanted to do (m & v0) | (~m & v1).
>>>>> Do you want to have two cases of the expansion then -- when we have
>>>>> mask available and when we don't? But it is really unlikely that we
>>>>> can get the mask, but cannot get vcond. Because condition is actually
>>>>> vcond. So once again -- do we always expand to {a[0] > b[0] ?? v[0] :
>>>>> c[0], ...}?
>>>>
>>>> Hm, yeah. ?I suppose with the current setup it's hard to only
>>>> get the mask but not the full vcond ;) ?So it probably makes
>>>> sense to always expand to {a[0] > b[0] ?? v[0] :c[0],...} as
>>>> fallback. ?Sorry for the confusion ;)
>>>
>>> Ok.
>>>
>>>>>> + ?/* Run vecower on the expresisons we have introduced. ?*/
>>>>>> + ?for (; gsi_tmp.ptr != gsi->ptr; gsi_next (&gsi_tmp))
>>>>>> + ? ?expand_vector_operations_1 (&gsi_tmp);
>>>>>>
>>>>>> do not use gsi.ptr directly, use gsi_stmt (gsi_tm) != gsi_stmt (gsi)
>>>>>>
>>>>>> +static bool
>>>>>> +is_vector_comparison (gimple_stmt_iterator *gsi, tree expr)
>>>>>> +{
>>>>>>
>>>>>> This function is lacking a comment.
>>>>>>
>>>>>> @@ -450,11 +637,41 @@ expand_vector_operations_1 (gimple_stmt_
>>>>>> ...
>>>>>> + ? ? ?/* Try to get rid from the useless vector comparison
>>>>>> + ? ? ? ?x != {0,0,...} which is inserted by the typechecker. ?*/
>>>>>> + ? ? ?if (COMPARISON_CLASS_P (cond) && TREE_CODE (cond) == NE_EXPR)
>>>>>>
>>>>>> how and why? ?You simply drop that comparison - that doesn't look
>>>>>> correct. ?And in fact TREE_OPERAND (cond, 0) will never be a
>>>>>> comparison - that wouldn't be valid gimple. ?Please leave this
>>>>>> optimization to SSA based forward propagation (I can help you here
>>>>>> once the patch is in).
>>>>>
>>>>> No-no-no. This is the second part of avoiding
>>>>> m = a > b;
>>>>> r = m ? v0 : v1;
>>>>>
>>>>> to prevent m expansion to m != {0}.
>>>>>
>>>>> I do not _simply_ drop the comparison. I drop it only if
>>>>> is_vector_comparison returned true. It means that we can never get
>>>>> into the situation that we are dropping actually a comparison inserted
>>>>> by the user. But what I really want to achieve here is to drop the
>>>>> comparison that the frontend inserts every time when it sees an
>>>>> expression there.
>>>>>
>>>>> As I said earlier, tree forward propagation kicks only using -On, and
>>>>> I would really like to make sure that I can get rid of useless != {0}
>>>>> at any level.
>>>
>>>> Please don't. ?If the language extension forces a != 0 then it should
>>>> appear at -O0. ?The code is fishy anyway in the way it walks stmts
>>>> in is_vector_comparison. ?At least I don't like to see this optimization
>>>> done here for the sake of -O0 in this initial patch - you could try
>>>> arguing about it as a followup improvement (well, probably with not
>>>> much luck). ?-O0 is about compile-speed and debugging, doing
>>>> data-flow by walking stmts backward is slow.
>>>
>>> Ok, then I seriously don't see any motivation to support the
>>> VEC_COND_EXPR. The following code:
>>>
>>> m = a > b;
>>> r = (m & v0) | (~m & v1)
>>>
>>> gives me much more flexibility and ?control. What the VEC_COND_EXPR is
>>> good for? Syntactical sugar?
>>>
>>> How about throwing away all the VEC_COND_EXPR parts supporting only
>>> conditions (implicitly expressed using vconds)? If we would agree on
>>> implicit conversions for real types, then this is a functionality that
>>> perfectly satisfies my needs.
>>>
>>> I don't see any interest from the backend people and I cannot wait
>>> forever, so why don't we start with a simple thing?
>>
>> But the simple thing is already what the backend supports.
>>
>> Richard.
>>
>
> Well, it is not "what" it is "how" -- that is what we are discussing
> for three weeks already.
>
> Ok, so the question now is, whether it is fine to have conversions
> inside expand_expr_real_2? If we agree that it is ok to do, then I can
> adjust the patch.

Yes, it is ok to have them there, but preferably on RTL and preferably
in expand_vec_cond_expr by using simplify_gen_subreg, untested patch:

Index: optabs.c
===================================================================
--- optabs.c    (revision 178060)
+++ optabs.c    (working copy)
@@ -6664,16 +6664,20 @@ expand_vec_cond_expr (tree vec_cond_type

   comparison = vector_compare_rtx (op0, unsignedp, icode);
   rtx_op1 = expand_normal (op1);
+  rtx_op1 = simplify_gen_subreg (GET_MODE (comparison), rtx_op1,
+                                GET_MODE (rtx_op1), 0);
   rtx_op2 = expand_normal (op2);
+  rtx_op2 = simplify_gen_subreg (GET_MODE (comparison), rtx_op2,
+                                GET_MODE (rtx_op2), 0);

-  create_output_operand (&ops[0], target, mode);
-  create_input_operand (&ops[1], rtx_op1, mode);
-  create_input_operand (&ops[2], rtx_op2, mode);
+  create_output_operand (&ops[0], target, GET_MODE (comparison));
+  create_input_operand (&ops[1], rtx_op1, GET_MODE (comparison));
+  create_input_operand (&ops[2], rtx_op2, GET_MODE (comparison));
   create_fixed_operand (&ops[3], comparison);
   create_fixed_operand (&ops[4], XEXP (comparison, 0));
   create_fixed_operand (&ops[5], XEXP (comparison, 1));
   expand_insn (icode, 6, ops);
-  return ops[0].value;
+  return simplify_gen_subreg (mode, ops[0].value, GET_MODE (comparison), 0);
 }




>
> Artem.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]