[patch] (4.1 stage 2 projects): vectorize reduction, part 3/n
Richard Henderson
rth@redhat.com
Sun Jun 19 17:25:00 GMT 2005
On Sun, Jun 19, 2005 at 07:41:06PM +0300, Dorit Naishlos wrote:
> + /* Generate insns for VEC_LSHIFT_EXPR, VEC_RSHIFT_EXPR.
> + FORNOW: Support only shift expressions in which the shift amount
> + is an immediate value. */
> +
> + rtx
> + expand_vec_shift_expr (tree vec_shift_expr, rtx target)
You've actually done all you need to support non-immediate values.
Remove the FORNOW.
> + /* CHECKME */
> + rtx_op1 = expand_expr (vec_oprnd, NULL_RTX, VOIDmode, 1);
> + rtx_op2 = expand_expr (shift_oprnd, NULL_RTX, VOIDmode, 1);
The final argument should be EXPAND_NORMAL, not a literal 1.
> ! reduc_splus_optab = init_optab (UNKNOWN);
> ! reduc_uplus_optab = init_optab (UNKNOWN);
Refresh my memory: why do we need separate signed and unsigned plus optabs?
> ! /*** Case 3:
> ! Create: s = init;
> ! for (offset=0; offset<vector_size; offset+=element_size;)
> ! {
> ! Create: s' = extract_field <v_out2, offset>
> ! Create: s = op <s, s'>
> ! } */
> !
> ! if (vect_print_dump_info (REPORT_DETAILS, UNKNOWN_LOC))
> ! fprintf (vect_dump, "Reduce using scalar code. ");
> !
> ! new_temp = scalar_initial_def;
> ! vec_temp = PHI_RESULT (new_phi);
> ! vec_size_in_bits = tree_low_cst (TYPE_SIZE (vectype), 1);
> !
> ! for (bit_offset = 0;
> ! bit_offset < vec_size_in_bits;
> ! bit_offset += element_bitsize)
> ! {
> ! tree bitpos = bitsize_int (bit_offset);
> !
> ! epilog_stmt = build2 (MODIFY_EXPR, scalar_type, new_scalar_dest,
> ! build3 (BIT_FIELD_REF, scalar_type,
> ! vec_temp, bitsize, bitpos));
> ! new_name = make_ssa_name (new_scalar_dest, epilog_stmt);
> ! TREE_OPERAND (epilog_stmt, 0) = new_name;
> ! bsi_insert_after (&exit_bsi, epilog_stmt, BSI_NEW_STMT);
> ! if (vect_print_dump_info (REPORT_DETAILS, UNKNOWN_LOC))
> ! print_generic_expr (vect_dump, epilog_stmt, TDF_SLIM);
> !
> !
> ! epilog_stmt = build2 (MODIFY_EXPR, scalar_type, new_scalar_dest,
> ! build2 (code, scalar_type, new_name, new_temp));
> ! new_temp = make_ssa_name (new_scalar_dest, epilog_stmt);
> ! TREE_OPERAND (epilog_stmt, 0) = new_temp;
> ! bsi_insert_after (&exit_bsi, epilog_stmt, BSI_NEW_STMT);
> ! if (vect_print_dump_info (REPORT_DETAILS, UNKNOWN_LOC))
> ! print_generic_expr (vect_dump, epilog_stmt, TDF_SLIM);
> ! }
It would be better to extract the first two elements to begin,
rather than adding to scalar_initial_def. In the case of 2-wide
vectors, this results in one addition rather than two.
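To make the operation count concrete, here is a minimal sketch in plain C rather than GCC trees. It assumes an additive reduction over ints, and it assumes the initial value is accounted for elsewhere (for example, already present in the vector accumulator), so dropping it from the epilogue does not change the result. The names reduce_op, op_count, reduce_from_init and reduce_from_first_two are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter: number of scalar reduction operations performed.  */
int op_count;

/* Hypothetical stand-in for the reduction operation `code' (here: plus).  */
static int
reduce_op (int a, int b)
{
  op_count++;
  return a + b;
}

/* Shape of the quoted loop: start from the scalar initial value and fold
   in every element, so an n-element vector costs n operations.  */
int
reduce_from_init (const int *v, size_t n, int init)
{
  int s = init;
  for (size_t i = 0; i < n; i++)
    s = reduce_op (s, v[i]);
  return s;
}

/* Suggested shape: seed with the first two elements, then fold in the
   rest, so an n-element vector costs n - 1 operations.  */
int
reduce_from_first_two (const int *v, size_t n)
{
  assert (n >= 2);
  int s = reduce_op (v[0], v[1]);
  for (size_t i = 2; i < n; i++)
    s = reduce_op (s, v[i]);
  return s;
}
```

For a 2-wide vector the first form performs two operations and the second performs one, which is the saving described above.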
> ! ;; Vector shift left in bits. Currently supported only for shift
> ! ;; amounts that can be expressed as byte shifts (divisible by 8).
> ! ;; General shift amounts can be supported using vslo + vsl. We're
> ! ;; not expecting to see these yet (the vectorizer currently
> ! ;; generates only shifts divisible by byte_size).
> ! (define_expand "vec_shli_<mode>"
> ! [(set (match_operand:V 0 "register_operand" "=v")
> ! (unspec:V [(match_operand:V 1 "register_operand" "v")
> ! (match_operand:QI 2 "immediate_operand" "i")] 219 ))]
In case Aldy is paying attention, a cast to TImode is a better
representation. See sse2_lshrti3.
> ! ;; todo:
> ! ;;(define_expand "reduc_uplus_v8hi"
Please remove.
r~