This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: RFA (fold): PATCH for c++/49290 (folding *(T*)(ar+10))
On Jun 13, 2011, at 3:57 AM, Richard Guenther wrote:
> That's not exactly an example - I can't think of how you want or need
> to use VIEW_CONVERT_EXPRs to implement said divmod instruction or why
> you would need anything special for the _argument_ of said instruction.
Oh, I completely misunderstood your question. My case, as I previously stated, involved two vector types that were identical, save for the name of the type:
mod = a%b
where mod didn't have the type of the expression (a%b), so someone created the VIEW_CONVERT_EXPR on the mod. The person creating it _thought_ it would be an rvalue context, but ultimately, it was an lvalue context. We discover the lvalue/rvalue state of the expression at target_fold_builtin time. The actual code looks more like:
__builtin_divmod (div, mod, a, b);
In fold_builtin, we do all the processing to handle the semantics.
> An
> instruction or call with multiple outputs would simply be something
> like
>
> { div_1, mod_2 } = __builtin_divmod (arg_3);
>
> with two SSA defs. A nice representation for the tree for { div_1,
> mod_2 } remains to be found (if it should be a single tree at all, or
> possibly multiple ones).
At target_fold_builtin time we regenerate it as:
s = builtin_divmod_final (a, b);
div_1 = s.div
mod_2 = s.mod
and generate a type { div, mod } on the fly. We expect the optimizer to handle extra moves reasonably, and we want to keep the one instruction as one unit.
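To make the shape of that rewrite concrete, here is a rough sketch in plain C of the two forms: the user-visible call with two output arguments, and the regenerated form with a single aggregate result that is then picked apart. The names divmod_result, divmod_final and divmod are made up for illustration and scalar unsigned long stands in for the vector types; this is not the code from our port.

```c
#include <assert.h>

/* Hypothetical stand-in for the { div, mod } type generated on the fly. */
struct divmod_result {
    unsigned long div;
    unsigned long mod;
};

/* Hypothetical stand-in for builtin_divmod_final: one call, one
   aggregate result, so the operation stays a single unit. */
static struct divmod_result divmod_final(unsigned long a, unsigned long b)
{
    assert(b != 0);  /* division by zero is outside this sketch */
    struct divmod_result s = { a / b, a % b };
    return s;
}

/* The form with two output arguments, expressed through the
   aggregate form followed by two component extractions. */
static void divmod(unsigned long *div, unsigned long *mod,
                   unsigned long a, unsigned long b)
{
    struct divmod_result s = divmod_final(a, b);
    *div = s.div;  /* div_1 = s.div */
    *mod = s.mod;  /* mod_2 = s.mod */
}
```

The two stores at the end are the extra moves we expect the optimizer to clean up.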
> We already play tricks for sincos for example via
>
> tem_1 = __builtin_cexpi (arg_2);
> sin_3 = REALPART_EXPR <tem_1>;
> cos_4 = IMAGPART_EXPR <tem_1>;
>
> which avoids the two defs by using a single def which is then decomposed.
>
> So, can you elaborate a bit more on what you want to do with special
> argument kinds? Elaborate with an actual example, not words.
We support tagging any parameter to a builtin as define_outputs, define_inputs or define_in_outs in the part of the .md file that describes the builtins for the machine. The actual divmod builtin, for example, is:
(define_builtin "divmod<T_ALL_DI:sign_u>" "divmod<T_ALL_DI:sign_u>_<type>"
[
(define_outputs [(var_operand:T_ALL_DI 0) ;;dividend
(var_operand:T_ALL_DI 1)]) ;;mod
(define_inputs [(var_operand:T_ALL_DI 2)
(var_operand:T_ALL_DI 3)])
(define_rtl_pattern "<T_ALL_DI:sign_u>divmod<m_mode>4" [0 1 2 3])
(attributes [pure])
]
)
That's the actual code. The testcase looks like:
t_v4udi_0 = divmodu_t_v4udi (t_v4udi_1, t_v4udi_2, t_v4udi_3);
The VIEW_CONVERT_EXPR looks like:
<view_convert_expr 0x7ffff5a872d0
type <vector_type 0x7ffff7f4b930 __attribute__((vector_size(32))) unsigned long
type <integer_type 0x7ffff7e8c690 long unsigned int public unsigned DI
size <integer_cst 0x7ffff7e76730 constant 64>
unit size <integer_cst 0x7ffff7e76758 constant 8>
align 64 symtab 0 alias set -1 canonical type 0x7ffff7e8c690 precision 64 min <integer_cst 0x7ffff7e76780 0> max <integer_cst 0x7ffff7e76708 18446744073709551615>
pointer_to_this <pointer_type 0x7ffff7f50738> reference_to_this <reference_type 0x7ffff7f513f0>>
unsigned V4DI
size <integer_cst 0x7ffff7f43de8 constant 256>
unit size <integer_cst 0x7ffff7e76348 constant 32>
align 256 symtab 0 alias set -1 canonical type 0x7ffff7f4b930 nunits 4 reference_to_this <reference_type 0x7ffff7f51540>>
arg 0 <var_decl 0x7ffff59d9640 t_v4udi_1
type <vector_type 0x7ffff5ac3888 type <integer_type 0x7ffff7e8c690 long unsigned int>
unsigned V4DI size <integer_cst 0x7ffff7f43de8 256> unit size <integer_cst 0x7ffff7e76348 32>
align 256 symtab 0 alias set -1 canonical type 0x7ffff5ac3888 nunits 4>
used public static unsigned V4DI defer-output file t22.c line 262 col 48 size <integer_cst 0x7ffff7f43de8 256> unit size <integer_cst 0x7ffff7e76348 32>
align 256>>
Hopefully, somewhere above is an example of what you wanted to see. If not, let me know what you'd like to see.