[PATCH][PR66010] Don't take address of ap unless necessary
Richard Biener
rguenther@suse.de
Mon May 11 07:47:00 GMT 2015
On Fri, 8 May 2015, Tom de Vries wrote:
> Hi,
>
> this patch fixes PR66010.
>
>
> I.
>
> Consider this test-case, with a va_list passed from f2 to f2_1:
> ...
> #include <stdarg.h>
>
> inline int __attribute__((always_inline))
> f2_1 (va_list ap)
> {
> return va_arg (ap, int);
> }
>
> int
> f2 (int i, ...)
> {
> int res;
> va_list ap;
>
> va_start (ap, i);
> res = f2_1 (ap);
> va_end (ap);
>
> return res;
> }
> ...
>
> When compiling at -O2, before eline we have:
> ...
> f2_1 (struct * apD.1832)
> {
> intD.6 _3;
>
> # .MEM_2 = VDEF <.MEM_1(D)>
> # USE = anything
> # CLB = anything
> _3 = VA_ARG (&apD.1832, 0B);
>
> # VUSE <.MEM_2>
> return _3;
> }
>
> f2 (intD.6 iD.1835)
> {
> struct apD.1839[1];
> intD.6 resD.1838;
> intD.6 _6;
>
> # .MEM_2 = VDEF <.MEM_1(D)>
> # USE = anything
> # CLB = anything
> __builtin_va_startD.1030 (&apD.1839, 0);
>
> # .MEM_3 = VDEF <.MEM_2>
> # USE = anything
> # CLB = anything
> res_4 = f2_1D.1833 (&apD.1839);
>
> # .MEM_5 = VDEF <.MEM_3>
> # USE = anything
> # CLB = anything
> __builtin_va_endD.1029 (&apD.1839);
>
> _6 = res_4;
>
> # .MEM_7 = VDEF <.MEM_5>
> apD.1839 ={v} {CLOBBER};
>
> # VUSE <.MEM_7>
> return _6;
> }
> ...
>
> Because the va_list type is an array type:
> ...
> struct apD.1839[1];
> ...
>
> we're passing the location of the initial element:
> ...
> res_4 = f2_1D.1833 (&apD.1839);
> ...
>
> And the type of the parameter in f2_1 is accordingly a pointer to array
> element:
> ...
> f2_1 (struct * apD.1832)
> ...
>
> That means the address operator here is superfluous.
> ...
> _3 = VA_ARG (&apD.1832, 0B);
> ...
> Or, differently put, when we take the address of ap in va_start and va_end in
> f2, the result is a pointer to array element type.
> When we take the address of ap in f2_2, the result is a pointer to pointer to
> array element type.
>
> This extra indirection doesn't cause wrong code to be generated. The va_arg
> expansion code handles this correctly, thanks to the combination of:
> - an unconditional build_fold_indirect_ref in expand_ifn_va_arg_1 which
> removes
> an indirection, and
> - a fixup in gimplify_va_arg_internal that again adds an indirection in some
> cases.
>
> The call gets inlined, and before pass_stdarg we have:
> ...
> f2 (intD.6 iD.1835)
> {
> struct * apD.1849;
> struct apD.1839[1];
> intD.6 _6;
>
> # .MEM_2 = VDEF <.MEM_1(D)>
> # USE = nonlocal escaped
> # CLB = nonlocal escaped { D.1839 } (escaped)
> __builtin_va_startD.1030 (&apD.1839, 0);
>
> # .MEM_3 = VDEF <.MEM_2>
> apD.1849 = &apD.1839;
>
> # .MEM_7 = VDEF <.MEM_3>
> # USE = nonlocal null { D.1839 D.1849 } (escaped)
> # CLB = nonlocal null { D.1839 D.1849 } (escaped)
> _6 = VA_ARG (&apD.1849, 0B);
> ...
>
> And after expanding ifn_va_arg, we have:
> ...
> f2 (intD.6 iD.1835)
> {
> struct * ap.3D.1853;
> struct apD.1839[1];
>
> # .MEM_2 = VDEF <.MEM_1(D)>
> # USE = nonlocal escaped
> # CLB = nonlocal escaped { D.1839 } (escaped)
> __builtin_va_startD.1030 (&apD.1839, 0);
>
> ap_3 = &apD.1839;
>
> ap.3_10 = ap_3;
>
> # VUSE <.MEM_2>
> _11 = ap.3_10->gp_offsetD.2;
> <SNIP>
> ...
>
> The pass_stdarg optimization fails:
> ...
> f2: va_list escapes 1, needs to save all GPR units and all FPR units.
> ...
>
> It fails on the superfluous address operator:
> ...
> va_list escapes in ap_3 = &apD.1839;
> ...
>
>
> II.
>
> The patch prevents the superfluous address operator from being added.
>
> It also removes the need for the fixup in gimplify_va_arg_internal, by
> deciding in gimplify_va_arg_expr whether the build_fold_indirect_ref in
> expand_ifn_va_arg_1 is needed before passing ap on to
> gimplify_va_arg_internal. This decision is encoded as an extra argument to
> ifn_va_arg.
>
>
> III.
>
> Using the patch, before inlining we can see the address operator has been
> removed in va_arg:
> ...
> f2_1 (struct * apD.1832)
> {
> intD.6 _4;
>
> # .MEM_3 = VDEF <.MEM_1(D)>
> # USE = anything
> # CLB = anything
>
> _4 = VA_ARG (ap_2(D), 0B, 0);
> # VUSE <.MEM_3>
> return _4;
> }
> ...
>
> And after inlining, we see that va_start and va_arg now use the same ap:
> ...
> f2 (intD.6 iD.1835)
> {
> struct apD.1839[1];
>
> # .MEM_2 = VDEF <.MEM_1(D)>
> # USE = anything
> # CLB = anything
> __builtin_va_startD.1030 (&apD.1839, 0);
>
> # .MEM_3 = VDEF <.MEM_2>
> # USE = anything
> # CLB = anything
> _8 = VA_ARG (&apD.1839, 0B, 0);
> ...
>
> That allows the pass_stdarg optimization to succeed:
> ...
> f2: va_list escapes 0, needs to save 8 GPR units and 0 FPR units.
> ...
>
> Bootstrapped and reg-tested on x86_64, with and without -m32.
>
> OK for trunk?
>
> [ FWIW, I suspect this patch will make life easier for the reimplementation of
> the pass_stdarg optimization using ifn_va_arg. ]
+ if (canon_va_type != NULL)
+ {
+ if (!(TREE_CODE (canon_va_type) == ARRAY_TYPE
+ && TREE_CODE (va_type) != ARRAY_TYPE))
+ /* In gimplify_va_arg_expr we take the address of the ap argument,
mark
+ it addressable now. */
+ mark_addressable (expr);
can we "simplify" this and ...
- }
-
+ gcc_assert (TREE_CODE (TREE_TYPE (valist)) != ARRAY_TYPE);
gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
this to use [!]POINTER_TYPE_P ()? Why do we arrive with non-array
va_type but array canon_va_type in build_va_arg? I suppose the
va_list argument already decayed to a pointer then (in which case
the argument should already be addressable?)?
I think the overall idea of the patch is good - I'm just worried about
special-casing of ARRAY_TYPE vs. non-pointer-type (because it's the
latter that we ultimately want...).
Thanks,
Richard.
More information about the Gcc-patches
mailing list