This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH, middle-end]: Fix mode-switching MODE_EXIT check with __builtin_apply/__builtin_return


Hello!

Vzeroupper placement patch uses MODE_EXIT to determine if vzeroupper
has to be placed before function exit. However, when compiling
following test

--cut here--
typedef struct objc_class *Class;
typedef struct objc_object
{
  Class class_pointer;
} *id;

typedef const struct objc_selector *SEL;
typedef void * retval_t;
typedef void * arglist_t;

extern retval_t __objc_forward (id object, SEL sel, arglist_t args);

double
__objc_double_forward (id rcv, SEL op, ...)
{
  void *args, *res;

  args = __builtin_apply_args ();
  res = __objc_forward (rcv, op, args);
  __builtin_return (res);
}
--cut here--

__builtin_return emits a sequence of loads from "argument area" to all
possible function return registers in their widest mode (together with
corresponding USE RTXes), creating:

(insn 31 30 33 2 (set (reg:DI 0 ax)
        (mem:DI (reg/v/f:DI 60 [ res ]) [0 S8 A8])) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 33 31 35 2 (set (reg:XF 8 st)
        (mem:XF (plus:DI (reg/v/f:DI 60 [ res ])
                (const_int 16 [0x10])) [0 S16 A8])) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 35 33 32 2 (set (reg:OI 21 xmm0)
        (mem:OI (plus:DI (reg/v/f:DI 60 [ res ])
                (const_int 32 [0x20])) [0 S32 A8])) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 32 35 34 2 (use (reg:DI 0 ax)) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 34 32 36 2 (use (reg:XF 8 st)) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 36 34 44 2 (use (reg:OI 21 xmm0)) avx-vzeroupper-27.c:23 -1
     (nil))

This sequence breaks assumption in mode-switching.c, that final
function return register load operates in MODE_EXIT mode and triggere
following code:

		    for (j = n_entities - 1; j >= 0; j--)
		      {
			int e = entity_map[j];
			int mode = MODE_NEEDED (e, return_copy);

			if (mode != num_modes[e] && mode != MODE_EXIT (e))
			  break;
		      }

As discussed above, modes of loads, generated from __builtin_apply
have no connection to function return mode. mode-switching.c does
detect __builtin_apply situation and raises maybe_builtin_apply flag,
but doesn't use it to short-circuit wrong check. In proposed patch, we
detect this situation and raise force_late_switch in the same way, as
SH4 does for its "late" fpscr emission. For the testcase above,
following RTX sequence is generated:

(insn 31 30 33 2 (set (reg:DI 0 ax)
        (mem:DI (reg/v/f:DI 60 [ res ]) [0 S8 A8]))
avx-vzeroupper-27.c:23 63 {*movdi_internal_rex64}
     (nil))
(insn 33 31 35 2 (set (reg:XF 8 st)
        (mem:XF (plus:DI (reg/v/f:DI 60 [ res ])
                (const_int 16 [0x10])) [0 S16 A8]))
avx-vzeroupper-27.c:23 106 {*movxf_internal}
     (nil))
(insn 35 33 52 2 (set (reg:OI 21 xmm0)
        (mem:OI (plus:DI (reg/v/f:DI 60 [ res ])
                (const_int 32 [0x20])) [0 S32 A8]))
avx-vzeroupper-27.c:23 60 {*movoi_internal_avx}
     (expr_list:REG_DEAD (reg/v/f:DI 60 [ res ])
        (nil)))
(insn 52 35 49 2 (unspec_volatile [
            (const_int 9 [0x9])
        ] UNSPECV_VZEROUPPER) -1
     (nil))
(note 49 52 32 2 NOTE_INSN_DELETED)
(insn 32 49 34 2 (use (reg:DI 0 ax)) avx-vzeroupper-27.c:23 -1
     (expr_list:REG_DEAD (reg:DI 0 ax)
        (nil)))
(insn 34 32 36 2 (use (reg:XF 8 st)) avx-vzeroupper-27.c:23 -1
     (expr_list:REG_DEAD (reg:XF 8 st)
        (nil)))
(insn 36 34 44 2 (use (reg:OI 21 xmm0)) avx-vzeroupper-27.c:23 -1
     (nil))
(insn 44 36 0 2 (use (reg/i:DF 21 xmm0)) avx-vzeroupper-27.c:24 -1

where mode switching correctly placed vzeroupper insn, based on
MODE_EXIT mode, determined from final use in (insn 44).

2012-11-04  Vladimir Yakovlev  <vladimir.b.yakovlev@intel.com>
	    Uros Bizjak  <ubizjak@gmail.com>

        * mode-switching.c (create_pre_exit): Added code for
maybe_builtin_apply case.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu,
with vzeroupper patch [1] applied.

I have added SH4 maintainer for possible comments.

OK for mainline SVN?

[1] http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00292.html

Uros.

--
Index: mode-switching.c
===================================================================
--- mode-switching.c    (revision 193133)
+++ mode-switching.c    (working copy)
@@ -342,6 +342,16 @@ create_pre_exit (int n_entities, int *entity_map,
                      }
                    if (j >= 0)
                      {
+                       /* __builtin_return emits a sequence of loads to all
+                          function value registers in their widest mode,
+                          which breaks the assumption on the mode of the
+                          return register load. Allow this situation, so the
+                          final mode switch will be emitted after the load.  */
+                       if (maybe_builtin_apply
+                           && targetm.calls.function_value_regno_p
+                               (copy_start))
+                         forced_late_switch = 1;
+
                        /* For the SH4, floating point loads depend on fpscr,
                           thus we might need to put the final mode switch
                           after the return value copy.  That is still OK,


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]