[PATCH] [GCN] Fix handling of VCC_CONDITIONAL_REG

Andrew Stubbs ams@codesourcery.com
Thu Nov 14 13:23:00 GMT 2019


On 14/11/2019 12:43, Kwok Cheung Yeung wrote:
> Hello
> 
> This patch fixes an issue seen in the following test cases on AMD GCN:
> 
> libgomp.oacc-fortran/gemm.f90
> libgomp.oacc-fortran/gemm-2.f90
> libgomp.c/for-5-test_ttdpfs_ds128_auto.c
> libgomp.c/for-5-test_ttdpfs_ds128_guided32.c
> libgomp.c/for-5-test_ttdpfs_ds128_runtime.c
> libgomp.c/for-5-test_ttdpfs_ds128_static.c
> libgomp.c/for-5-test_ttdpfs_ds128_static32.c
> libgomp.c/for-6-test_tdpfs_ds128_auto.c
> libgomp.c/for-6-test_tdpfs_ds128_guided32.c
> libgomp.c/for-6-test_tdpfs_ds128_runtime.c
> libgomp.c/for-6-test_tdpfs_ds128_static.c
> libgomp.c/for-6-test_tdpfs_ds128_static32.c
> libgomp.c-c++-common/for-5.c
> libgomp.c-c++-common/for-6.c
> 
> The compiler is generating code like this:
> 
>      v_cmp_gt_i32    vcc, s3, v1
>      v_writelane_b32    v3, vcc_lo, 0    ; Move VCC to v[3:4]
>      v_writelane_b32    v4, vcc_hi, 0
>      ...
>      v_cmp_eq_f32    vcc, v20, v0    ; Clobber VCC
>      ...
>      ; Move old VCC_LO into s62 - this is okay as v3 has not
>      ; been clobbered.
>      v_readlane_b32    s62, v3, 0
> 
>      ; Move old VCC_HI into s63 - this is _not_ okay as VCC_HI
>      ; has been clobbered by the intervening v_cmp_eq_f32 instruction.
>      s_mov_b32    s63, vcc_hi
> 
> GCC is failing to notice that the v_cmp_eq_f32 instruction clobbers both 
> vcc_lo and vcc_hi. This is because gcn_hard_regno_nregs uses 
> REGNO_REG_CLASS to determine the register class of the queried hard reg, 
> but vcc_lo and vcc_hi are classified as GENERAL_REGS by 
> gcn_regno_reg_class. gcn_hard_regno_nregs therefore returns 1 for the 
> BImode store into vcc, so vcc_hi is regarded as untouched by the operation.
> 
> This is fixed by making gcn_regno_reg_class classify vcc_lo and vcc_hi 
> into VCC_CONDITIONAL_REG (REGNO_REG_CLASS is supposed to return the 
> smallest class anyway). I have also added a spill class for 
> VCC_CONDITIONAL_REG (into SGPR_REGS) to avoid expensive spills into memory.
> 
> Built for and tested on the AMD GCN target with no regressions.
> 
> Okay for trunk?
> 
> Kwok
> 
> 
> 2019-11-14  Kwok Cheung Yeung  <kcy@codesourcery.com>
> 
>      gcc/
>      * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG
>      register class for VCC_LO and VCC_HI.
>      (gcn_spill_class): Use SGPR_REGS to spill registers in
>      VCC_CONDITIONAL_REG.
> ---
>   gcc/config/gcn/gcn.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
> index 976843b..dd89c26 100644
> --- a/gcc/config/gcn/gcn.c
> +++ b/gcc/config/gcn/gcn.c
> @@ -470,6 +470,9 @@ gcn_regno_reg_class (int regno)
>       {
>       case SCC_REG:
>         return SCC_CONDITIONAL_REG;
> +    case VCC_LO_REG:
> +    case VCC_HI_REG:
> +      return VCC_CONDITIONAL_REG;
>       case VCCZ_REG:
>         return VCCZ_CONDITIONAL_REG;
>       case EXECZ_REG:
> @@ -637,7 +640,8 @@ gcn_can_split_p (machine_mode, rtx op)
>   static reg_class_t
>   gcn_spill_class (reg_class_t c, machine_mode /*mode */ )
>   {
> -  if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c))
> +  if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c)
> +      || c == VCC_CONDITIONAL_REG)
>       return SGPR_REGS;
>     else
>       return NO_REGS;

OK.

Andrew



More information about the Gcc-patches mailing list