This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[OG9, amdgcn, committed] Use GFX9 granulated sgprs count correctly
- From: Andrew Stubbs <ams at codesourcery dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 10 Sep 2019 12:38:19 +0100
- Subject: [OG9, amdgcn, committed] Use GFX9 granulated sgprs count correctly
This patch adjusts the "granulated sgpr count" kernel setting for
GFX9 devices.
I followed the description I found here:
http://llvm.org/docs/AMDGPUUsage.html
Basically, GFX9 allocates SGPRs in blocks of 16, not 8, so with the old
calculation there was some danger of requesting too many registers,
which would hurt performance.
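To illustrate the arithmetic, here is a minimal stand-alone sketch of the two calculations. The enum and the function name granulated_sgprs_for are hypothetical and exist only for this example; the patch itself computes the value inline using TARGET_GCN3 and TARGET_GCN5, and passes sgpr + extra_regs as the register count.

```c
/* Hypothetical ISA selector for this sketch only; the real code tests
   TARGET_GCN3 (GFX8) and TARGET_GCN5 (GFX9).  */
enum gcn_isa_sketch { SKETCH_GFX8, SKETCH_GFX9 };

/* Return the granulated SGPR count for NSGPRS registers (assumed >= 1).  */
static int
granulated_sgprs_for (enum gcn_isa_sketch isa, int nsgprs)
{
  if (isa == SKETCH_GFX8)
    /* GFX8: round up to a block of 8, then count blocks minus one.  */
    return (nsgprs + 7) / 8 - 1;

  /* GFX9: round up to a block of 16, count blocks minus one, then
     double so the field is still expressed in units of 8.  */
  return 2 * ((nsgprs + 15) / 16 - 1);
}
```

For example, with 48 registers the GFX8 formula gives 5, whereas the GFX9 formula gives 4, so reusing the GFX8 formula on GFX9 could ask for one more 8-register unit than the 16-register granularity requires.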
Andrew
Use GFX9 granulated sgprs count correctly.
2019-09-10 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate
granulated_sgprs according to architecture.
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 66854b6f9c5..f8434e4a4f1 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -4884,6 +4884,14 @@ gcn_hsa_declare_function_name (FILE *file, const char *name, tree)
sgpr = 102 - extra_regs;
}
+ /* GFX8 allocates SGPRs in blocks of 8.
+ GFX9 uses blocks of 16. */
+ int granulated_sgprs;
+ if (TARGET_GCN3)
+ granulated_sgprs = (sgpr + extra_regs + 7) / 8 - 1;
+ else if (TARGET_GCN5)
+ granulated_sgprs = 2 * ((sgpr + extra_regs + 15) / 16 - 1);
+
fputs ("\t.align\t256\n", file);
fputs ("\t.type\t", file);
assemble_name (file, name);
@@ -4922,7 +4930,7 @@ gcn_hsa_declare_function_name (FILE *file, const char *name, tree)
"\t\tcompute_pgm_rsrc2_excp_en = 0\n",
(vgpr - 1) / 4,
/* Must match wavefront_sgpr_count */
- (sgpr + extra_regs + 7) / 8 - 1,
+ granulated_sgprs,
/* The total number of SGPR user data registers requested. This
number must match the number of user data registers enabled. */
cfun->machine->args.nsgprs);