GCN back-end branch

Martin Jambor mjambor@suse.cz
Thu Mar 16 16:37:00 GMT 2017


After working on the GCN back-end in a private branch, we would like to make
it public and invite the community to have a look, comment, review or
even contribute.  Therefore we have just pushed the current state of
the back-end to the git branch gcn (see
https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/gcn or
fetch it as any other git branch).

We have decided not to have ChangeLog.gcn files but if you wish to
contribute, please make standard changelog entries part of commit
messages.  Additionally, the basic git collaboration rules should
apply, most notably make sure you do not do non-fast-forward pushes to
the branch, start your commit messages with one-line brief summaries
and so forth.  Any patches against the branch should be sent to
gcc-patches, and while I think that full-blown reviews are not
necessary at this stage, please coordinate with me and Honza before
you commit anything.  I will be making regular merges from trunk.

At this point, the back-end can compile small kernels open-coded in C
with target-specific attributes, built-ins and address spaces to make
use of the various special characteristics of the architecture.
Eventually, it should of course provide for high-level programming
models, most notably OpenMP, but the list of steps we need to take
before we get there is very long.

The changelog of the branch initial commit is below.  Apart from a new
machine description, it also contains a few modifications to the
compiler proper, most of which are needed to increase the limit on
size of scalar types and the number of arguments of an instruction
(which are actually not strictly necessary now but we have bumped into
them during development).  We plan to commit generally useful generic
changes early in stage1.

So far we have tested the output of the branch only on AMD APUs; we have
not tested it on discrete GPUs yet.  To run the kernels, you need quite a
few more pieces in your software stack in addition to our branch and
the hardware.  Most notably, you currently need:

  1) an AMDGPU-LLVM-based assembler,
  2) the amdphdrs utility from
    https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra, and
  3,4,5) the ROCK kernel, the ROCT thunk interface library and the ROCR
    run time library, which you can get from
    https://github.com/RadeonOpenCompute (or currently from
    if you use openSUSE Tumbleweed; so far I have packaged only
    version 1.3, but that has been sufficient).

The work-flow is as follows: you configure the branch with
--target=amdgcn-unknown-amdhsa and use it to compile the kernel into
assembly, which you then feed to the llvm-mc amdgcn assembler.  The
amdphdrs tool then converts the resultant object file into an AMD HSA
"code object", which the ROCR run time can load and execute.  Honza and
I hope to come up with an article demonstrating what can already be
done with the branch soon, but that is clearly out of scope of this
already too long announcement.  We plan to write a wiki page with some
examples and more detailed descriptions of some basic problems with
modeling GCN in GCC.
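
As a rough illustration of the three-step work-flow described above,
here is a small Python sketch that only assembles the command lines; it
does not run anything.  Several details are assumptions rather than
facts from this announcement: the driver name
amdgcn-unknown-amdhsa-gcc (the usual name for a cross compiler
configured with that target), the -S and -O2 options, the llvm-mc
options and the "carrizo" CPU name, and the "input output" argument
order for amdphdrs.  Please check them against your own installation.

```python
import shlex
from pathlib import Path


def build_pipeline(kernel_c, mcpu="carrizo", gcc="amdgcn-unknown-amdhsa-gcc"):
    """Return the three command lines (as argv lists) for one kernel source.

    All tool names and options are illustrative assumptions; adjust them
    to match the tools you actually have installed.
    """
    src = Path(kernel_c)
    asm = src.with_suffix(".s")           # assembly emitted by the GCC branch
    obj = src.with_suffix(".o")           # object file produced by llvm-mc
    code_object = src.with_suffix(".co")  # HSA code object made by amdphdrs
    return [
        # Step 1: compile the C kernel to GCN assembly.
        [gcc, "-S", "-O2", str(src), "-o", str(asm)],
        # Step 2: assemble it with the LLVM-based assembler.
        ["llvm-mc", "-arch=amdgcn", f"-mcpu={mcpu}", "-filetype=obj",
         str(asm), "-o", str(obj)],
        # Step 3: convert the object file into a code object for ROCR.
        ["amdphdrs", str(obj), str(code_object)],
    ]


if __name__ == "__main__":
    for argv in build_pipeline("kernel.c"):
        print(shlex.join(argv))
```

Running the script prints one command per step, which you can inspect
and adapt before executing by hand.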

Thus, let me conclude by saying that I'm looking forward to taking on
the many challenges this architecture will present for GCC, and I would
like to invite everyone interested to help tackle them.


2017-03-10  Jan Hubicka  <jh@suse.cz>
	    Martin Jambor  <mjambor@suse.cz>

	* config.sub: Added amdgcn cases.

	* common/config/gcn/gcn-common.c: New file.
	* config/gcn/constraints.md: Likewise.
	* config/gcn/gcn-builtins.def: Likewise.
	* config/gcn/gcn-c.c: Likewise.
	* config/gcn/gcn-hsa.h: Likewise.
	* config/gcn/gcn-modes.def: Likewise.
	* config/gcn/gcn-protos.h: Likewise.
	* config/gcn/gcn-valu.md: Likewise.
	* config/gcn/gcn.c: Likewise.
	* config/gcn/gcn.h: Likewise.
	* config/gcn/gcn.md: Likewise.
	* config/gcn/gcn.opt: Likewise.
	* config/gcn/predicates.md: Likewise.
	* config/gcn/t-gcn-elf: Likewise.
	* ira.c (ira_init_register_move_cost): Also check that
	* combine.c (gen_lowpart_or_truncate): Return clobber if there is
	not an integer mode of the same size as x.
	(gen_lowpart_for_combine): Fail if there is no integer mode of the
	same size.
	* config.gcc: Added amdgcn cases.
	* emit-rtl.c (get_mem_align_offset): Return zero for overaligned
	* explow.c (memory_address_addr_space): Call memory_address_addr_space
	if a representation by a single register is invalid.
	* expr.c (expand_expr_real_1): Disable converting operand to fields or
	BLK mode.
	* ira-costs.c (setup_allocno_class_and_costs): Do not assert that
	cost_classes_ptr->hard_regno_index is non-negative.
	* lra-constraints.c (process_alt_operands): Do not penalize constants.
	(curr_insn_transform): Allow constants in optional reloads.
	* lra-int.h (lra_static_insn_data): Make n_operands, n_dups and
	n_alternatives unsigned.
	* print-rtl.c (print_rtx_operand_codes_E_and_V): Print how many times
	the same elements are repeated rather than printing all of them.
	* recog.h (recog_data_d): Make dup_num, n_operands, n_dups and
	n_alternatives unsigned.
	(insn_data_d): Make n_generator_args, n_operands, n_dups and
	n_alternatives unsigned.
	* reload1.c (elimination_costs_in_insn): Change size of orig_dup to
	* simplify-rtx.c (simplify_merge_mask): New function.
	(simplify_ternary_operation): Use it, also see if VEC_MERGEs with the
	same masks are used in op1 or op2.
