RFC: Merge the GUPC branch into the GCC 6.0 trunk

Gary Funck gary@intrepid.com
Tue Dec 1 05:31:00 GMT 2015


Some time ago, we submitted an RFC for the introduction of
UPC support into GCC.  During the intervening time period,
we have continued to keep the 'gupc' (GNU UPC) branch in sync
with the GCC trunk and have incorporated feedback and contributions from
various GCC developers (Joseph Myers, Tom Tromey, Jakub Jelinek,
Richard Henderson, Meador Inge, and others).  We have also implemented
various bug fixes and improvements.

At this time, we would like to re-submit the UPC patches for comment
with the goal of introducing these changes into GCC 6.0.

This email provides an overview of UPC and summarizes the
impact of UPC changes on the GCC front-end.

Subsequent emails will include various patch sets which are grouped
by the area of GCC that they impact (front-end, generic, documentation,
build, test, target-specific, and so on), so that they can receive
a more focused review by their respective maintainers.

The main review-related changes are:

* GUPC is no longer implemented as a separate language compiler
(in the way that Objective-C or C++ are).  Rather, a new -fupc switch
has been added, which enables UPC support in the C compiler.

* The UPC blocking factor now only uses two of the tree's
"spare" bits.  If the UPC blocking factor is not the default
value of 1 or the "indefinite" value of 0, then it is recorded
in a separate hash table, indexed by the tree node.

* UPC-specific tree support has been integrated into
gcc/c-family/c-common.def and gcc/c-family/c-common.h.

* The number of UPC-specific configuration options
has been reduced.

* The UPC pointer-to-shared format per-target configuration
has been simplified.  Before, both a "packed" and a "struct"
pointer-to-shared representation were supported.  Now, only
the "struct" format is supported, and the various configuration
options for tweaking field sizes and such have been removed.

* In keeping with current GCC development guidelines,
target macros are no longer used.  Rather, where needed,
target hooks are defined and used.

* FIXMEs and TODOs were either fixed or cleaned up.

* The copyright and license notices were updated.

* The code was reviewed for conformance to coding standards and updated.

* Diagnostics now use appropriate format strings rather than building
up the strings with sprintf().

* Files in c-family/ no longer include c-tree.h to conform with modularization
improvements.

* Most of the #ifdef conditionals have been removed.  Some target hooks
have been defined and documented in tm.texi.

* The code was reviewed to verify that it conforms with
current GCC coding practices and that it incorporates cleanups
done in the past several years.

* Comments were added to most new functions, and typos and
spelling errors in comments were fixed.

* Changes that appeared in the diffs that were unrelated to UPC
were removed or incorporated into the trunk.

* The linkage to the libgupc library was changed to use the newly
defined method (used in libgomp/libgo, for example) of including
library 'spec' files.  This led to a simplification where we no
longer need to add UPC-specific spec files in various
target-specific config directories.

Introduction: UPC-related Changes
---------------------------------

Below, various UPC-related changes are summarized.
This introduction is provided as background for review of the UPC
changes implemented in the GUPC branch.  Each individual change will be
discussed in more detail in the patch sets found in the following emails.

The current GUPC branch is based upon a recent version of the GCC trunk
and has been bootstrapped on x86_64/i686 Linux, x86_64
Darwin, IA64/Altix Linux, PowerPC Power7 (big endian), and Power8
(little endian).  Some testing has also been done on various flavors
of BSD and Solaris; in the past, MIPS was also tested and supported.

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

In the discussion below, some changes are excerpted in order to
highlight important aspects of the changes.

UPC's Shared Qualifier and Layout Qualifier
-------------------------------------------

The UPC language specification describes
the language syntax and semantics:
  http://upc.lbl.gov/publications/upc-spec-1.3.pdf

UPC introduces a new qualifier, "shared", that indicates that the
qualified object is located in a global shared address space that is
accessible by all UPC threads.  Additional qualifiers ("strict" and
"relaxed") further specify the semantics of accesses to
UPC shared objects.

In UPC, a shared qualified array can optionally specify a "layout
qualifier" that indicates how the shared data is blocked and
distributed across UPC threads.

There are two language pre-defined identifiers that indicate the
number of threads that will be created when the program starts
(THREADS) and the current (zero-based) thread number (MYTHREAD).
Typically, a UPC thread is implemented as an operating system process,
though UPC threads may instead be mapped to pthreads when the program
is compiled with the -fupc-pthreads-model-tls switch.

Access to UPC shared memory may be implemented locally via OS provided
facilities (for example, mmap), or across nodes via a high speed
network inter-connect (for example, Infiniband).

GUPC provides a runtime (libgupc) that targets an SMP-based system
that uses mmap() to implement global shared memory.

Optionally, GUPC can use the more general and more capable Berkeley
UPCR runtime:
  http://upc.lbl.gov/download/source.shtml#runtime
The UPCR runtime supports a number of network
topologies, and has been ported to most of the
current High Performance Computing (HPC) systems.

The following example illustrates the use of the UPC "shared" qualifier
combined with a layout qualifier.

    #define BLKSIZE 5
    #define N_PER_THREAD (4 * BLKSIZE)
    shared [BLKSIZE] double A[N_PER_THREAD*THREADS];

Above, the "[BLKSIZE]" construct is the UPC layout qualifier; it
specifies that the elements of the shared array, A, are distributed
across the threads in blocks of 5 elements.  If the program is run
with two threads, then A is distributed as shown below:

    Thread 0	Thread 1
    --------	---------
    A[ 0.. 4]	A[ 5.. 9]
    A[10..14]	A[15..19]
    A[20..24]	A[25..29]
    A[30..34]	A[35..39]

The elements shown for thread 0 are defined as having "affinity"
to thread 0.  Similarly, those elements shown for thread 1 have
affinity to thread 1.  In UPC, a pointer to a shared object can be
cast to a thread local pointer (a "C" pointer), when the designated
shared object has affinity to the referencing thread.
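
The distribution shown above follows from dealing blocks out to the
threads round-robin.  As a minimal sketch in plain C (not UPC, and not
code from the GUPC branch), the thread with affinity to a given element,
and that element's position within the thread's local portion of the
array, could be computed as follows.  The function names are
hypothetical, and the sketch assumes a blocking factor of at least 1
(it does not handle the "indefinite" value of 0).

```c
#include <assert.h>
#include <stddef.h>

/* Thread that element INDEX has affinity to, for an array declared as
   shared [BLKSIZE] T A[...] in a program running with THREADS threads.
   Blocks are dealt out round-robin: block b lands on thread b % THREADS.  */
static size_t
affinity_thread (size_t index, size_t blksize, size_t threads)
{
  assert (blksize > 0 && threads > 0);
  return (index / blksize) % threads;
}

/* Position of element INDEX, counted in elements, within the portion
   of the array that is local to its owning thread.  */
static size_t
local_index (size_t index, size_t blksize, size_t threads)
{
  assert (blksize > 0 && threads > 0);
  size_t block = index / blksize;   /* global block number */
  size_t phase = index % blksize;   /* offset within that block */
  return (block / threads) * blksize + phase;
}
```

With BLKSIZE 5 and two threads, affinity_thread (12, 5, 2) is 0 and
local_index (12, 5, 2) is 7, which matches A[12] falling in the second
block shown for thread 0.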

A UPC "pointer-to-shared" (PTS) is a pointer that references a UPC
shared object.  A UPC pointer-to-shared is a "fat" pointer with the
following logical fields:
   (virt_addr, thread, phase)

The virtual address (virt_addr) field is combined with the thread
number (thread) to derive the location of the referenced object
within the UPC shared address space.  The phase field is used to
keep track of the current block offset for PTS's that have a
blocking factor that is greater than one.
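
To illustrate how the three logical fields interact, here is a small
sketch in plain C of advancing a pointer-to-shared by one element.  It
is only an illustration under stated assumptions: the struct and
function names are hypothetical, the field widths do not reflect the
GUPC "struct" representation, and virt_addr is treated as a byte offset
within the owning thread's part of the shared space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the logical PTS fields.  */
struct pts_sketch
{
  size_t virt_addr;  /* byte offset within the owning thread's region */
  size_t thread;     /* thread the referenced element has affinity to */
  size_t phase;      /* element offset within the current block */
};

/* Advance P by one element of ELEM_SIZE bytes, for a blocking factor
   BLKSIZE >= 1 and a program running with THREADS threads.  */
static struct pts_sketch
pts_next (struct pts_sketch p, size_t elem_size, size_t blksize,
          size_t threads)
{
  assert (elem_size > 0 && blksize > 0 && threads > 0);
  p.phase++;
  p.virt_addr += elem_size;
  if (p.phase == blksize)
    {
      /* Crossed a block boundary: continue at the start of the
         corresponding block on the next thread.  */
      p.phase = 0;
      p.virt_addr -= blksize * elem_size;
      p.thread++;
      if (p.thread == threads)
        {
          /* Wrapped past the last thread: continue with the next
             block on thread 0.  */
          p.thread = 0;
          p.virt_addr += blksize * elem_size;
        }
    }
  return p;
}
```

When the blocking factor is 1, the phase is always 0 and every
increment moves to the next thread, which is why the phase field only
matters for blocking factors greater than one.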

GUPC implements pointer-to-shared objects using a "struct" representation.
Until recently, GUPC also supported a "packed" representation, which
is more space efficient, but limits the range of various fields
in the UPC pointer-to-shared representation.  We have decided to
support only the "struct" representation so that the compiler uses
a single ABI that supports the full range of addresses, threads,
and blocking factors.

GCC's internal tree representation is extended to record the UPC
"shared", "strict", and "relaxed" qualifiers, and the layout qualifier.

--- gcc/tree-core.h     (.../trunk)     (revision 228959)
+++ gcc/tree-core.h     (.../branches/gupc)     (revision 229159)
@@ -470,7 +470,11 @@ enum cv_qualifier {
   TYPE_QUAL_CONST    = 0x1,
   TYPE_QUAL_VOLATILE = 0x2,
   TYPE_QUAL_RESTRICT = 0x4,
-  TYPE_QUAL_ATOMIC   = 0x8
+  TYPE_QUAL_ATOMIC   = 0x8,
+  /* UPC qualifiers */
+  TYPE_QUAL_SHARED   = 0x10,
+  TYPE_QUAL_RELAXED  = 0x20,
+  TYPE_QUAL_STRICT   = 0x40
 };
[...]
@@ -857,9 +875,14 @@ struct GTY(()) tree_base {
       unsigned user_align : 1;
       unsigned nameless_flag : 1;
       unsigned atomic_flag : 1;
-      unsigned spare0 : 3;
-
-      unsigned spare1 : 8;
+      unsigned shared_flag : 1;
+      unsigned strict_flag : 1;
+      unsigned relaxed_flag : 1;
+
+      unsigned threads_factor_flag : 1;
+      unsigned block_factor_0 : 1;
+      unsigned block_factor_x : 1;
+      unsigned spare1 : 5;
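
As noted in the summary of changes, only two bits are used for the
blocking factor, and any value other than 0 or 1 is kept in a separate
table indexed by the tree node.  The following plain C sketch shows one
way such an encoding could be read back.  It is not the GUPC
implementation: the struct, the fixed-size array standing in for the
hash table, and the function names are all hypothetical, and the
meaning assigned to block_factor_0 and block_factor_x is an assumption
based on their names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a type node; only the two flag bits
   mirror the excerpt above.  */
struct type_sketch
{
  unsigned block_factor_0 : 1;  /* assumed: factor is "indefinite" (0) */
  unsigned block_factor_x : 1;  /* assumed: factor is in the side table */
};

/* Stand-in for the hash table keyed by tree node.  */
#define SIDE_TABLE_MAX 16
struct side_entry
{
  const struct type_sketch *node;
  unsigned long factor;
};
static struct side_entry side_table[SIDE_TABLE_MAX];
static size_t side_table_len;

/* Record FACTOR for T.  Only values other than 0 and 1 need a side
   table entry.  A stale entry left behind by an earlier, larger factor
   is harmless because the flag bits are consulted first.  */
static void
set_block_factor (struct type_sketch *t, unsigned long factor)
{
  t->block_factor_0 = (factor == 0);
  t->block_factor_x = (factor > 1);
  if (factor > 1)
    {
      for (size_t i = 0; i < side_table_len; i++)
        if (side_table[i].node == t)
          {
            side_table[i].factor = factor;
            return;
          }
      assert (side_table_len < SIDE_TABLE_MAX);
      side_table[side_table_len].node = t;
      side_table[side_table_len].factor = factor;
      side_table_len++;
    }
}

/* Return the blocking factor of T; with neither flag set it is the
   default value of 1.  */
static unsigned long
get_block_factor (const struct type_sketch *t)
{
  if (t->block_factor_0)
    return 0;
  if (!t->block_factor_x)
    return 1;
  for (size_t i = 0; i < side_table_len; i++)
    if (side_table[i].node == t)
      return side_table[i].factor;
  assert (!"block_factor_x set but no side table entry");
  return 1;
}
```

The point of such a scheme is that the common cases (the default of 1
and the indefinite value of 0) cost no storage beyond the two flag
bits.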

UPC defines a few additional tree node types:

--- gcc/c-family/c-common.def   (.../trunk)     (revision 228959)
+++ gcc/c-family/c-common.def   (.../branches/gupc)     (revision 229159)
@@ -62,6 +62,24 @@ DEFTREECODE (SIZEOF_EXPR, "sizeof_expr",
    Operand 3 is the stride.  */
 DEFTREECODE (ARRAY_NOTATION_REF, "array_notation_ref", tcc_reference, 4)

+/* Used to represent a `upc_forall' statement. The operands are
+   UPC_FORALL_INIT_STMT, UPC_FORALL_COND, UPC_FORALL_EXPR,
+   UPC_FORALL_BODY, and UPC_FORALL_AFFINITY respectively. */
+
+DEFTREECODE (UPC_FORALL_STMT, "upc_forall_stmt", tcc_statement, 5)
+
+/* Used to represent a UPC synchronization statement. The first
+   operand is the synchronization operation, UPC_SYNC_OP:
+   UPC_SYNC_NOTIFY_OP  1       Notify operation
+   UPC_SYNC_WAIT_OP    2       Wait operation
+   UPC_SYNC_BARRIER_OP 3       Barrier operation
+
+   The second operand, UPC_SYNC_ID is the (optional) expression
+   whose value specifies the barrier identifier which is checked
+   by the various synchronization operations. */
+
+DEFTREECODE (UPC_SYNC_STMT, "upc_sync_stmt", tcc_statement, 2)
+

The "C" parser is extended to recognize UPC's syntactic extensions.

--- gcc/c-family/c-common.c     (.../trunk)     (revision 228959)
+++ gcc/c-family/c-common.c     (.../branches/gupc)     (revision 229159)
@@ -412,8 +426,9 @@ static int resort_field_decl_cmp (const
    C --std=c89: D_C99 | D_CXXONLY | D_OBJC | D_CXX_OBJC
    C --std=c99: D_CXXONLY | D_OBJC
    ObjC is like C except that D_OBJC and D_CXX_OBJC are not set
-   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC
-   C++ --std=c0x: D_CONLY | D_OBJC
+   UPC is like C except that D_UPC is not set
+   C++ --std=c98: D_CONLY | D_CXXOX | D_OBJC | D_UPC
+   C++ --std=c0x: D_CONLY | D_OBJC | D_UPC
    ObjC++ is like C++ except that D_OBJC is not set
[...]
@@ -629,6 +644,19 @@ const struct c_common_resword c_common_r
   { "inout",           RID_INOUT,              D_OBJC },
   { "oneway",          RID_ONEWAY,             D_OBJC },
   { "out",             RID_OUT,                D_OBJC },
+
+  /* UPC keywords */
+  { "shared",          RID_SHARED,             D_UPC },
+  { "relaxed",         RID_RELAXED,            D_UPC },
+  { "strict",          RID_STRICT,             D_UPC },
+  { "upc_barrier",     RID_UPC_BARRIER,        D_UPC },
+  { "upc_blocksizeof", RID_UPC_BLOCKSIZEOF,    D_UPC },
+  { "upc_elemsizeof",  RID_UPC_ELEMSIZEOF,     D_UPC },
+  { "upc_forall",      RID_UPC_FORALL,         D_UPC },
+  { "upc_localsizeof", RID_UPC_LOCALSIZEOF,    D_UPC },
+  { "upc_notify",      RID_UPC_NOTIFY,         D_UPC },
+  { "upc_wait",        RID_UPC_WAIT,           D_UPC },
+

--- gcc/c/c-parser.c    (.../trunk)     (revision 228959)
+++ gcc/c/c-parser.c    (.../branches/gupc)     (revision 229159)
[...]
+/* These UPC parser functions are only ever called when
+   compiling UPC.  */
+static void c_parser_upc_forall_statement (c_parser *);
+static void c_parser_upc_sync_statement (c_parser *, int);
+static void c_parser_upc_shared_qual (source_location,
+                                      c_parser *,
+                                     struct c_declspecs *);
+
[...]
+        /* UPC qualifiers */
+       case RID_SHARED:
+         attrs_ok = true;
+         c_parser_upc_shared_qual (loc, parser, specs);
+         break;
+       case RID_STRICT:
+       case RID_RELAXED:
+         attrs_ok = true;
+         declspecs_add_qual (loc, specs, c_parser_peek_token (parser)->value);
+         c_parser_consume_token (parser);
+         break;
[...]
+  /* Process all #pragma's just after the opening brace.  This
+     handles #pragma upc, which can only appear just after
+     the opening brace, when it appears within a function body.  */
+  push_upc_consistency_mode ();
+  permit_pragma_upc ();
+  while (c_parser_next_token_is (parser, CPP_PRAGMA))
+    {
+      location_t loc ATTRIBUTE_UNUSED = c_parser_peek_token (parser)->location;
+      if (c_parser_pragma (parser, pragma_compound))
+        last_label = false, last_stmt = true;
+      parser->error = false;
+    }
+  deny_pragma_upc ();
[...]
+       case RID_UPC_FORALL:
+          gcc_assert (flag_upc);
+         c_parser_upc_forall_statement (parser);
+         break;
+        case RID_UPC_NOTIFY:
+          gcc_assert (flag_upc);
+         c_parser_upc_sync_statement (parser, UPC_SYNC_NOTIFY_OP);
+         goto expect_semicolon;
+        case RID_UPC_WAIT:
+          gcc_assert (flag_upc);
+         c_parser_upc_sync_statement (parser, UPC_SYNC_WAIT_OP);
+         goto expect_semicolon;
+        case RID_UPC_BARRIER:
+          gcc_assert (flag_upc);
+         c_parser_upc_sync_statement (parser, UPC_SYNC_BARRIER_OP);
+         goto expect_semicolon;
[...]
        case RID_SIZEOF:
          return c_parser_sizeof_expression (parser);
+       case RID_UPC_BLOCKSIZEOF:
+       case RID_UPC_ELEMSIZEOF:
+       case RID_UPC_LOCALSIZEOF:
+          gcc_assert (flag_upc);
+         return c_parser_sizeof_expression (parser);
[...]

--- gcc/c-family/c-pragma.c     (.../trunk)     (revision 228959)
+++ gcc/c-family/c-pragma.c     (.../branches/gupc)     (revision 229159)
[...]
+/*
+ *  #pragma upc strict
+ *  #pragma upc relaxed
+ *  #pragma upc upc_code
+ *  #pragma upc c_code
+ */
+static void
+handle_pragma_upc (cpp_reader * ARG_UNUSED (dummy))
+{
[...]

c-decl.c handles the additional UPC qualifiers and declspecs.
The layout qualifier is handled here:

--- gcc/c/c-decl.c      (.../trunk)     (revision 228959)
+++ gcc/c/c-decl.c      (.../branches/gupc)     (revision 229159)
[...]
+  /* A UPC layout qualifier is encoded as an ARRAY_REF,
+     further, it implies the presence of the 'shared' keyword. */
+  if (TREE_CODE (qual) == ARRAY_REF)
+    {
+      if (specs->upc_layout_qualifier)
+        {
+          error ("two or more layout qualifiers specified");
+          return specs;
+        }
+      else
+        {
+          specs->upc_layout_qualifier = qual;
+          qual = ridpointers[RID_SHARED];
+        }
+    }

In UPC, a qualifier includes both the traditional
"C" qualifier flags and the UPC "layout qualifier".
Thus, the pointer_quals field of a declarator node
is replaced by a struct that holds both the qualifier
flags and the UPC layout qualifier, as shown below.

            /* Process type qualifiers (such as const or volatile)
               that were given inside the `*'.  */
-           type_quals = declarator->u.pointer_quals;
+           type_quals = declarator->u.pointer.quals;
+           upc_layout_qualifier = declarator->u.pointer.upc_layout_qual;
+           sharedp = ((type_quals & TYPE_QUAL_SHARED) != 0);

UPC shared variables are allocated at runtime in the global memory
that is allocated and managed by the UPC runtime.  A separate link
section is used as a method of assigning virtual addresses to UPC
shared variables.  The UPC shared variable section is designated as a
"no load" section on systems that support that facility; in that case,
the linkage section begins at virtual address zero.  The logic below
assigns UPC shared variables to their own linkage section.

+    /* Shared variables are given their own link section on
+       most target platforms, and if compiling in pthreads mode
+       regular local file scope variables are made thread local. */
+    if ((TREE_CODE(decl) == VAR_DECL)
+        && !threadp && (TREE_SHARED (decl) || flag_upc_pthreads))
+      upc_set_decl_section (decl);
+

Patches
-------

The patches are organized into the following categories
and will be sent out as separate email messages.

[UPC 01/22] front-end changes
[UPC 02/22] tree-related changes
[UPC 03/22] options processing, driver
[UPC 04/22] Make, Config changes
[UPC 05/22] language hooks changes
[UPC 06/22] target hooks
[UPC 07/22] lowering, pointer-to-shared ops
[UPC 08/22] target - Darwin
[UPC 09/22] target - x86
[UPC 10/22] target - rs6000
[UPC 11/22] documentation
[UPC 12/22] DWARF support
[UPC 13/22] C++ changes
[UPC 14/22] constant folding changes
[UPC 15/22] RTL changes
[UPC 16/22] gimple/gimplify changes
[UPC 17/22] misc/common changes
[UPC 18/22] libatomic changes
[UPC 19/22] libgupc - Make, Configure
[UPC 20/22] libgupc runtime library
[UPC 21/22] gcc.dg test suite
[UPC 22/22] libgupc test suite

thanks,
- Gary


