This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH 08/27] New file: gcc/jit/libgccjit.h

From: David Malcolm <dmalcolm at redhat dot com>
To: Jeff Law <law at redhat dot com>
Cc: gcc-patches at gcc dot gnu dot org, jit at gcc dot gnu dot org
Date: Mon, 03 Nov 2014 20:21:52 -0500
Subject: Re: [PATCH 08/27] New file: gcc/jit/libgccjit.h
Authentication-results: sourceware.org; auth=none
References: <1414774977-25605-1-git-send-email-dmalcolm at redhat dot com> <1414774977-25605-9-git-send-email-dmalcolm at redhat dot com> <5457E406 dot 2040907 at redhat dot com>

On Mon, 2014-11-03 at 13:22 -0700, Jeff Law wrote:
> On 10/31/14 11:02, David Malcolm wrote:
> > This header is the public API for the library.
> >
> > gcc/jit/
> > 	* libgccjit.h: New.
> Given this is inside the JIT subdirectory, I'm not doing a depth review.
> 
>   +
> > +/* A gcc_jit_block encapsulates a "basic block" of statements within a
> > +   function (i.e. with one entry point and one exit point).
> > +
> > +   Every block within a function must be terminated with a conditional,
> > +   a branch, or a return.
> ? That doesn't seem right.   We don't really place restrictions on what 
> ends a block, but we do place restrictions on what kinds of IL 
> statements can appear in the middle of a block.

Adding a conditional, branch or return to a gcc_jit_block terminates
that block, so you can't add further statements to it.  Hence the API
*does* restrict those statements from being in the middle of a block.

> > +
> > +   All of the blocks in a function must be reachable via some path from
> > +   the first block.
> ?  Is this something your code requires?  While we have some code which 
> assumes unreachable blocks do not exist, we generally deal with that by 
> running the cfgcleanup passes which will identify and remove the 
> unreachables.

An earlier version of the API was much more freeform, without blocks,
instead having labels, requiring the client code to add statements in
order.  It initially seemed like a good idea, but my experience porting
the GNU Octave JIT from llvm was that as a user I needed warnings from
the API about messed-up control flow: it's just too easy to add a
statement to the wrong block, or to use the wrong block for a jump
target.  Once I added these restrictions (and validation checks), it
became dramatically easier to do the port.

The requirement above that blocks are explicitly terminated also came
from this experience: I felt that having the ability for control flow to
implicitly "fall off the end of a block" was more of a liability for
library users than a benefit; better to require them to explicitly state
where control flow should go.

Bear in mind that the intended user of this library is a language
implementer, and they typically not writing code to build a specific
function; they're writing code that walks some other IR (e.g. bytecode)
and uses it to build another function. 
This gives them three levels of implementation: 
  A) the high-level function (e.g. Octave),
  B) the IR for that function (e.g. Octave AST), and 
  C) libgccjit API's version of it
Figuring out where something is going wrong is "fun" to debug.  Hence
the more validation checks we can build in, the easier it will be for
them.   That's my thinking, anyway :)

> And this raises one of those questions that's been in the back of my 
> mind.  What's the right level of documentation and exposure of 
> internals.  When I read the docs, one of questions I kept to myself was 
> whether or not we've giving the users too much or too little 
> information.  As well as a vague concern that actually using the JIT is 
> going to be so painful due to exposure of implementation details that we 
> might want to just go in with the expectation that this is really a V0 
> implementation and that it's all going to have to change and be 
> rewritten as GCC's internals get sorted out.

Hmmm...  I guess we'll have to see.

> > +
> > +/*
> > +   Acquire a JIT-compilation context.
> > +
> > +   FIXME: error-handling?
> There's a whole class of problems with error handling. GCC has always 
> had this notation that it can terminate compilation when something "bad" 
> happens.  In a JIT world that may not be appropriate.  But that's 
> probably outside the scope of what we want to try and tackle at this stage.
Right.  FWIW I consider it a bug if a shared library ever aborts the
process it's in, which is at odds with how much of libiberty and gcc are
written, so changing that is a long-term thing.

The FIXME above relates specifically to the gcc_jit_context_acquire
entrypoint: the rest of the API handles errors by reporting them on a
context, but clearly we can't do that until we have a context.  Though
it can only fail due to memory failure, and given the amount of other
code that aborts on memory failure, it may be futile trapping that.

> > +enum gcc_jit_int_option
> > +{
> > +  /* How much to optimize the code.
> > +     Valid values are 0-3, corresponding to GCC's command-line options
> > +     -O0 through -O3.
> > +
> > +     The default value is 0 (unoptimized).  */
> > +  GCC_JIT_INT_OPTION_OPTIMIZATION_LEVEL,
> > +
> > +  GCC_JIT_NUM_INT_OPTIONS
> > +};
> I don't think we explicitly disallow optimization values > 3, they just 
> don't do anything.
> 
> 
> > +
> > +/* Options taking boolean values.
> > +   These all default to "false".  */
> > +enum gcc_jit_bool_option
> > +{
> > +  /* If true, gcc_jit_context_compile will attempt to do the right
> > +     thing so that if you attach a debugger to the process, it will
> > +     be able to inspect variables and step through your code.
> > +
> > +     Note that you can't step through code unless you set up source
> > +     location information for the code (by creating and passing in
> > +     gcc_jit_location instances).  */
> > +  GCC_JIT_BOOL_OPTION_DEBUGINFO,
> The comment makes me ask, why not always have this on and have 
> gcc_jit_context_compile try to do the right thing? :-)

Maybe this should be reworded from
  "to do the right thing" 
to
  "to do the extra work needed"
?

There's a compile-time expense to tracking and generating the debuginfo,
and typically you don't need it (you might not even have source
locations).  Though right now the profile is dominated by invoking the
driver to assembler and link the generated assembler.

> > +
> > +  /* If true, gcc_jit_context_compile will dump its initial "tree"
> > +     representation of your code to stderr (before any
> > +     optimizations).  */
> > +  GCC_JIT_BOOL_OPTION_DUMP_INITIAL_TREE,
> Is stderr really the best place for the debugging dumps?

I guess we could have the user supply a FILE *, or a filename?

> > +
> > +/* Populating the fields of a formerly-opaque struct type.
> > +   This can only be called once on a given struct type.  */
> > +extern void
> > +gcc_jit_struct_set_fields (gcc_jit_struct *struct_type,
> > +			   gcc_jit_location *loc,
> > +			   int num_fields,
> > +			   gcc_jit_field **fields);
> What happens if you call it more than once?

Then an error will be emitted on the context, and attempts to compile
will fail.

> Is the once only property 
> something we ought to be enforcing, or is that really an internal issue?

I've attempted to do a lot of error-checking at the API boundary, since
otherwise you typically get an ICE deep inside the compiler code, for
the JIT there isn't a driver to provide an error message, it's a crash
in the user's process.

The jit looks like a frontend to the rest of the compiler code, and that
analogy is useful here: part of the job of a gcc frontend is to ensure a
basic level of sanity for the IR that it's supplying to the rest of the
compiler.

If you can change the fields of a struct, then you could have a function
in which some statements are accessing struct foo with one set of a
fields, and later statements access it with another set of fields.  This
is likely to ICE the compiler.

> > +
> > +enum gcc_jit_function_kind
> [ ... ]
> So do we have any use for alias, thunks, etc here?

Not yet supported.

> And WRT types (and perhaps other stuff), there's a pretty direct mapping 
> between the rest of GCC and the JIT stuff.  I worry a bit that someone 
> changing the core of GCC may not know they need to make analogous 
> changes to the JIT bits.  It's a general concern, no need to do anything 
> about it right now.  If it breaks you get to fix it ;-)

(nods)

> [ ... ]
> 
> In your code to build up the contents of blocks, would it make sense to 
> "finalize" the block or somesuch concept after adding a  statement which 
> terminates the block to ensure someone doesn't append statements after 
> the block terminitating statement?

Attempts to add statements to a block after a terminating statement on
it lead of an error being emitted on the context; see
  jit.dg/test-error-adding-to-terminated-block.c
for a testcase of this.

> Patch is OK -- the issues noted above are more things I think are worth 
> discussing and possibly making changes in the future.  Nothing above is 
> significant enough today to warrant making changes in the codebase IMHO.

Thanks
Dave

References:
- Re: [PATCH 08/27] New file: gcc/jit/libgccjit.h
  - From: Jeff Law

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]