[LLVMdev] Re: gcc like attributes and annotations

Sat Feb 25 17:01:00 GMT 2006

This is an interesting thread.

I think this would also help with compiling scripting languages such
as JavaScript, Python, etc. We could keep the high-level metadata and
runtime binding info as language-specific bytecode in the file and
put only the parts that are easy to represent in compilable form in the
main object sections. There is no intrinsic reason for all the runtime
type information to get compiled into the core object module. I could
also bypass code that's difficult to compile and just stuff its
bytecode into this section. So I think this really helps with partial
compilation and with supporting languages that have complex runtimes.
The LLVM bytecode section would just get a stub runtime upcall for code
that is not compiled.
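
To make the stub idea concrete, here is a minimal sketch of how a call
could fall back to the language runtime when a method was left as
bytecode. Every name in it (MethodTable, MethodEntry, interpretUpcall)
is made up for illustration and none of it is an LLVM API; the
"interpreter" is a toy that just sums the bytecode bytes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not an LLVM interface.
using Bytecode = std::vector<unsigned char>;
using NativeFn = int (*)(int);

// Stand-in for the runtime's interpreter entry point (the "upcall").
// As a toy, it adds every bytecode byte to the argument.
int interpretUpcall(const Bytecode &code, int arg) {
  int result = arg;
  for (unsigned char byte : code)
    result += byte;
  return result;
}

struct MethodEntry {
  NativeFn native = nullptr; // set when the method was compiled
  Bytecode bytecode;         // kept for methods the compiler skipped
};

class MethodTable {
  std::map<std::string, MethodEntry> entries;

public:
  void addCompiled(const std::string &name, NativeFn fn) {
    entries[name].native = fn;
  }

  void addBytecode(const std::string &name, Bytecode code) {
    entries[name].bytecode = std::move(code);
  }

  // Compiled methods are called directly; everything else goes
  // through the stub path into the interpreter.
  int call(const std::string &name, int arg) const {
    auto it = entries.find(name);
    assert(it != entries.end() && "unknown method");
    const MethodEntry &entry = it->second;
    if (entry.native)
      return entry.native(arg);
    return interpretUpcall(entry.bytecode, arg);
  }
};

// Example of a method that was easy to compile.
int twice(int x) { return 2 * x; }
```

The caller never needs to know which path was taken, which is what would
let the hard-to-compile code stay in the language-specific section.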

For Java, for example, this would probably be the compiled parts with
stubs, plus a regular class file for the runtime data, with the compiled
functions converted to native.

In the short term I think I'll simply use the class file format in my
native compiled classes and wait and see how this turns out. I've been
stuck thinking about this for two months.

Thanks for the ideas.

Mike

On 2/25/06, Jakob Praher <jp@hapra.at> wrote:
> Hi Reid,
>
> Reid Spencer wrote:
> > I have some thoughts on this too ..
> >
> Great!
>
> > On Fri, 2006-02-24 at 19:56 +0100, Jakob Praher wrote:
> >
> >>I get you 100% here. But as you say later in the mail, much of this
> >>information is kept in some runtime std::map<Value*,foo> stuff, which is
> >>really handy at runtime, but I *had* serialization in mind when I was
> >>thinking about Annotations. I see annotations as a way to serialize some
> >>extra information with the bytecode without having to extend/change the
> >>core classes. The best way to implement it at runtime is to use some kind
> >>of std::map subscripting, plus the additional benefit that you can
> >>serialize it to the bytecode. Perhaps the best of both worlds.
> >>
> ...
> >
> > As Chris mentioned, I would prefer that we keep annotations out of the
> > core IR altogether as they are fraught with problems that are not easy
> > to resolve. However, I understand where you're coming from in wanting to
> > keep additional information with the bytecode. I have wanted the same
> > thing for use by front end or specialized tools. For example an IDE that
> > could keep track of source information or a language that needs special
> > passes that can only be done at link time.
> >
> Yes.
>
> > In thinking about the "right" way to do this, I came up with the idea of
> > a single "blob" of data that could be appended to a Module. This single
> > "annotation" would always be ignored by LLVM, would not require
> > significant additional space to construct, and there is already a
> > mechanism for constructing the information via the bytecode reader's
> > handler interface (might need some extension).
> >
> As far as locality is concerned, perhaps it would make sense to make
> such a blob on every primary object (module,function), so that
> annotations that only apply to a certain function can be stored directly
> in the function. That would make certain collisions easier to resolve.
>
> > This is simply a way of making that std::map of information embeddable
> > in the bytecode. It means the information is stored in one additional
> > bytecode block (at the end) where it doesn't have any impact on LLVM
> > (JIT/storage/etc).  The only question is: how do multiple tools avoid
> > collision in this approach. Some kind of registry or partitioning of the
> > data could likely solve that.
> >
> Yes, that sounds like a doable approach. But I would not write arbitrary
> binary data into the blob; I would use an LLVM type encoding/table
> approach instead. Many annotations are simple types or composites of
> simple types, and people should be encouraged to store data in a way that
> makes it possible to read it without library code. If you just serialize
> C++ structs, you end up relying heavily on the code that wrote them, which
> makes it harder for tools to introspect annotations. Java's annotations
> rely on simple types for the same reason, and I think it is the right way
> for most things. There could be an opaque type for more complex
> information, which should be discouraged.
>
> This would also make it possible to have a triple of
> (Value, AnnotationType, Name) to match the Annotation, which helps to
> solve the collision problem too.
>
> The lookup mechanism could look up by any element of the triple:
> - Target Value
> - AnnotationType
> - Name
>
> NULL values are wildcards.
>
> So you could say:
>
> Give me all annotations for a Value*
>
> /// Function-local annotations
> Value* v = ...
> vector< const Annotation *> &ans =
>     curFunction->lookupAnnotation( v, NULL, NULL );
>
> Or based on a specific type:
>
> /// Module-wide annotations
> AnnotationType *type = ...
> vector< const Annotation *> &ans = module->lookupAnnotation( v, type, NULL );
>
> This is just a random thought, though.
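
The wildcard lookup described above can be sketched roughly as follows.
This is only an illustration under my own assumptions: Value and
AnnotationType are empty stand-ins rather than the real LLVM classes,
AnnotationTable is a hypothetical container, and a linear scan replaces
whatever index a real implementation would use. The result is returned
by value instead of by reference to keep ownership simple.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Value {};          // stand-in for llvm::Value
struct AnnotationType {}; // hypothetical annotation type descriptor

struct Annotation {
  const Value *target;
  const AnnotationType *type;
  std::string name;
};

// Hypothetical table that a Function or Module could own.
class AnnotationTable {
  std::vector<Annotation> annotations;

public:
  void add(const Value *target, const AnnotationType *type,
           const std::string &name) {
    assert(target && type && "stored annotations need a target and a type");
    annotations.push_back(Annotation{target, type, name});
  }

  // A null argument is a wildcard and matches every annotation.
  // The returned pointers stay valid only until the next add().
  std::vector<const Annotation *>
  lookupAnnotation(const Value *target, const AnnotationType *type,
                   const std::string *name) const {
    std::vector<const Annotation *> result;
    for (const Annotation &a : annotations) {
      if (target && a.target != target)
        continue;
      if (type && a.type != type)
        continue;
      if (name && a.name != *name)
        continue;
      result.push_back(&a);
    }
    return result;
  }
};
```

Because the type and the name are part of the key, two tools that
annotate the same Value do not collide as long as they use different
types or names.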
>
> >
> >>>As a historical curiosity, Function still needs to be annotatable due to
> >>>the LLVM code generator relying on it.  This will be fixed in LLVM 1.8
> >>>and Function will not be annotatable anymore.
> >>>
> >>>If you *really* just want per-pass local data, you should just use an
> >>>std::map from the Value* to your data.
> >>
> >>Why not see Annotations as the means to serialize these maps? Maybe we
> >>could add an Annotations table that maps Value types to ConstantPool
> >>entries or something like that. This would make it easier for LLVM
> >>libraries in other languages too.
> >
> >
> > This is similar to my idea above, but I wouldn't want to restrict it to
> > any particular data structure. The application can construct the data
> > however it wishes and simply pass a pointer to a block of memory to the
> > bytecode writer.
> >
>
> Great that we have a similar view. I would use a public, simple type
> encoding for the annotations, so that annotations are introspectable
> without knowing much about the details of the annotation data. This helps
> to keep the bytecode free from language-specific data encoding too.
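
As one possible reading of a "public simple type encoding", here is a
small sketch of a tagged byte format that a generic tool can decode
without the code that wrote it. The tag values, the little-endian
layout, and all function names are assumptions made for this example;
they are not an LLVM bytecode format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical public tags: each field is a tag byte plus a payload.
enum Tag : std::uint8_t { TagInt32 = 1, TagString = 2 };
using Blob = std::vector<std::uint8_t>;

// 32-bit values are stored little-endian.
void writeU32(Blob &out, std::uint32_t v) {
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<std::uint8_t>(v >> shift));
}

void writeInt32(Blob &out, std::int32_t v) {
  out.push_back(TagInt32);
  writeU32(out, static_cast<std::uint32_t>(v));
}

// Strings are stored as a 32-bit length followed by the raw bytes.
void writeString(Blob &out, const std::string &s) {
  out.push_back(TagString);
  writeU32(out, static_cast<std::uint32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

std::uint32_t readU32(const Blob &in, std::size_t &pos) {
  assert(pos + 4 <= in.size() && "truncated blob");
  std::uint32_t v = 0;
  for (int shift = 0; shift < 32; shift += 8)
    v |= static_cast<std::uint32_t>(in[pos++]) << shift;
  return v;
}

// Generic reader: it knows only the tags, not what the fields mean.
std::string describe(const Blob &in) {
  std::string text;
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint8_t tag = in[pos++];
    if (tag == TagInt32) {
      std::int32_t v = static_cast<std::int32_t>(readU32(in, pos));
      text += "int32:" + std::to_string(v) + ";";
    } else if (tag == TagString) {
      std::uint32_t len = readU32(in, pos);
      assert(pos + len <= in.size() && "truncated string");
      text += "string:" + std::string(in.begin() + pos,
                                      in.begin() + pos + len) + ";";
      pos += len;
    } else {
      assert(false && "unknown tag");
    }
  }
  return text;
}
```

A serialized C++ struct would need the writer's headers to be read back;
with tags like these, a tool in any language can at least list the
fields.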
>
> -- Jakob
>
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev@cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev