This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC] First steps towards segregating types.


I've been trying to sort out how to proceed with the gimple_type work, and the first step always comes back to figuring out all the places types are used. This has turned out to be non-trivial and is difficult to do in an iterative way. I believe I've found a reasonable way to proceed.

Over the next few months I plan to maintain a branch (tree-type) which leaves types still implemented as trees, and introduce 2 new typedefs and a few macros:

typedef union tree_node *tree_type_ptr;  // same as tree
typedef const union tree_node *const_tree_type_ptr;  // same as const_tree

I will introduce their use throughout the compiler where types are needed. This will "tag" all the type locations and still allow me to bootstrap and run tests to ensure things are still working.

Meanwhile, I'll also maintain another patchset which can be applied to this branch and will switch those types to a completely separate type structure not connected to trees. It changes all the TYPE_ accessor macros to be incompatible with trees. This causes compilation errors everywhere a type is referenced, passed, or otherwise used. It is likely to pick up a few extra things along the way related to separating types that are not appropriate for the main branch.
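To illustrate the two stages, here is a minimal sketch using toy stand-ins rather than the real GCC definitions. The names toy_type, toy_decl, TOY_TYPE_PRECISION, toy_precision, toy_untagged_caller and the TOY_SEGREGATED_TYPES switch are all hypothetical; only tree, tree_node and tree_type_ptr come from the text above. With the switch undefined, tree_type_ptr is a plain alias and everything builds. With it defined, the alias becomes a distinct pointer type and any caller still passing a tree stops compiling, which is the point of the exercise.

```cpp
#include <cassert>

// Toy model only: these are NOT the real GCC definitions.
struct toy_type { unsigned precision; };
struct toy_decl { int uid; };

union tree_node {
  toy_type type;
  toy_decl decl;
};
typedef union tree_node *tree;

#ifndef TOY_SEGREGATED_TYPES
// Stage 1 (the branch): a pure alias. Everything still compiles, but
// each signature now records that a type is expected.
typedef union tree_node *tree_type_ptr;  // same as tree
#define TOY_TYPE_PRECISION(t) ((t)->type.precision)
#else
// Stage 2 (the add-on patchset): a distinct pointer type, so the
// accessor no longer accepts a tree.
typedef toy_type *tree_type_ptr;
#define TOY_TYPE_PRECISION(t) ((t)->precision)
#endif

// A "tagged" interface: the parameter type says a type is required.
inline unsigned toy_precision (tree_type_ptr t)
{
  assert (t != nullptr);
  return TOY_TYPE_PRECISION (t);
}

// An untagged caller. It compiles in stage 1 and is rejected by the
// compiler in stage 2, flagging a location that still needs tagging.
inline unsigned toy_untagged_caller (tree t)
{
  return toy_precision (t);
}
```

Building the same file with -DTOY_SEGREGATED_TYPES should make toy_untagged_caller fail to compile until its parameter is changed to tree_type_ptr.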

I can then go through the source files fixing the compilation issues raised by adding tree_type_ptr where appropriate and modifying whatever else is required to deal with a segregated type (there is no shortage of those!). These changes can then be applied to the main branch, and tested with a bootstrap/testrun/target-build cycle. I'll also try to keep the branch relatively current with mainline.

Once the entire compiler has been processed, the next hunk of work would involve removing the types from the tree union and a multitude of related cleanups (I'm tracking a list). The 3 type structs would be replaced with a single type node, and tree_type_ptr can be replaced with a pointer to the new type_node. const_tree_type_ptr can also be replaced with a normal const version of the same pointer. We will *not* be stuck with the const_tree paradigm. It is just needed to enable compatibility with const_tree for now :-P

There are a few issues, of course :-)

The biggest issue is what to do with fields which can be either a type or a tree, i.e. TREE_VALUE() of a TREE_LIST can be a type, as can a TREE_VEC element or a DECL_CONTEXT. I think the DECL_INITIAL field is overloaded and can sometimes be a type, and this was recently introduced to TARGET_STATIC_CHAIN. I suspect the compilation process will identify others.

Looking primarily at TREE_LIST first (which can be a mixed list of trees and types), the question is how to handle this situation in general.

I have 2 workable approaches in mind, but am open to suggestions.

1 - introduce a TYPE_REF tree node, which is effectively just a 'typed' tree node, and the TREE_TYPE() field of a TYPE_REF node would point to the type node. Any routines which utilize a TYPE node in a tree list would have to be modified to make use of this new TYPE_REF node to refer to the type.

2 - change the field (list->value in this case) to be a tagged union of { tree tree_value, tree_type_ptr type_value } and use a bit in the base to flag which kind of value it is. This would be compatible with GTY, and would require changing routines and algorithms to check the bit and use the right field.

Option 2 also introduces a change in current practice. TREE_VALUE() can be either an rvalue or an lvalue right now. This would no longer be possible and would require changing to a get_value(), set_value(), and value_ptr() model. There would be a tree variant and a type variant, along with asserts to make sure they are being used properly. These algorithmic changes can also be fully tested on the main branch. I've implemented this change, and it impacts 40 files which utilize TREE_VALUE as an lvalue. The upside of this is we at least have the illusion of more control. I think the union could possibly be wrapped in a macro or template to be generally applicable in other circumstances.
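A rough sketch of what the option 2 model might look like, again with toy stand-ins. The struct and function names here (toy_tree, toy_type, toy_value, toy_list_node, get_tree_value and friends) are made up for illustration and are not taken from the patches; a plain bool stands in for the flag bit in the base.

```cpp
#include <cassert>

// Toy model only; none of these names come from the actual patches.
struct toy_tree { int code; };
struct toy_type { unsigned precision; };

union toy_value {
  toy_tree *tree_value;
  toy_type *type_value;
};

struct toy_list_node {
  bool value_is_type;   // stands in for the flag bit in the base
  toy_value value;
  toy_list_node *chain;
};

// Tree variant: asserts that the slot currently holds a tree.
inline toy_tree *get_tree_value (const toy_list_node *n)
{
  assert (!n->value_is_type);
  return n->value.tree_value;
}

inline void set_tree_value (toy_list_node *n, toy_tree *t)
{
  n->value_is_type = false;
  n->value.tree_value = t;
}

inline toy_tree **tree_value_ptr (toy_list_node *n)
{
  assert (!n->value_is_type);
  return &n->value.tree_value;
}

// Type variant: asserts that the slot currently holds a type.
inline toy_type *get_type_value (const toy_list_node *n)
{
  assert (n->value_is_type);
  return n->value.type_value;
}

inline void set_type_value (toy_list_node *n, toy_type *t)
{
  n->value_is_type = true;
  n->value.type_value = t;
}

inline toy_type **type_value_ptr (toy_list_node *n)
{
  assert (n->value_is_type);
  return &n->value.type_value;
}
```

Code that used to assign through the accessor macro would call the matching set_ function instead, or go through the _ptr variant where it really needs the address of the slot.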

I'm not 100% sure, but I think the TYPE_REF approach could continue with the current lvalue or rvalue approach, perhaps with some tweaking... All conjecture since I haven't prototyped it. It also provides a general mechanism for referencing a type node in any tree circumstance. I have a feeling this is the easiest approach, and lends itself well to an initial implementation. At the moment I'm leaning this way.... but I'm going to think about it over the weekend. Perhaps prototyping it next week will give me a stronger feeling one way or the other.
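For comparison, a similarly hedged sketch of the TYPE_REF idea. Everything here is a toy: toy_code, toy_tree, toy_type, make_type_ref and type_of_list_value are hypothetical names, and the real node layout would differ. The list value stays an ordinary tree pointer, and a type is reached by wrapping it in a node whose type field (playing the role of TREE_TYPE()) points at the segregated type.

```cpp
#include <cassert>

// Toy model only; not the real GCC tree codes or node layout.
enum toy_code { TOY_INTEGER_CST, TOY_TREE_LIST, TOY_TYPE_REF };

struct toy_type { unsigned precision; };

struct toy_tree {
  toy_code code;
  toy_type *type;    // plays the role of TREE_TYPE()
  toy_tree *value;   // plays the role of TREE_VALUE() on a list node
  toy_tree *chain;
};

// Wrap a segregated type in a tree node so it can live in a tree slot.
inline toy_tree make_type_ref (toy_type *t)
{
  return toy_tree { TOY_TYPE_REF, t, nullptr, nullptr };
}

// Return the referenced type if the list value is a TYPE_REF, or
// nullptr if the value is absent or is some other kind of tree.
inline toy_type *type_of_list_value (const toy_tree *list)
{
  assert (list->code == TOY_TREE_LIST);
  const toy_tree *v = list->value;
  return (v != nullptr && v->code == TOY_TYPE_REF) ? v->type : nullptr;
}
```

Because the slot is still a tree pointer, existing code that assigns to it directly would keep compiling; only the places that expect to find a type there need to look through the wrapper.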

I also suspect it will be worth introducing a TYPE_VEC node which parallels the TREE_VEC, only giving us a list of types. There may be places that a TREE_LIST is comprised entirely of types, and I'd consider trying to convert those to a TYPE_VEC.
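As a sketch of how such a conversion could be checked, assuming a list representation that can tell types from trees: the helper below converts a list to a vector of types only when every element is a type. The names toy_type, toy_list_node, toy_type_vec and list_to_type_vec are hypothetical, and std::vector is used purely for brevity; it is not a claim about how a TYPE_VEC node would be stored.

```cpp
#include <cassert>
#include <vector>

// Toy model only; names are illustrative.
struct toy_type { unsigned precision; };

struct toy_list_node {
  bool value_is_type;     // true when this element holds a type
  toy_type *type_value;   // meaningful only when value_is_type is true
  toy_list_node *chain;
};

struct toy_type_vec { std::vector<toy_type *> elts; };

// Fill *out and return true only if every element of the list is a
// type. On a mixed list, return false and leave *out untouched.
inline bool list_to_type_vec (const toy_list_node *list, toy_type_vec *out)
{
  assert (out != nullptr);
  std::vector<toy_type *> tmp;
  for (const toy_list_node *n = list; n != nullptr; n = n->chain)
    {
      if (!n->value_is_type)
        return false;
      tmp.push_back (n->type_value);
    }
  out->elts = tmp;
  return true;
}
```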

I've attached 2 patches for anyone interested in taking a peek or commenting. The first one (prime) would form the basis of the branch and introduces tree_type_ptr, a few new accessor macros, and enough changes to allow tree.h to be safely compiled in both environments. The second one (offset) can be applied after the first and provides a segregated type node for finding everywhere a type is used in the compiler.

Comments on the patches or the best way to deal with dual use fields like TREE_VALUE are welcome.

Andrew

Attachment: prime.patch.gz
Description: application/gzip

Attachment: offset.patch.gz
Description: application/gzip

