Consider this simple source: struct C { C(int i_) : i(i_) { } int i; }; struct A { A() : i(0) { } template <typename U> explicit operator U(); int i; }; void foo() { A ai; C c = (C)ai; } The conversion operator in A mangles to "_ZN1AcvT_I1CEEv", which demangles to: A::operator C<C>() (I think this would be better as "A::operator C()", but let's leave that aside for now.) That looks OK, but not if you dig deeper. The demangler is parsing this name as follows: typed name qualified name name 'A' cast template template parameter 0 template argument list name 'C' function type argument list A::operator C<C>() It's parsing the "T_" as a <template-template-param>, not a <template-param>. It should really be parsing it as: typed name template qualified name name 'A' cast template parameter 0 template argument list name 'C' function type argument list A::operator C<C>() The template argument list actually belongs with the <qualified-name>, not with the template parameter. The basic problem is an ambiguity in the grammar for mangled names. Consider the following two derivations: (1) <nested-name> -> <template-prefix> <template-args> -> <prefix> <template-unqualified-name> <template-args> -> <unqualified-name> <template-unqualified-name> <template-args> -> <source-name> <template-unqualified-name> <template-args> -> <source-name> <operator-name> <template-args> -> <source-name> cv <type> <template-args> -> <source-name> cv <template-template-param> <template-args> <template-args> (2) <nested-name> -> <template-prefix> <template-args> -> <prefix> <template-unqualified-name> <template-args> -> <unqualified-name> <template-unqualified-name> <template-args> -> <source-name> <template-unqualified-name> <template-args> -> <source-name> <operator-name> <template-args> -> <source-name> cv <type> <template-args> -> <source-name> cv <template-param> <template-args> When you get to the "T_", there's no way to know if it's a <template-param> or a <template-template-param>. The parser in cp-demangle.c, in cplus_demangle_type, looks ahead to disambiguate the two. If it sees an 'I', it assumes that it's a <template-template-param>, and it greedily consumes the <template-args>. In this particular case, it's wrong, but it gets the right answer because of a hack in d_print_cast, which takes the template arguments under the cast operator, and places them in scope before resolving the "T_" reference. It doesn't take much to break that hack, though. Let's throw in a pointer: struct C { C(int i_) : i(i_) { } int i; }; struct A { A() : i(0) { } template <typename U> explicit operator U*(); int i; }; void foo() { A ai; C* c = (C*)ai; } Now we get the mangled name "_ZN1AcvPT_I1CEEv", which the demangler fails on: typed name qualified name name 'A' cast pointer template template parameter 0 template argument list name 'C' function type argument list Failed: _ZN1AcvPT_I1CEEv This name ought to be parsed as follows: typed name template qualified name name 'A' cast pointer template parameter 0 template argument list name 'C' function type argument list A::operator C*<C>() (This incorrect parsing can also lead to some other subtle types of failures, because substitutions can be numbered in the wrong order. See the long example that started me on this investigation at the end.) Now let's add a real template template parameter: template <typename T> struct C { C(T i_) : i(i_) { } T i; }; struct A { A() : i(0) { } template <template<typename U> class V> operator V<int>(); int i; }; void foo() { A ai; C<int> c = (C<int>)ai; } The conversion operator is now "_ZN1AcvT_IiEI1CEEv", and the demangler gives us this: typed name template qualified name name 'A' cast template template parameter 0 template argument list builtin type int template argument list name 'C' function type argument list A::operator int<int><C>() The derivation is actually correct this time, because we really do have a <template-template-param>, but the hack in d_print_cast causes it to substitute the wrong type for "T_". I'm pretty sure that the ambiguity can be resolved by context. Note that if we have a <template-template-param>, the <template-args> that go with it are followed by another set of <template-args> that go with the <template-prefix> in the second step of the derivation. (This is only true if we're parsing a conversion operator that derives from <nested-name>; if it's a conversion operator that's part of an <expression>, the presence of an 'I' should always indicate that we're looking at a <template-template-param>.) I've got a patch that solves this with a little backtracking. If we're in a conversion operator that's not part of an <expression>, and we see an 'I', it checkpoints the d_info structure, does a trial parse of <template-args>, then looks for another 'I'. If there isn't one, it backtracks and returns a <template-param>, leaving the <template-args> for d_prefix to parse. When printing a conversion operator, in d_print_cast, if the operator is derived from a <template>, we need to place that template in scope so that the <template-param> can look up the correct type. We can't just add it to the stack of templates, though, since that's only applicable when printing the type of a function whose name is a template. Here's the long example that started me down this path: _ZNK7strings8internal8SplitterINS_9delimiter5AnyOfENS_9SkipEmptyEEcvT_ISt6vectorI12basic_stringIcSt11char_traitsIcESaIcEESaISD_EEvEEv The demangler fails to parse this string, because the "SD_" near the end is asking for the 15th substitution, but at that point it has only registered 14 substitutions. That happened because it greedily consumed the <template-args> that follow "cvT_", and hadn't yet registered the substitution for the conversion operator (which derives from <prefix>). Any substitutions that are contained in the <template-args> will be off by one as a result.
Created attachment 31253 [details] Proposed patch
Author: ccoutant Date: Fri Nov 22 22:25:49 2013 New Revision: 205292 URL: http://gcc.gnu.org/viewcvs?rev=205292&root=gcc&view=rev Log: Fix demangler to handle conversion operators correctly. libiberty/ PR other/59195 * cp-demangle.c (struct d_info_checkpoint): New struct. (struct d_print_info): Add current_template field. (d_operator_name): Set flag when processing a conversion operator. (cplus_demangle_type): When processing <template-args> for a conversion operator, backtrack if necessary. (d_expression_1): Renamed from d_expression. (d_expression): New wrapper around d_expression_1. (d_checkpoint): New function. (d_backtrack): New function. (d_print_init): Initialize current_template. (d_print_comp): Set current_template. (d_print_cast): Put current_template in scope for printing conversion operator name. (cplus_demangle_init_info): Initialize is_expression and is_conversion. * cp-demangle.h (struct d_info): Add is_expression and is_conversion fields. * testsuite/demangle-expected: New test cases. Modified: trunk/libiberty/ChangeLog trunk/libiberty/cp-demangle.c trunk/libiberty/cp-demangle.h trunk/libiberty/testsuite/demangle-expected
Fixed at r205292.
This caused bug 61321.
Likely bug 61233 too.
Author: trippels Date: Fri Nov 27 14:48:21 2015 New Revision: 231020 URL: https://gcc.gnu.org/viewcvs?rev=231020&root=gcc&view=rev Log: PR other/61321 - demangler crash on casts in template parameters The fix for bug 59195: [C++ demangler handles conversion operator incorrectly] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59195 unfortunately makes the demangler crash due to infinite recursion, in case of casts in template parameters. For example, with: template<int> struct A {}; template <typename Y> void function_temp(A<sizeof ((Y)(999))>) {} template void function_temp<int>(A<sizeof (int)>); The 'function_temp<int>' instantiation above mangles to: _Z13function_tempIiEv1AIXszcvT_Li999EEE The demangler parses this as: typed name template name 'function_temp' template argument list builtin type int function type builtin type void argument list template (*) name 'A' template argument list unary operator operator sizeof unary operator cast template parameter 0 (**) literal builtin type int name '999' And after the fix for 59195, due to: static void d_print_cast (struct d_print_info *dpi, int options, const struct demangle_component *dc) { ... /* For a cast operator, we need the template parameters from the enclosing template in scope for processing the type. */ if (dpi->current_template != NULL) { dpt.next = dpi->templates; dpi->templates = &dpt; dpt.template_decl = dpi->current_template; } when printing the template argument list of A (what should be "<sizeof (int)>"), the template parameter 0 (that is, "T_", the '**' above) now refers to the first parameter of the the template argument list of the 'A' template (the '*' above), exactly what we were already trying to print. This leads to infinite recursion, and stack exaustion. The template parameter 0 should actually refer to the first parameter of the 'function_temp' template. Where it reads "for the cast operator" in the comment in d_print_cast (above), it's really talking about a conversion operator, like: struct A { template <typename U> explicit operator U(); }; We don't want to inject the template parameters from the enclosing template in scope when processing a cast _expression_, only when handling a conversion operator. The problem is that DEMANGLE_COMPONENT_CAST is currently ambiguous, and means _both_ 'conversion operator' and 'cast expression'. Fix this by adding a new DEMANGLE_COMPONENT_CONVERSION component type, which does what DEMANGLE_COMPONENT_CAST does today, and making DEMANGLE_COMPONENT_CAST just simply print its component subtree. I think we could instead reuse DEMANGLE_COMPONENT_CAST and in d_print_comp_inner still do: @@ -5001,9 +5013,9 @@ d_print_comp_inner (struct d_print_info *dpi, int options, d_print_comp (dpi, options, dc->u.s_extended_operator.name); return; case DEMANGLE_COMPONENT_CAST: d_append_string (dpi, "operator "); - d_print_cast (dpi, options, dc); + d_print_conversion (dpi, options, dc); return; leaving the unary cast case below calling d_print_cast, but seems to me that spliting the component types makes it easier to reason about the code. g++'s testsuite actually generates three symbols that crash the demangler in the same way. I've added those as tests in the demangler testsuite as well. And then this fixes PR other/61233 too, which happens to be a demangler crash originally reported to GDB, at: https://sourceware.org/bugzilla/show_bug.cgi?id=16957 Bootstrapped and regtested on x86_64 Fedora 20. Also ran this through GDB's testsuite. GDB will require a small update to use DEMANGLE_COMPONENT_CONVERSION in one place it's using DEMANGLE_COMPONENT_CAST in its sources. libiberty/ 2015-11-27 Pedro Alves <palves@redhat.com> PR other/61321 PR other/61233 * demangle.h (enum demangle_component_type) <DEMANGLE_COMPONENT_CONVERSION>: New value. * cp-demangle.c (d_demangle_callback, d_make_comp): Handle DEMANGLE_COMPONENT_CONVERSION. (is_ctor_dtor_or_conversion): Handle DEMANGLE_COMPONENT_CONVERSION instead of DEMANGLE_COMPONENT_CAST. (d_operator_name): Return a DEMANGLE_COMPONENT_CONVERSION component if handling a conversion. (d_count_templates_scopes, d_print_comp_inner): Handle DEMANGLE_COMPONENT_CONVERSION. (d_print_comp_inner): Handle DEMANGLE_COMPONENT_CONVERSION instead of DEMANGLE_COMPONENT_CAST. (d_print_cast): Rename as ... (d_print_conversion): ... this. Adjust comments. (d_print_cast): Rewrite - simply print the left subcomponent. * cp-demint.c (cplus_demangle_fill_component): Handle DEMANGLE_COMPONENT_CONVERSION. * testsuite/demangle-expected: Add tests. Modified: trunk/include/demangle.h trunk/libiberty/ChangeLog trunk/libiberty/cp-demangle.c trunk/libiberty/cp-demint.c trunk/libiberty/testsuite/demangle-expected