[PATCH] Refactor rust-demangle to be independent of C++ demangling.

Jakub Jelinek jakub@redhat.com
Wed Oct 23 17:19:00 GMT 2019


On Wed, Oct 23, 2019 at 11:37:26AM -0500, Segher Boessenkool wrote:
> On Wed, Oct 23, 2019 at 07:22:47PM +0300, Alexander Monakov wrote:
> > On Wed, 23 Oct 2019, Eduard-Mihai Burtescu wrote:
> > > @@ -384,6 +384,14 @@ rust_demangle_callback (const char *mangled, int options,
> > >          return 0;
> > >        rdm.sym_len--;
> > >  
> > > +      /* Legacy Rust symbols also always end with a path segment
> > > +         that encodes a 16 hex digit hash, i.e. '17h[a-f0-9]{16}'.
> > > +         This early check, before any parse_ident calls, should
> > > +         quickly filter out most C++ symbols unrelated to Rust. */
> > > +      if (!(rdm.sym_len > 19
> > > +            && !strncmp (&rdm.sym[rdm.sym_len - 19], "17h", 3)))
> > 
> > This can be further optimized by using memcmp in place of strncmp, since from
> > the length check you know that you won't see the null terminator among the three
> > chars you're checking.
> > 
> > The compiler can expand memcmp(buf, "abc", 3) inline as two comparisons against
> > a 16-bit immediate and an 8-bit immediate.  It can't do the same for strncmp.
> 
> The compiler does not currently do that, but it *could*.  Or why not?  The
> compiler is always allowed to load 3 characters here, whether some string
> has a NUL character earlier or not.

It is valid to call strncmp (mmap(...)+page_size-1, "abc", 3): reading of
the string has to stop as soon as a 0 byte is seen, so the compiler can't
assume that all 3 bytes are readable.
Of course, there might be a strlen call visible, in which case the strlen
pass could figure out that rdm.sym_len contains the string length, but maybe
the call isn't visible, or there is some call in between that might in theory
invalidate that knowledge.

	Jakub


