[PATCH/RFC 0/2] C++ FE: expression ranges (v3)

David Malcolm dmalcolm@redhat.com
Wed Nov 25 20:32:00 GMT 2015


On Sat, 2015-11-21 at 02:16 -0500, Jason Merrill wrote:
> On 11/19/2015 03:46 PM, Jason Merrill wrote:
> > On 11/15/2015 12:01 AM, David Malcolm wrote:
> >> As with the C frontend, there's an issue with tree nodes that
> >> don't have locations: VAR_DECL, INTEGER_CST, etc:
> >>
> >>    int test (int foo)
> >>    {
> >>      return foo * 100;
> >>             ^^^   ^^^
> >>    }
> >>
> >> where we'd like to access the source spelling ranges of the expressions
> >> during parsing, so that we can use them when reporting parser errors.
> >
> > Hmm, I had been thinking to address this in the C++ front end by
> > wrapping uses in another tree: NOP_EXPR for rvalues, VIEW_CONVERT_EXPR
> > for lvalues.

As seen down-thread, I started attempting something like this, with
a new tree code, but Richi said:
> I think at this stage a new tree code isn't appropriate.
which presumably rules that approach out for gcc 6.

I'm still hoping to get expression ranges for C++ into gcc 6, hence
am still experimenting with the cp_expr approach; here is v3 of the
patch - though I appreciate we're well past stage 1.

> On the other hand, my direction seems likely to cause more issues, 
> especially with code that doesn't yet know how to handle 
> VIEW_CONVERT_EXPR, and could create ambiguity with explicit conversions. 
>   So I guess your approach seems reasonable.
> 
> What is the memory consumption impact of this change?

kdecore.cc ggc usage in KB, relative to a control build of r230562
-O0: Mem max: 659141.000 -> 659141.000: no change
-O1: Mem max: 955883.000 -> 955891.000: 1.0000x larger
-O2: Mem max: 1168490.000 -> 1184881.000: 1.0140x larger
-O3: Mem max: 1269955.000 -> 1286526.000: 1.0130x larger
-Os: Mem max: 875913.000 -> 884114.000: 1.0094x larger

It doesn't appear to significantly affect compile-time for that test.

> > Also, in cp_parser_new_expression I attempted to generate meaningful
> > ranges e.g.:
> >
> >   int *foo = new int[100];
> >              ^~~~~~~~~~~~
> >
> > but it seems to be hard to do this, due to the trailing optional
> > components; I found myself wanting to ask the lexer for the last
> > token it consumed (in particular, I'm interested in the
> > location_t of that token).  Is this a reasonable thing to add to the
> > lexer?
> 
> cp_lexer_previous_token seems like what you want.

Thanks.  I've used in it the latest version of the patch.

> > -      return cp_build_unary_op (code, arg1, candidates != 0, complain);
> > +      {
> > +	tree result = cp_build_unary_op (code, arg1, candidates != 0, complain);
> > +	protected_set_expr_location (result, loc);
> 
> I'd much rather pass loc into cp_build_unary_op.  At least add a FIXME 
> to that effect.  Likewise in the other places you call *build* and then 
> set the loc.

I've attempted this.  In the following I've split the patch into two:
an initial patch which adds location_t params to various tree-building
functions in the C++ FE, and then the followup patch to use this.

However, as I note in the patch themself, I'm not sure I've done it
correctly.  For the places where there's no location_t available, it's
never clear to me if I should be using input_location or
UNKNOWN_LOCATION, and there are some changes to where diagnostics
are emitted, which shows up as regressions.  In particular it's not
clear to me how locations are meant to interact with templates (and
I believe that to be the biggest area of risk with the patch).

> > +#if 0
> > +    /* FIXME: various assertions can be put in here when debugging,
> > +       for tracking down where location information gets thrown
> > +       away (during a trip through a purely "tree" value).  */
> > +    if (m_value && m_value != error_mark_node)
> > +      {
> > +	if (TREE_CODE (m_value) == FUNCTION_DECL)
> > +	  return; // for now
> > +	gcc_assert (CAN_HAVE_LOCATION_P (m_value));
> > +	//gcc_assert (m_loc != UNKNOWN_LOCATION);
> 
> Yeah, I don't think you want to assert on UNKNOWN_LOCATION; some code 
> does not and should not have an associated location.  In particular, 
> cleanups.

I've eliminated that #if 0 code in the latest version of the patch.

 
> >  Build the assignment expression.  Its default
> > -		 location is the location of the '=' token.  */
> > +		 location:
> > +		   LHS = RHS
> > +		   ~~~~^~~~~
> > +		 is the location of the '=' token as the
> > +		 caret, ranging from the start of the lhs to the
> > +		 end of the rhs.  */
> >  	      saved_input_location = input_location;
> > +	      loc = make_location (loc,
> > +				   expr.get_start (),
> > +				   rhs.get_finish ());
> >  	      input_location = loc;
> >  	      expr = build_x_modify_expr (loc, expr,
> >  					  assignment_operator,
> >  					  rhs,
> >  					  complain_flags (decltype_p));
> > +	      protected_set_expr_location (expr, loc);
> >  	      input_location = saved_input_location;
> 
> Do we still need to mess with input_location here?  If so, please add a 
> FIXME explaining what still needs to be fixed.

I've eliminated that save/restore of input_location, using the
approach described above.

Thanks.

Dave



More information about the Gcc-patches mailing list