[PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

Richard Biener richard.guenther@gmail.com
Thu Sep 17 08:46:00 GMT 2015


On Wed, Sep 16, 2015 at 5:45 PM, Manuel López-Ibáñez
<lopezibanez@gmail.com> wrote:
> On 16 September 2015 at 15:33, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Sep 16, 2015 at 3:22 PM, Michael Matz <matz@suse.de> wrote:
>>>> if we suggest 'foo' instead of foz then we'll get a more confusing followup
>>>> error if we actually use it.
>>>
>>> This particular case could be solved by ruling out candidates of the wrong
>>> kind (here, something that can be assigned to, vs. a function).  But it
>>> might actually be too early in parsing to say that there will be an
>>> assignment.  I don't think _this_ problem should block the patch.
>
> Indeed. The patch by David does not try to fix-up the code, it merely
> suggests a possible candidate. The follow-up errors should be the same
> before and after. Such suggestions will never be 100% right, even if
> the suggestion makes the code compile and run, it may still be the
> wrong one. A wrong suggestion is far less serious than a wrong
> uninitialized or Warray-bounds warning and we can live with those. Why
> does this need to be perfect from the very beginning?
>
> BTW, there is a PR for this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52277
>
>> I wonder if we can tentatively parse with the choice at hand, only allowing
>> (and even suggesting?) it if that works out.
>
> This would require queuing the error, fixing up the wrong name and
> continuing to parse. If there is another error, ignore that one and emit
> the original error without suggestion. The problem here is that we do
> not know if the additional error is actually caused by the fix-up we
> did or it is an already existing error. It would be equally terrible
> to emit errors caused by the fix-up or emit just a single error for
> the typo. We would need to roll-back the tentative parse and do a
> definitive parse anyway. This does not seem possible at the moment
> because the parsers maintain a lot of global state that is not easy to
> roll-back. We cannot simply create a copy of the parser state and
> throw it away later to continue as if the tentative parse has not
> happened.
>
> I'm not even sure if, in general, one can stop at the statement level
> or we would need to parse the whole function (or translation unit) to
> be able to tell if the suggestion is a valid candidate.

I was suggesting to only tentatively finish parsing the "current construct".
No idea how best to figure that out to the extent needed to make the
tentative parse useful.  Say we have "a + s.foz" and the field foz is not
there but foo is.  If we continue parsing with 'foo' instead, but 'foo' has
a type that makes "a + s.foo" invalid, then we probably shouldn't suggest
it.  It _might_ be reasonably "easy" to implement that, but I'm not sure.
There might be a field named fz (with the same or a bigger Levenshtein
distance) with the correct type.  Of course it might be that I misspelled
's' and meant 'r' instead, which has a field foz of the correct type... (and
's' is available as well).
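
To make the "a + s.foz" case concrete, here is a rough standalone sketch of
what I have in mind.  It is not GCC code: struct field, enum field_kind and
suggest_field are hypothetical stand-ins for the frontend's real data
structures, and "kind" is a crude placeholder for "has a type that keeps the
enclosing expression valid".  The distance function is the textbook two-row
Levenshtein computation, with an arbitrary 64-character limit that I assume
only to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a struct field; not a GCC type.  */
enum field_kind { KIND_ARITH, KIND_AGGREGATE };

struct field
{
  const char *name;
  enum field_kind kind;
};

/* Textbook two-row Levenshtein distance between A and B.
   Returns (size_t) -1 if B is too long for the fixed buffers.  */
static size_t
levenshtein (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t prev[64], cur[64];

  if (lb >= 64)
    return (size_t) -1;

  /* Distance from the empty prefix of A to each prefix of B.  */
  for (size_t j = 0; j <= lb; j++)
    prev[j] = j;

  for (size_t i = 1; i <= la; i++)
    {
      cur[0] = i;
      for (size_t j = 1; j <= lb; j++)
        {
          size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
          size_t best = prev[j - 1] + cost;   /* substitution */
          if (prev[j] + 1 < best)             /* deletion */
            best = prev[j] + 1;
          if (cur[j - 1] + 1 < best)          /* insertion */
            best = cur[j - 1] + 1;
          cur[j] = best;
        }
      memcpy (prev, cur, (lb + 1) * sizeof *cur);
    }
  return prev[lb];
}

/* Return the field of kind WANTED whose name is closest to TYPO,
   or NULL if no field of that kind exists.  Fields of the wrong
   kind are never suggested, however close their names are.  */
static const struct field *
suggest_field (const struct field *fields, size_t n,
               const char *typo, enum field_kind wanted)
{
  const struct field *best = NULL;
  size_t best_dist = (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      if (fields[i].kind != wanted)
        continue;
      size_t d = levenshtein (typo, fields[i].name);
      if (d < best_dist)
        {
          best_dist = d;
          best = &fields[i];
        }
    }
  return best;
}
```

With fields { "foo", KIND_AGGREGATE } and { "fz", KIND_ARITH }, asking for a
KIND_ARITH replacement for "foz" would pick "fz" even though "foo" is just as
close by distance.  It does nothing for the case where 's' itself was the
typo.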

I agree that we don't have to solve all this in the first iteration.

Richard.

> Cheers,
>
> Manuel.


