[Bug other/69968] New: RFC: Use Damerau-Levenshtein within spellcheck.c, rather than Levenshtein
dmalcolm at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Feb 25 22:21:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69968
Bug ID: 69968
Summary: RFC: Use Damerau-Levenshtein within spellcheck.c,
rather than Levenshtein
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: dmalcolm at gcc dot gnu.org
Target Milestone: ---
(quoting Steven Bosscher)
----------------------------------
$ cat t.c
void foo (void);
struct {
  int coordx, coordy, coordz;
  int coordx1, coordy1, coordz1;
} c;
void foo (void)
{
  c.coordx1 = c.coordy1* c.coordz;
  c.coorzd1 = c.coordy;
}
$ ./cc1 -quiet -Wall -Wextra t.c
t.c: In function 'foo':
t.c:11:4: error: 'struct <anonymous>' has no member named 'coorzd1'; did you mean 'coordx'?
   c.coorzd1 = c.coordy;
    ^
----------------------------------
Note that z and d are swapped. The Levenshtein metric returns "coordx"
as the best match, but it requires 2 insertions and one deletion to go
from "coorzd1" to "coordx", or to "coordz"/"coordy" -- "coordx" is just the
first of 3 with the same Levenshtein distance.
With Damerau-Levenshtein, we'd be able to recognize the (apparently
most common) mistake of having 2 characters swapped. The
Damerau-Levenshtein distance to "coordz1" is 1 and there is no field in
the struct with a smaller distance.
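
To make the difference concrete, here is a minimal sketch of the two metrics in C. It is not the code in spellcheck.c: the names edit_distance and allow_transpose are hypothetical, all operations are assumed to cost 1, and the transposition case implements the restricted "optimal string alignment" variant of Damerau-Levenshtein rather than the unrestricted one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three values.  */
static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Hypothetical helper, not GCC's API.  Returns the edit distance
   between S and T with unit cost for insertion, deletion and
   substitution.  If ALLOW_TRANSPOSE is nonzero, swapping two adjacent
   characters also costs 1 (optimal string alignment variant of
   Damerau-Levenshtein).  Returns (size_t)-1 if allocation fails.  */
size_t edit_distance(const char *s, const char *t, int allow_transpose)
{
    assert(s != NULL && t != NULL);

    size_t n = strlen(s);
    size_t m = strlen(t);
    size_t cols = m + 1;

    /* d[i * cols + j] holds the distance between the first i
       characters of S and the first j characters of T.  */
    size_t *d = malloc((n + 1) * cols * sizeof *d);
    if (d == NULL)
        return (size_t)-1;

    for (size_t i = 0; i <= n; i++)
        d[i * cols] = i;            /* delete i characters */
    for (size_t j = 0; j <= m; j++)
        d[j] = j;                   /* insert j characters */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,         /* deletion */
                               d[i * cols + (j - 1)] + 1,         /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost /* substitution */);

            /* Transposition: the last two characters of each prefix
               are the same pair in swapped order.  */
            if (allow_transpose && i > 1 && j > 1
                && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                size_t swapped = d[(i - 2) * cols + (j - 2)] + 1;
                if (swapped < best)
                    best = swapped;
            }

            d[i * cols + j] = best;
        }
    }

    size_t result = d[n * cols + m];
    free(d);
    return result;
}
```

With this sketch, edit_distance("coorzd1", "coordz1", 1) evaluates to 1, while the same call with allow_transpose set to 0 evaluates to 2, because the swapped "zd" then has to be repaired with two single-character edits. The full (n+1) x (m+1) table is kept for readability; a production version could keep only the last three rows.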