This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 1/2] Implement Levenshtein distance
- From: Mikael Morin <mikael dot morin at sfr dot fr>
- To: David Malcolm <dmalcolm at redhat dot com>, Jeff Law <law at redhat dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, Manuel López-Ibáñez <lopezibanez at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 2 Nov 2015 11:56:35 +0100
- Subject: Re: [PATCH 1/2] Implement Levenshtein distance
- Authentication-results: sourceware.org; auth=none
- Authentication-results: sfrmc.priv.atos.fr; dkim=none (no signature); dkim-adsp=none (no policy) header dot from=mikael dot morin at sfr dot fr
- References: <55FB150C dot 5090105 at redhat dot com> <1446209267-49800-1-git-send-email-dmalcolm at redhat dot com> <1446209267-49800-2-git-send-email-dmalcolm at redhat dot com>
Hello,
Le 30/10/2015 13:47, David Malcolm a Ãcrit :
This patch adds an implementation of Levenshtein distance to gcc,
along with unit testing of the algorithm.
I noticed two nits while looking at it.
diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c
new file mode 100644
index 0000000..532df58
--- /dev/null
+++ b/gcc/spellcheck.c
@@ -0,0 +1,136 @@
+/* Find near-matches for strings and identifiers.
+ Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "spellcheck.h"
+
+/* The Levenshtein distance is an "edit-distance": the minimal
+ number of one-character insertions, removals or substitutions
+ that are needed to change one string into another.
+
+ This implementation uses the Wagner-Fischer algorithm. */
+
You forgot to explain that m and n are the lengths of s and t
respectively. You may want to just use a more descriptive name for them.
+static edit_distance_t
+levenshtein_distance (const char *s, int m,
+ const char *t, int n)
+{
+ const bool debug = false;
+
+ if (debug)
+ {
+ printf ("s: \"%s\" (m=%i)\n", s, m);
+ printf ("t: \"%s\" (n=%i)\n", t, n);
+ }
+
+ if (m == 0)
+ return n;
+ if (n == 0)
+ return m;
+
+ /* We effectively build a matrix where each (i, j) contains the
+ Levenshtein distance between the prefix strings s[0:i]
+ and t[0:j].
The code seems to use s[0:j] and t[0:i] instead, doesn't it?
Thanks
Mikael