30 Jun 2013

Here’s a quick implementation of the Levenshtein distance - a measure of the similiarity of two strings - in Emacs Lisp.

(defun head (string)
  (substring string 0 1))

(defun tail (string)
  (substring string 1))

(defun levenshtein
  (string1 string2)
  (cond ((= 0 (length string1)) (length string2))
        ((= 0 (length string2)) (length string1))
        (t (min (+ (levenshtein (tail string1) string2) 1)
                (+ (levenshtein string1 (tail string2)) 1)
                (+ (levenshtein (tail string1) (tail string2))
                   (if (string-equal (head string1)
                                     (head string2))
                       0
                       1))))))

(list (= 6 (levenshtein "" "Sunday"))
      (= 8 (levenshtein "Saturday" ""))
      (= 3 (levenshtein "Saturday" "Sunday"))
      (= 3 (levenshtein "sitting" "kitten"))
      (= 2 (levenshtein "sitting" "kittin")))

Note: I wrote this for interest’s sake, because I couldn’t sleep. If you need a real implementation, you’d probably be better off with the one from EmacsWiki.

comments powered by Disqus