Levenshtein distance

0001-01-01

Levenshtein distance is a string metric that measures the difference between two strings: it is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other.

Example:

totallylegit.com totally1egit.com

These two strings have a Levenshtein distance of 1: a single substitution, where the homoglyph “1” stands in for “l”.
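The distance in this example can be checked with a short script. What follows is a minimal sketch of the usual dynamic-programming approach, not a reference implementation; the function name `levenshtein` and its variable names are chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("totallylegit.com", "totally1egit.com"))  # 1
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string, while running time is proportional to the product of the two lengths.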

Levenshtein distance can be used for homoglyph attack detection, spell checking, plagiarism detection, and other natural language processing applications.
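As a rough illustration of the homoglyph use case, the sketch below flags a domain that is close to, but not identical to, a known one. The list `KNOWN_DOMAINS`, the helper `find_lookalikes`, and the threshold `max_distance=2` are all assumptions made for this example; a small distance is only a hint that a name may be a lookalike, not proof.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical list of domains we trust; replace with real data.
KNOWN_DOMAINS = ["totallylegit.com", "example.com"]


def find_lookalikes(candidate, known=KNOWN_DOMAINS, max_distance=2):
    """Return (domain, distance) pairs for known domains that the
    candidate nearly matches. Exact matches (distance 0) are skipped."""
    hits = []
    for domain in known:
        distance = levenshtein(candidate, domain)
        if 0 < distance <= max_distance:
            hits.append((domain, distance))
    return hits


print(find_lookalikes("totally1egit.com"))  # [('totallylegit.com', 1)]
```

The threshold is a trade-off: a larger value catches more aggressive obfuscation but also flags unrelated short names that happen to be a few edits apart.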

