• Favorite algorithm(s)


    String Matching: Levenshtein distance

    • Purpose: to find the minimum number of single-character edits needed to convert one string into the other
    • Intuition behind the method: each edit is a replacement, addition or deletion of a single character in a string
    • Steps

    Step  Description

    1     Set n to be the length of s.
          Set m to be the length of t.
          If n = 0, return m and exit.
          If m = 0, return n and exit.
          Construct a matrix d containing 0..m rows and 0..n columns, with s across the top and t down the side, so that d[i,j] is the cell in column i, row j.

    2     Initialize the first row to 0..n.
          Initialize the first column to 0..m.

    3     Examine each character of s (i from 1 to n).

    4     Examine each character of t (j from 1 to m).

    5     If s[i] equals t[j], the cost is 0.
          If s[i] doesn't equal t[j], the cost is 1.

    6     Set cell d[i,j] of the matrix equal to the minimum of:
          a. The cell immediately to the left plus 1: d[i-1,j] + 1.
          b. The cell immediately above plus 1: d[i,j-1] + 1.
          c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

    7     After the iteration steps (3, 4, 5, 6) are complete, the distance is found in cell d[n,m].
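The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `levenshtein` is an assumption, the strings are 0-indexed (so s[i] in the steps becomes `s[i - 1]`), and the matrix is stored with s along the first index, which is transposed relative to the grids in the example but gives the same distance.

```python
def levenshtein(s: str, t: str) -> int:
    """Edit distance between s and t, following steps 1 to 7."""
    n, m = len(s), len(t)

    # Step 1: trivial cases where one string is empty.
    if n == 0:
        return m
    if m == 0:
        return n

    # Steps 1-2: d[i][j] holds the distance between s[:i] and t[:j].
    # Turning a prefix into the empty string costs its length.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j

    # Steps 3-6: fill every remaining cell from its three neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion from s
                d[i][j - 1] + 1,         # insertion into s
                d[i - 1][j - 1] + cost,  # match or substitution
            )

    # Step 7: the answer is in the last cell.
    return d[n][m]
```

Calling `levenshtein("GUMBO", "GAMBOL")` returns 2. The sketch keeps the whole matrix, so it uses O(n·m) time and memory.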

    • Example

    This section shows how the Levenshtein distance is computed when the source string is "GUMBO" and the target string is "GAMBOL".

    Steps 1 and 2

        G U M B O
      0 1 2 3 4 5
    G 1          
    A 2          
    M 3          
    B 4          
    O 5          
    L 6          

    Steps 3 to 6, when i = 1

        G U M B O
      0 1 2 3 4 5
    G 1 0        
    A 2 1        
    M 3 2        
    B 4 3        
    O 5 4        
    L 6 5        

    Steps 3 to 6, when i = 2

        G U M B O
      0 1 2 3 4 5
    G 1 0 1      
    A 2 1 1      
    M 3 2 2      
    B 4 3 3      
    O 5 4 4      
    L 6 5 5      

    Steps 3 to 6, when i = 3

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2    
    A 2 1 1 2    
    M 3 2 2 1    
    B 4 3 3 2    
    O 5 4 4 3    
    L 6 5 5 4    

    Steps 3 to 6, when i = 4

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2 3  
    A 2 1 1 2 3  
    M 3 2 2 1 2  
    B 4 3 3 2 1  
    O 5 4 4 3 2  
    L 6 5 5 4 3  

    Steps 3 to 6, when i = 5

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2 3 4
    A 2 1 1 2 3 4
    M 3 2 2 1 2 3
    B 4 3 3 2 1 2
    O 5 4 4 3 2 1
    L 6 5 5 4 3 2
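
The completed grid can be reproduced with a short Python sketch. The helper names `levenshtein_grid` and `format_grid` are hypothetical, and the formatter assumes every cell is a single digit, which holds for this example but not for long strings.

```python
def levenshtein_grid(s: str, t: str) -> list[list[int]]:
    """Return the full matrix, one row per character of t, one column per character of s."""
    n, m = len(s), len(t)
    # grid[j][i] is the distance between s[:i] and t[:j].
    grid = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(n + 1):
        grid[0][i] = i  # first row: 0..n
    for j in range(m + 1):
        grid[j][0] = j  # first column: 0..m
    for i in range(1, n + 1):          # one column per character of s
        for j in range(1, m + 1):      # one row per character of t
            cost = 0 if s[i - 1] == t[j - 1] else 1
            grid[j][i] = min(
                grid[j][i - 1] + 1,         # cell to the left
                grid[j - 1][i] + 1,         # cell above
                grid[j - 1][i - 1] + cost,  # diagonal cell
            )
    return grid


def format_grid(s: str, t: str, grid: list[list[int]]) -> str:
    """Lay the matrix out like the grids in the example (single-digit cells only)."""
    lines = ["    " + " ".join(s)]
    lines.append("  " + " ".join(str(v) for v in grid[0]))
    for j, ch in enumerate(t, start=1):
        lines.append(ch + " " + " ".join(str(v) for v in grid[j]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_grid("GUMBO", "GAMBOL", levenshtein_grid("GUMBO", "GAMBOL")))
```

The outer loop runs over the characters of s, so the grid is filled one column at a time, in the same order as the intermediate grids for i = 1 to 5.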

    Step 7

    The distance is in the lower right-hand corner of the matrix, i.e. 2. This matches the intuition that "GUMBO" can be transformed into "GAMBOL" by substituting "A" for "U" and appending "L" (one substitution and one insertion = 2 changes).
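
The edits themselves can be recovered by walking back from the last cell to the first. The sketch below is one way to do that, under some assumptions: the function name `edit_script` and the wording of the operation strings are invented for illustration, and when several minimal edit sequences exist it returns only one of them (it prefers the diagonal move).

```python
def edit_script(s: str, t: str) -> list[str]:
    """Return one minimal sequence of edits that turns s into t."""
    n, m = len(s), len(t)
    # d[i][j] is the distance between s[:i] and t[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Walk back from d[n][m], at each cell picking a neighbour
    # that explains the current value.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if s[i - 1] == t[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                if cost:
                    ops.append(f"substitute {t[j - 1]} for {s[i - 1]}")
                i, j = i - 1, j - 1
                continue
        if i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(f"delete {s[i - 1]}")
            i -= 1
        else:
            ops.append(f"insert {t[j - 1]}")
            j -= 1
    ops.reverse()  # the walk runs from the end of the strings to the start
    return ops
```

For "GUMBO" and "GAMBOL" this yields `["substitute A for U", "insert L"]`, and the length of the returned list always equals the distance.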

     

  • Original article: https://www.cnblogs.com/postmodernist/p/5177424.html