• Edit Distance


    Introduce

    Through the following, u will know what's edit distance.

    target : you are cute  and  I         love u.

    source:   I   am cute          I don't love u.

    For the two sentences, u can use edit including insert, delete and substitute the source so that it can be same to the target.

    u will subsutitute "you" for "I", which will cost 2

    subsutitue "are" for "am", which will cost 2

    insert "and", which will cost 1

    delete "don't", which will cost 1

    so, totally, u will cost 2 + 2 + 1 + 1 = 6, which is the edit distance.

    How to calculate the minimum edit distance

    Now, we take another example

    target : s t o p

    source: s o t

    u can insert "t", which will cost 1, to get "stot", then u can substitute "p" for "t", which will cost 2, to get "stop".

    Besides, u can substitue "t" for "o" to get "stt", then u can substitute "o" for "t" to get "sto", last, u can insert "p" to get "stop"

    Alright, u can have plenty of method to edit to get the target, so how to get the minimum edit distance. We will introduce the algorithm as following, which is dynamic programming.

    i : the length of the target

    j : the length of the source

    D(0, 0) = 0

    D(i, 0) = insertCost * i, means the length of the target is i and the length of the source is 0, so u should insert i characters

    D(0, j) = deleteCost * j, means the length of the target is 0 and the length of the source is j, so u should delete j characters

    D(i, j) = min {D(i - 1, j) + insertCost(target[i]);

                        D(i - 1, j - 1) + substituteCost(source[j], target[i]);

                        D(i, j - 1) + deleteCost(source[j])} means u should choose the method which will cost least.substituteCost = {0, 2}, if source[j] is equal to target[i], it's 0, or it's 2.

    insertCost = 1

    deleteCost = 1

    Well, maybe u will feel confuse, but no problem, We will use the algorithm to solve the above question.

    firstly, u can gain the belew form.

     

    D(1, 1) = min {D[0, 1] + insert(t[1]) = 2;

                          D[0, 0] + substitute(s[1], t[1]) = 0;

            D[1, 0] + delete(s[1]) = 2}          =    0

    D[1, 2] = min{ D[0, 2] + insert(t[1]) = 3;

            D[0, 1] + substitute(s[2], t[1]) = 3;

            D[1, 1] + delete(s[2] = 1)}            =   1

    ................

    then u will get the last number

    D[4, 3] = min {D[3, 2] + insert(t[4]) = 3;

            D[3, 2] + substitute(s[3], t[4]) = 3;

            D[4, 2] + delete(s[3]) = 3}    =    3

    Actually, u can get the path how to get edit through every step. The below picture show the results.

  • 相关阅读:
    Windows系统下安装运行Kafka
    Git强制拉取覆盖本地
    Windows Server 2008/2012 计划任务配置执行bat
    Task.Run()任务执行
    Oracle数据重复,只取一条
    Oracle数据库查看表空间是否为自增的
    How to disable Microsoft Compatibility Telemetry
    最新版谷歌浏览器的Flash设置已经不能保存了?
    亲子之间,在于看懂,无关耐心zz
    读懂孩子有方法
  • 原文地址:https://www.cnblogs.com/chuanlong/p/2993774.html
Copyright © 2020-2023  润新知