• Finding the most similar string


    This is a level-3 problem on Codewars.

    Original problem:

    Description:

    I'm sure, you know Google's "Did you mean ...?", when you entered a search term and mistyped a word. In this kata we want to implement something similar.

    You'll get an entered term (lowercase string) and an array of known words (also lowercase strings). Your task is to find out, which word from the dictionary is most similar to the entered one. The similarity is described by the minimum number of letters you have to add, remove or replace in order to get from the entered word to one of the dictionary. The lower the number of required changes, the higher the similarity between each two words.

    Same words are obviously the most similar ones. A word that needs one letter to be changed is more similar to another word that needs 2 (or more) letters to be changed. E.g. the mistyped term berr is more similar to beer (1 letter to be replaced) than to barrel (3 letters to be changed in total).

    Extend the dictionary in a way, that it is able to return you the most similar word from the list of known words.

    Code Examples:

    Dictionary fruits = new Dictionary(new String[]{"cherry", "pineapple", "melon", "strawberry", "raspberry"});
    
    fruits.findMostSimilar("strawbery"); // must return "strawberry"
    fruits.findMostSimilar("berry"); // must return "cherry"
    
    Dictionary things = new Dictionary(new String[]{"stars", "mars", "wars", "codec", "codewars"});
    things.findMostSimilar("coddwars"); // must return "codewars"
    
    Dictionary languages = new Dictionary(new String[]{"javascript", "java", "ruby", "php", "python", "coffeescript"});
    languages.findMostSimilar("heaven"); // must return "java"
    languages.findMostSimilar("javascript"); // must return "javascript" (same words are obviously the most similar ones)

    I know, many of you would disagree that java is more similar to heaven than all the other ones, but in this kata it is ;)

    Additional notes:

    • there is always exactly one possible solution

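      Roughly, the problem is this: given an array of strings ss and a string s, find the string in ss that can be turned into s with the fewest operations.

      What counts as the fewest operations? For two strings, each insertion, deletion, or replacement of a single character adds 1 to the operation count. The pair of strings with the smallest operation count is the most similar pair.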
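      The operation count defined above is commonly known as the Levenshtein (edit) distance. As a point of reference, here is a minimal sketch of the standard dynamic-programming way to compute it. This is a swapped-in textbook technique, not the method used in the rest of this post, and the name `editDistance` is a hypothetical helper chosen for illustration.

```javascript
// Standard Levenshtein distance via dynamic programming.
// `editDistance` is a hypothetical helper name, not part of the post's code.
function editDistance(a, b) {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
  var prev = [];
  for (var j = 0; j <= b.length; j++) {
    prev[j] = j; // turning "" into the first j chars of b takes j insertions
  }
  for (var i = 1; i <= a.length; i++) {
    var curr = [i]; // turning the first i chars of a into "" takes i deletions
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(
        prev[k] + 1,        // delete a[i - 1]
        curr[k - 1] + 1,    // insert b[k - 1]
        prev[k - 1] + cost  // replace, or keep when the characters are equal
      );
    }
    prev = curr;
  }
  return prev[b.length];
}
```

      With this sketch, editDistance("berr", "beer") returns 1 and editDistance("berr", "barrel") returns 3, which matches the example in the problem statement. Each pair of strings costs time proportional to the product of their lengths.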

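      If this is unclear, read the original problem above (the English is not that hard, and with the help of Youdao it is easy enough to follow).

      I could not think of a good way to solve this, so I fell back on a brute-force approach.

      (1) Compare each string in the given array with the target string in turn, compute the operation count for each, and take the string with the smallest count. (This is the overall idea.)

      (2) How the operation count is computed, given a string a and a string b:

          Step 1. Walk through a position by position and, at each position, build up to three new strings: one with b's character at that position inserted, one with the current character deleted, and one with the current character replaced by b's character at that position (only when the two characters differ).

               Then compare every generated string with the target string and keep the most similar one. This counts as one operation.

          Step 2. If the string kept in the previous step equals the target string, stop. Otherwise, run Step 1 again on that string, and repeat until the two are equal.

          Count the number of rounds. That number is the total operation count.

      (3) What counts as the most similar string?

          Compare the characters at corresponding positions of the two strings: add 1 when they are equal and subtract 1 when they are not. A position past the end of the shorter string counts as a mismatch. The larger the final number, the more similar the strings.

          For example, 'abc' and 'abcd' score 2 (three matches, then one mismatch where 'abc' has run out). 'bbc' and 'abcd' score 0. 'abc' and 'qabc' score -4, because no position matches.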
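      To make the scoring rule in (3) concrete, here is a small standalone sketch of it, written the way the `count` method in the code further down applies it: a position past the end of the shorter string compares against an empty character and so counts as a mismatch. The name `positionalScore` is hypothetical and used only for this illustration.

```javascript
// `positionalScore` is a hypothetical standalone name for the scoring rule in (3).
function positionalScore(s1, s2) {
  var len = Math.max(s1.length, s2.length);
  var score = 0;
  for (var i = 0; i < len; i++) {
    // charAt returns "" past the end of a string, so overhanging positions never match
    score += s1.charAt(i) === s2.charAt(i) ? 1 : -1;
  }
  return score;
}

console.log(positionalScore('abc', 'abcd')); // 2
console.log(positionalScore('bbc', 'abcd')); // 0
console.log(positionalScore('abc', 'qabc')); // -4
```

      Note that the score depends on position: 'abc' and 'qabc' are only one insertion apart, yet they get the lowest score of the three pairs. The score is therefore only a heuristic for choosing the next candidate string, not a measure of the operation count itself.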

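      I am very unhappy with my approach, but I could not think of a better one. It is extremely inefficient: the time complexity is roughly O(n^4).

      This is surely the highest time complexity of any code I have written. Many people on Codewars solved the problem in better ways,

      but their solutions came without explanation, and when I looked at them I did not really understand them. I have no time right now, but when I do I will definitely redo this problem.

      The code is below. I used JS for this problem at the time, so it is JS code. That should not matter much, since anyone who writes Java can probably read JS.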

      

      

    function Dictionary(words) {
      this.words = words;
    }

    // Compare each word in the array with the target term, compute the number
    // of operations needed, and return the word with the smallest count.
    Dictionary.prototype.findMostSimilar = function(term) {
      var count = 99999;
      var result = "";
      for (var i = 0; i < this.words.length; i++) {
        var nextCount = this.getTotal(this.words[i], term);
        if (nextCount < count) {
          count = nextCount;
          result = this.words[i];
        }
      }
      return result;
    };

    // Score how close two strings are: +1 when the characters at the same
    // position are equal, -1 when they are not.
    // The larger the final value, the more similar the two strings.
    Dictionary.prototype.count = function(str1, str2) {
      var countNum = 0, result = 0;
      while (countNum < str1.length || countNum < str2.length) {
        if (str1.charAt(countNum) == str2.charAt(countNum)) {
          result++;
        } else {
          result--;
        }
        countNum++;
      }
      return result;
    };

    // For each position in str1, apply insert, delete, and replace, collect all
    // resulting strings in an array, and score each one against the target.
    // The most similar string becomes the base string for the next round.
    // This part is messy: it enumerates every single-step edit of the current
    // string, picks the candidate most similar to the target, and then the
    // caller enumerates again from there.
    Dictionary.prototype.getNext = function(str1, str2) {
      var countNum = 0, result = "", arr = [];
      var clone;
      // Visit each position in turn
      while (countNum < str1.length || countNum < str2.length) {
        var char_str1 = str1.charAt(countNum);
        var char_str2 = str2.charAt(countNum);
        var arr_str1 = str1.split("");
        // Insert str2's character at the current position
        clone = arr_str1.slice(0);
        clone.splice(countNum, 0, char_str2);
        arr.push(clone.join(""));
        // Delete the current character
        clone = arr_str1.slice(0);
        clone.splice(countNum, 1);
        arr.push(clone.join(""));
        // If the current character differs from the target's character at
        // this position, replace it with the target's character
        if (char_str1 != char_str2) {
          clone = arr_str1.slice(0);
          clone.splice(countNum, 1, char_str2);
          arr.push(clone.join(""));
        }
        countNum++;
      }
      countNum = -10000;
      // Score every generated string and keep the highest-scoring one as the
      // base string for the next round
      for (var i = 0; i < arr.length; i++) {
        var count = this.count(arr[i], str2);
        if (count > countNum) {
          countNum = count;
          result = arr[i];
        }
      }
      return result;
    };

    // Total number of operations needed to turn str1 into str2
    Dictionary.prototype.getTotal = function(str1, str2) {
      if (str1 == str2) {
        return 0;
      }
      var total = 0;
      // While str1 has not yet become str2, perform another operation;
      // total records how many operations have been performed
      while (str1 != str2) {
        str1 = this.getNext(str1, str2);
        total++;
      }
      return total;
    };
  • Original post: https://www.cnblogs.com/wsss/p/5483171.html