• 187. Repeated DNA Sequences (String; Bit)


     All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

    Return:
    ["AAAAACCCCC", "CCCCCAAAAA"].

    Approach I: Traverse the string, take a 10-character substring at each position, and count how many times it occurs.

    Result: Time Limit Exceeded
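    A minimal sketch of what Approach I might look like, assuming each window is checked by searching the rest of the string for another occurrence. The function name findRepeatedBruteForce is hypothetical and this is not the submitted code. Every window triggers another scan of the string, so the work grows roughly quadratically with the input length, which is consistent with the Time Limit Exceeded result.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper illustrating Approach I (brute force).
std::vector<std::string> findRepeatedBruteForce(const std::string& s) {
    std::vector<std::string> ret;
    if (s.size() < 10) return ret;

    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);

        // Skip windows that were already reported.
        if (std::find(ret.begin(), ret.end(), window) != ret.end()) continue;

        // Search for a later occurrence (overlapping occurrences count).
        if (s.find(window, i + 1) != std::string::npos) {
            ret.push_back(window);
        }
    }
    return ret;
}
```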

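    Approach II: The alphabet is small => represent each character as a number => represent the string as a bitmap. Benefit: it saves space.

    For example, only 4 distinct characters can occur in this problem => they can be encoded as 0, 1, 2, 3, i.e. with 2 bits each => a character that originally took 1 byte = 8 bits now needs only 2 bits, so a 10-character sequence fits in 20 bits.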
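    To make the 2-bit encoding concrete, here is a small sketch that packs a short sequence into a single integer. The helper names nucleotideCode and packSequence are hypothetical, and the input is assumed to contain only A, C, G, and T.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: 2-bit code for one nucleotide.
// Assumes the input contains only 'A', 'C', 'G', 'T'.
inline std::uint32_t nucleotideCode(char ch) {
    switch (ch) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3; // 'T'
    }
}

// Hypothetical helper: pack up to 16 nucleotides into one 32-bit integer,
// with the first character ending up in the highest-order bits used.
inline std::uint32_t packSequence(const std::string& seq) {
    std::uint32_t key = 0;
    for (char ch : seq) {
        key = (key << 2) | nucleotideCode(ch);
    }
    return key;
}
```

    With this encoding, "AAAAACCCCC" becomes the bit pattern 00 00 00 00 00 01 01 01 01 01, which is 0x155, and any 10-character sequence maps to a value below 2^20.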

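    #include <string>
    #include <vector>
    using namespace std;

    class Solution {
    public:
        // Map each nucleotide to a 2-bit code.
        int getVal(char ch) {
            if (ch == 'A') return 0;
            if (ch == 'C') return 1;
            if (ch == 'G') return 2;
            return 3; // 'T' (the input is assumed to contain only A, C, G, T)
        }

        vector<string> findRepeatedDnaSequences(string s) {
            int sLen = s.length();
            unsigned int val = 0;
            // One counter per 20-bit key (4^10 = 2^20 possible windows),
            // allocated on the heap instead of as a 1 MB stack array.
            vector<unsigned char> mp(1024 * 1024, 0);
            vector<string> ret;

            if (sLen < 10) return ret;

            // Preload the first 9 characters of the window.
            for (int i = 0; i < 9; i++) {
                val <<= 2;
                val |= getVal(s[i]);
            }

            for (int i = 9; i < sLen; i++) {
                val <<= 2;
                val |= getVal(s[i]);
                val &= 0xFFFFF; // keep the low 20 bits = the last 10 characters
                // Stop counting at 2 so the counter cannot wrap around
                // and report the same sequence again.
                if (mp[val] < 2 && ++mp[val] == 2) {
                    ret.push_back(s.substr(i - 9, 10));
                }
            }

            return ret;
        }
    };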
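    The counting table above spends one byte per key, although the only information needed is whether a key has been seen and whether it has already been reported. As a possible variation in the same space-saving spirit (not the author's code), the sketch below keeps two bit flags per key using std::vector<bool>. The function name findRepeatedWithFlags is hypothetical, and the input is again assumed to contain only A, C, G, and T.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant: same rolling 20-bit key, two bit flags per key.
std::vector<std::string> findRepeatedWithFlags(const std::string& s) {
    std::vector<std::string> ret;
    if (s.size() < 10) return ret;

    std::vector<bool> seen(1u << 20, false);     // key occurred at least once
    std::vector<bool> reported(1u << 20, false); // key already added to ret
    unsigned int key = 0;

    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned int code = (s[i] == 'A') ? 0u
                          : (s[i] == 'C') ? 1u
                          : (s[i] == 'G') ? 2u
                          : 3u; // 'T'
        key = ((key << 2) | code) & 0xFFFFF; // last 10 characters only

        if (i < 9) continue; // window not full yet

        if (!seen[key]) {
            seen[key] = true;
        } else if (!reported[key]) {
            reported[key] = true;
            ret.push_back(s.substr(i - 9, 10));
        }
    }
    return ret;
}
```

    Because each sequence is reported only on the transition from "seen" to "reported", a sequence that occurs many times still appears once in the result.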
  • Original article: https://www.cnblogs.com/qionglouyuyu/p/5047362.html