• LeetCode-Repeated DNA Sequence


    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].

    Analysis:

    We want to code a 10-letter-long substring into a integer, to perform hashset add and check for duplication.

    Since each letter only has 4 cases: A,C,G,T, we can use 2-bit to represent it. Therefore, we can use a 20-bits integer to represent the substring.

    Solution:

    public class Solution {
        // Use mask to only maintain the last 20 bits.
        int mask = (1 << 20) - 1;
    
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> resList = new ArrayList<String>();
            if (s.length() < 10)
                return resList;
    
            HashSet<Integer> codeSet = new HashSet<Integer>();
            HashSet<Integer> resSet = new HashSet<Integer>();
            char[] charArray = s.toCharArray();
    
            // Get code of the first 9 letters.
            int code = 0;
            for (int i = 0; i < 9; i++) {
                code = moveCode(code, charArray[i]);
            }
    
            for (int i = 9; i < s.length(); i++) {
                // Get code.
                code = moveCode(code, charArray[i]);
                // if current code has existed and have not appeared twice (i.e.,
                // not added to resList), then add it into resList.
                if (!codeSet.add(code) && resSet.add(code)) {
                    resList.add(s.substring(i - 9, i + 1));
                }
            }
            return resList;
        }
    
        public int moveCode(int value, char c) {
            value <<= 2;
            // if (c=='A') value += 0;
            if (c == 'C')  value += 1;
            if (c == 'G')  value += 2;
            if (c == 'T')  value += 3;
            value &= mask;
            return value;
        }
    
    }
  • 相关阅读:
    前端大牛们都学过哪些东西?
    自适应网页设计
    CSS border实现三角形
    wampServer 设置
    javascript 基础
    Oracle 优化相关
    Java(数据类型)基础面试
    Java 反射机制
    【设计模式】——代理模式
    【设计模式】——单例模式
  • 原文地址:https://www.cnblogs.com/lishiblog/p/5824312.html
Copyright © 2020-2023  润新知