001.
(base) root@PC1:/home/test2# ls
a.fasta  list.txt  test.py
(base) root@PC1:/home/test2# head a.fasta      ## genome FASTA file
>NC_000964.3 Bacillus subtilis subsp. subtilis str. 168 chromosome, complete genome
ATCTTTTTCGGCTTTTTTTAGTATCCACAGAGGTTATCGACAACATTTTCACATTACCAACCCCTGTGGACAAGGTTTTT
TCAACAGGTTGTCCGCTTTGTGGATAAGATTGTGACAACCATTGCAAGCTCTCGTTTATTTTGGTATTATATTTGTGTTT
TAACTCTTGATTACTAATCCTACCTTTCCTCTTTATCCACAAAGTGTGGATAAGTTGTGGATTGATTTCACACAGCTTGT
GTAGAAGGTTGTCCACAAGTTGTGAAATTTGTCGAAAAGCTATTTATCTACTATATTATATGTTTTCAACATTTAATGTG
TACGAATGGTAAGCGCCATTTGCTCTTTTTTTGTGTTCTATAACAGAGAAAGACGCCATTTTCTAAGAAAAGGAGGGACG
TGCCGGAAGATGGAAAATATATTAGACCTGTGGAACCAAGCCCTTGCTCAAATCGAAAAAAAGTTGAGCAAACCGAGTTT
TGAGACTTGGATGAAGTCAACCAAAGCCCACTCACTGCAAGGCGATACATTAACAATCACGGCTCCCAATGAATTTGCCA
GAGACTGGCTGGAGTCCAGATACTTGCATCTGATTGCAGATACTATATATGAATTAACCGGGGAAGAATTGAGCATTAAG
TTTGTCATTCCTCAAAATCAAGATGTTGAGGACTTTATGCCGAAACCGCAAGTCAAAAAAGCGGTCAAAGAAGATACATC
(base) root@PC1:/home/test2# cat list.txt      ## gene position information
gene46 NC_000964.3 42917 43660 +
NP_387934.1 NC_000964.3 59504 60070 +
yfmC NC_000964.3 825787 826834 -
cds821 NC_000964.3 885844 886173 -
(base) root@PC1:/home/test2# cat test.py       ## test script
#!/usr/bin/python

in_file1 = open("list.txt", "r")
in_file2 = open("a.fasta", "r")
out_file = open("result.txt", "w")

dict1 = dict()    # gene name -> [sequence ID, 0-based start, end, strand]
dict2 = dict()    # FASTA header (">" + sequence ID) -> full sequence

# Read gene positions; list.txt uses 1-based, inclusive coordinates
for i in in_file1:
    i = i.strip().split()
    dict1[i[0]] = [i[1], int(i[2]) - 1, int(i[3]), i[4]]

# Read the FASTA file, joining the sequence lines of each record
for i in in_file2:
    i = i.strip()
    if i.startswith(">"):
        key = i.split()[0]
        dict2[key] = ""
    else:
        dict2[key] += i

# Return the reverse complement of a DNA sequence
def com_pro(seq):
    dict3 = {"a":"t", "t":"a", "c":"g", "g":"c", "n":"n", "A":"T", "T":"A", "C":"G", "G":"C", "N":"N"}
    str1 = reversed(seq)
    result_list = [dict3[k] for k in str1]
    return ("".join(result_list))

# Write each gene's header and sequence; minus-strand genes are reverse-complemented
for i,j in dict1.items():
    print(i, "[" + j[0] , j[1] + 1 , j[2] , j[3] + "]", file = out_file)
    seq = dict2[">" + j[0]][j[1]:j[2]]
    if j[3] == "+":
        print(seq, file = out_file)
    if j[3] == "-":
        seq = com_pro(seq)
        print(seq, file = out_file)

in_file1.close()
in_file2.close()
out_file.close()
(base) root@PC1:/home/test2# python test.py    ## run the script
(base) root@PC1:/home/test2# ls
a.fasta  list.txt  result.txt  test.py
(base) root@PC1:/home/test2# cat result.txt    ## program output
gene46 [NC_000964.3 42917 43660 +]
ATGGTTTCATTACATGATGATGAAAGATTAGATTATTTGCTGGCAGAGGACATGAAAATCATACAAAGCCCAACAGTGTTTGCTTTTTCGTTGGACGCTGTGCTTCTGTCCAAATTTGCGTACGTTCCGATTCAAAAAGGGAAAATTGTTGATTTATGCACCGGCAATGGTATTGTGCCGCTGCTGCTCAGTACAAGATCAAAAGCAGACATTCTGGGAGTCGAAATTCAAGAAAGACTGCATGATATGGCTGTTCGCAGCGTGGAGTATAATAAGTTGGACGATCAGATCCAGATCATACATGATGACCTGAAAAACATGCCGGAGAAACTTGGACATAATCGATATGATGTTGTCACCTGCAATCCGCCGTATTTTAAAACGCCGAAACAAACTGAACAAAACATGAACGAGCATCTCCGAATCGCAAGACATGAAATCCACTGCACGCTGGAGGATGTCATTTCAGTCAGCAGCAAGCTGCTCAAGCAAGGGGGAAAAGCAGCTCTTGTTCACCGGCCGGGAAGGCTTCTGGAGATTTTTGAACTGATGAAGGCTTATCAAATCGAGCCGAAACGTGTACAATTTGTCTATCCGAAGCAAGGGAAAGAAGCCAATACCATTTTGGTTGAAGGTATCAAAGGCGGGCGCCCGGATTTGAAAATTCTTCCTCCCTTATTCGTATATGATGAACAAAATGAATATACAAAAGAAATCAGGACCATTTTATATGGAGACAAATAA
NP_387934.1 [NC_000964.3 59504 60070 +]
ATGCTTGTGATTGCCGGTCTCGGAAACCCGGGGAAGAACTATGAAAATACACGGCATAATGTCGGATTTATGGTGATAGATCAGCTTGCAAAGGAATGGAATATAGAGCTGAATCAAAATAAATTTAACGGATTATACGGAACCGGATTTGTTTCCGGCAAAAAGGTTCTACTTGTTAAACCGCTTACATATATGAATTTATCAGGAGAATGTTTGCGGCCTTTAATGGACTACTATGATGTCGATAACGAAGATTTGACAGTCATTTACGACGACCTTGACCTTCCGACTGGCAAGATCCGTTTAAGAACGAAAGGAAGCGCCGGAGGGCACAATGGCATCAAATCACTGATCCAGCATCTTGGAACGTCCGAGTTTGACCGTATCCGCATCGGAATCGGCCGGCCTGTAAACGGCATGAAGGTCGTTGATTATGTGTTAGGCTCCTTTACCAAGGAGGAGGCACCTGAGATCGAAGAAGCGGTTGATAAATCTGTGAAGGCTTGTGAGGCTTCTTTGAGTAAACCGTTTTTAGAAGTCATGAACGAATTTAACGCAAAGGTATAA
yfmC [NC_000964.3 825787 826834 -]
CTTTCTTTACTAAAAAAATATTGACATGATAAGCCATGCTATTATAGTGTTACATGTGATAATGATTCTCATTACTAAATCTGAAAAAAGGAAGAATGACATGCGCACCTATTCTAACAAGTTGATTGCCATCATGAGTGTTTTATTGCTCGCCTGCCTCATTGTATCCGGCTGTTCATCAAGCCAGAATAACAACGGAAGCGGCAAAAGCGAGTCTAAGGATTCCAGAGTGATCCATGACGAAGAAGGAAAAACGACAGTAAGCGGCACACCTAAGCGGGTGGTTGTGCTTGAGCTTTCATTCTTGGATGCCGTTCACAATCTCGGCATTACGCCGGTGGGCATCGCAGATGACAACAAAAAAGATATGATTAAAAAGCTTGTCGGCAGCTCCATTGATTACACATCTGTAGGCACACGCAGCGAACCCAATCTTGAGGTCATCAGTTCCTTGAAGCCTGATTTAATCATCGCTGACGCTGAGCGCCATAAAAACATTTATAAACAGCTGAAAAAAATCGCCCCGACGATTGAATTAAAAAGCCGTGAAGCGACATATGACGAAACGATCGACAGCTTTACGACCATTGCTAAAGCATTAAATAAAGAAGATGAAGGAAAAGAAAAGCTTGCCGAGCACAAAAAAGTCATCAACGATCTAAAAGCCGAACTTCCGAAAGATGAAAACCGCAACATCGTTCTCGGCGTTGCAAGAGCGGATTCCTTCCAGCTTCATACATCATCATCCTATGACGGAGAAATCTTTAAAATGCTAGGCTTTACACACGCTGTGAAGTCAGATAACGCCTATCAAGAGGTCAGCCTTGAGCAATTGAGCAAAATCGATCCTGATATTTTGTTCATCTCAGCCAACGAAGGCAAAACCATTGTAGATGAGTGGAAAACGAACCCGCTCTGGAAAAATCTCAAAGCGGTGAAAAATGGACAAGTCTATGATGCGGACCGTGACACTTGGACAAGATTCAGAGGCATCAAGTCTAGTGAAACAAGCGCCAAAGATGTGCTTAAAAAAGTGTATAATAAATAG
cds821 [NC_000964.3 885844 886173 -]
ATGATGCTGATTACCATTCTTTTATTTCTCGCGGCAGGGCTTGCTGAAATTGGCGGCGGATATCTGGTTTGGCTATGGCTGAGAGAGGCAAAGCCAGCTGGCTACGGAATCGCCGGGGCGCTGATCCTCATTGTATACGGCATTCTTCCGACGTTTCAGTCCTTCCCATCTTTCGGCCGTGTATACGCCGCTTATGGCGGAGTATTCATCGTGCTTGCGGTCCTGTGGGGATGGCTTGTTGACCGGAAAACACCTGATCTGTATGACTGGATCGGCGCATTCATTTGTCTCATCGGTGTCTGTGTTATTTTATTTGCGCCGCGCGGATAA
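The two key steps in test.py are converting the 1-based, inclusive coordinates in list.txt into a Python slice and reverse-complementing genes on the minus strand. The sketch below isolates both steps on a toy sequence. The names `COMPLEMENT`, `reverse_complement`, `extract_region`, and `toy` are hypothetical and do not appear in the original script, and `str.translate` is used here as an alternative to the dictionary lookup in `com_pro`, not as the script's own method.

```python
# Translation table covering the same symbols as dict3 in test.py
# (A, C, G, T, N in upper and lower case).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(chrom_seq, start, end, strand):
    """Extract a 1-based, inclusive [start, end] region.

    Minus-strand regions are returned as the reverse complement.
    """
    # 1-based inclusive coordinates -> 0-based half-open slice
    region = chrom_seq[start - 1:end]
    if strand == "-":
        region = reverse_complement(region)
    return region


# Toy sequence for illustration only (not taken from a.fasta)
toy = "AACCGGTTNA"
print(extract_region(toy, 3, 6, "+"))  # CCGG
print(extract_region(toy, 1, 4, "-"))  # GGTT
```

Subtracting 1 from the start and leaving the end unchanged mirrors what test.py does with `int(i[2]) - 1` and `int(i[3])`, so a region from 42917 to 43660 yields 43660 - 42917 + 1 = 744 bases.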
Reference: https://mp.weixin.qq.com/s?__biz=MzIxNzc1Mzk3NQ==&mid=2247491504&idx=1&sn=4ac56dfb5cae9cf101b95c64b2585915&chksm=97f5afa8a08226be7ff80e8f85093295d6370dd4f014d2bc67f0302d9c794110709de7a12818&scene=178&cur_album_id=2403674812188688386#rd