Comprehensive exercise: English word frequency count
Download the lyrics of an English song, or an English article
sing = ''' i'm just a little bit caught in the middle life is a maze and love is a riddle i don't know where to go can't do it alone i've tried but i don't know why slow it down make it stop or else my heart is going to pop cause its to much yea its alot to be something i'm not i'm a fool out of love cause i just can't get enough i'm just a little bit caught in the middle life is a maze and love is a riddle i don't know where to go can't do it alone i've tride but i don't know why i'm just a little girl lost in the moment i'm so scared but i don't show it i can't figure it out it's bringing me down i know i've got to let it go and just enjoy the show the sun is hot in the sky just like a giant spot light the people follow the signs and sicronise in time it's just no body knows they got to take it to the show i'm just a little bit caught in the middle life is a maze and love is a riddle i don't know where to go can't do it alone i've tried but i don't know why i'm just a little girl lost in the moment i'm so scared but i don't show it i can't figure it out it's bringing me down i know i've got to let it go and just enjoy the show just engoy the show i'm just a little bit caught in the middle life is a maze and love is a riddle i don't know where to go can't do it alone i've tride but i don't know why i'm just a little girl lost in the moment i'm so scared but i don't show it i can't figure it out it's bringing me down i know i've got to let it go and just enjoy the show just enjoy the show just enjoy the show i want my money back i want my money back i want my money back just enjoy the show i want my money back i want my money back i want my money back just enjoy the show '''
1. Replace all separators such as , . ? ! ' : with spaces
newSing = sing
# Replace each separator listed in the step above, one at a time
for sep in ",.?!':":
    newSing = newSing.replace(sep, " ")
print(newSing)
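The separator replacement can also be done in a single pass with `str.maketrans` and `str.translate`. The following is a minimal sketch, assuming a short sample string; the names `sample`, `separators`, `table` and `cleaned` are illustrative and not part of the original exercise.

```python
# Hypothetical sample text and separator set, chosen only for illustration.
sample = "i'm lost, life is a maze. why? stop! note: it's a show"
separators = ",.?!':"

# Build a translation table that maps every separator character to a space.
table = str.maketrans(separators, " " * len(separators))

# translate() applies the whole table in one pass over the string.
cleaned = sample.translate(table)
print(cleaned)
```

Because each separator becomes exactly one space, the cleaned string keeps its original length and may contain runs of two spaces.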
2. Convert all lowercase letters to uppercase
newSmall = newSing.upper()
print(newSmall)
3. Generate the word list
# split() with no argument splits on any whitespace and drops empty strings
listWord = newSing.split()
print(listWord)
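How the text is split matters once punctuation has been turned into spaces, because double spaces appear. A small sketch of the difference between `split(" ")` and `split()`, assuming a made-up sample string (`sample`, `by_space` and `by_whitespace` are illustrative names):

```python
# Hypothetical sample with a double space, as left behind when a separator becomes a space.
sample = "life is a maze  and love"

# split(" ") cuts at every single space, so the double space yields an empty string.
by_space = sample.split(" ")
print(by_space)

# split() with no argument splits on runs of whitespace and never returns empty strings.
by_whitespace = sample.split()
print(by_whitespace)
```

An empty string in the word list would later be counted as if it were a word, so the argument-free form is usually the safer choice here.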
4. Generate the word frequency counts
DicWord = {}
for word in listWord:
    # Increment the count for a known word, otherwise start it at 1
    if word in DicWord:
        DicWord[word] += 1
    else:
        DicWord[word] = 1
print(DicWord)
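The manual counting loop can be compared with `collections.Counter` from the standard library, which does the same bookkeeping. This is a sketch on a small hand-made word list, not the lyrics above; `words` and `counts` are illustrative names.

```python
from collections import Counter

# Hypothetical word list standing in for the real one.
words = ["just", "enjoy", "the", "show", "just", "enjoy", "the", "show", "money"]

# Counter builds the word -> count mapping in one call.
counts = Counter(words)
print(counts)

# Looking up a word that never occurred returns 0 instead of raising KeyError.
print(counts["just"])
print(counts["missing"])
```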
5. Sort (alphabetically by word)
Dec = sorted(DicWord.keys())
print(Dec)
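6. Exclude grammatical words: pronouns, articles, conjunctions
vocalbuary = ["a", "so", "the", "they", "is", "in", "to", "of", "i"]
for word in vocalbuary:
    # pop() with a default avoids a KeyError when a word is absent
    DicWord.pop(word, None)
print(DicWord)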
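An alternative to deleting entries in place is to build a new dictionary that skips the excluded words. A minimal sketch, assuming a tiny made-up frequency table; `counts`, `stop_words` and `filtered` are illustrative names, and the stop-word set here deliberately includes words that are not in the table.

```python
# Hypothetical frequency table and stop-word set, for illustration only.
counts = {"i": 5, "show": 3, "the": 4, "money": 2}
stop_words = {"a", "the", "i", "of"}  # "a" and "of" do not occur in counts

# Keep only the entries whose word is not a stop word.
# Membership tests on a set are fast, and absent stop words cause no error.
filtered = {word: n for word, n in counts.items() if word not in stop_words}
print(filtered)
```

This leaves the original table untouched, which can help when the unfiltered counts are still needed afterwards.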
7. Output the top 10 words by frequency
NewDicWord = sorted(DicWord.items(), key=lambda item: item[1], reverse=True)
print(NewDicWord)
# Slicing avoids an IndexError when there are fewer than 10 distinct words
for item in NewDicWord[:10]:
    print(item)
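The seven steps can be gathered into one small function. The sketch below is one possible arrangement, not the approach used above: it assumes lowercase matching, uses `Counter.most_common` for the ranking, and the names `top_words`, `SEPARATORS`, `sample` and `stop` are hypothetical.

```python
from collections import Counter

# Separators named in step 1; assumed to be the full set for this sketch.
SEPARATORS = ",.?!':"

def top_words(text, stop_words, n=10):
    """Return up to n (word, count) pairs, most frequent first."""
    # Replace separators with spaces, normalise case, then split on whitespace.
    table = str.maketrans(SEPARATORS, " " * len(SEPARATORS))
    words = text.lower().translate(table).split()
    # Count only the words that are not in the stop-word set.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common(n) returns fewer than n pairs if there are fewer distinct words.
    return counts.most_common(n)

# Hypothetical sample text and stop words.
sample = "just enjoy the show, just enjoy the show. i want my money back!"
stop = {"the", "i", "my"}
result = top_words(sample, stop, n=3)
print(result)
```

Words with equal counts come back in the order they first appear in the text, so the ranking among ties follows the input.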
I did quite a lot here, much of it by looking things up online, which also helped strengthen my grasp of the basics.