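- Word-frequency statistics: preprocessing
- Download the lyrics of an English song or an English article
- Replace all separators such as ,.?!’: with spaces
- Convert all uppercase letters to lowercase
- Generate the word list
- Generate the word-frequency counts
- Sort
- Exclude grammatical words: pronouns, articles, conjunctions
- Print the TOP 10 most frequent words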
Code:
# -*- coding:utf-8 -*-
song = '''
Nobody ever knows
Nobody ever sees
I left my soul
Back then no I'm too weak
Most nights I pray for you to come home
Praying to the lord
Praying for my soul
Now please don't go
Most nights I hardly sleep when I'm alone
Now please don't go oh no
I think of you whenever I'm alone
So please don't go
Cause I don't ever wanna know
Don't ever want to see things change
Cause when I'm living on my own
I wanna take it back and start again
Most nights I pray for you to come home
I'm praying to the lord
I'm praying for my soul
Now please don't go
Most nights I hardly sleep
When I'm alone
Now please don't go oh no
I think of you whenever I'm alone
So please don't go
I sent so many messages
You don't reply
Gotta feel around what am I missing babe
Singing now oh oh oh
I need you now I need your love oh
Now please don't go
I said most nights I hardly sleep
When I'm alone
Now please don't go oh no
I think of you whenever I'm alone
So please don't go
So please don't go
So please don't go
Oh no
I think of you whenever I'm alone
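So please don't go
'''

# Replace every separator character with a space
symbol = list(''',.?!’:"“”-%$''')
for i in symbol:
    song = song.replace(i, ' ')

# Convert to lowercase and split into a word list
song = song.lower()
split = song.split()

# Count whole words only; song.count(i) would also match substrings
# (for example 'no' inside 'now' and 'know')
word = {}
for i in split:
    word[i] = word.get(i, 0) + 1

# Grammatical words to exclude: pronouns, articles, conjunctions, etc.
words = '''
a an the in on to at and of is was are were i he she you your they us their our
it or for be too do no that s so as but it's
'''
prep = words.split()
for i in prep:
    # Check whether the word is in the dictionary
    if i in word:
        del word[i]

# Sort by frequency in descending order and print the top 10
word = sorted(word.items(), key=lambda item: item[1], reverse=True)
for item in word[:10]:
    print(item)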
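
The same steps can also be written more compactly with the standard library. The following is only a minimal sketch, not the approach used above: the function name `top_words`, the `STOP_WORDS` set, and the `sample` text are hypothetical names chosen for illustration. It tokenizes with a regular expression instead of replacing separators one by one, so its handling of punctuation may differ slightly from the code above.

```python
import re
from collections import Counter

# Hypothetical stop-word set for illustration; extend it as needed.
STOP_WORDS = {"a", "an", "the", "to", "and", "i", "you", "for", "so", "no", "of"}


def top_words(text, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Lowercase first, then treat runs of letters and apostrophes as words,
    # so "don't" stays a single token and punctuation is dropped.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Counter counts whole tokens, which avoids substring matches.
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


sample = "Now please don't go. So please don't go, oh no!"
print(top_words(sample, n=3))
```

`Counter.most_common` does the sorting and the top-N cut in one call, so the separate sort and loop are no longer needed.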