• Counting character appearances in Romance of the Three Kingdoms (《三国演义》) and drawing a word cloud


    [Source text]:

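    I found the text of Romance of the Three Kingdoms through Baidu, downloaded it, and saved it locally as a txt file. Note: the file should be saved and read with encoding=utf-8.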
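    Downloaded Chinese texts are not always UTF-8 to begin with. The following is a minimal sketch of how the file could be re-saved as UTF-8. The function name `to_utf8` and the list of candidate encodings are my own assumptions, not part of the original workflow.

```python
from pathlib import Path


def to_utf8(src, dst, candidates=("utf-8", "gb18030")):
    """Decode src with the first candidate encoding that works, then write dst as UTF-8.

    Returns the name of the encoding that succeeded.
    """
    raw = Path(src).read_bytes()
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
        Path(dst).write_text(text, encoding="utf-8")
        return enc
    raise ValueError(f"none of {candidates} could decode {src}")
```

    For example, `to_utf8("download.txt", "threekingdom.txt")` would produce the file the scripts in this post read; `download.txt` is a hypothetical name for the raw download.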

    [Source code]:

    import jieba

    # Frequent words that are not character names
    excludes = {"来到","人马","领兵","将军","却说","荆州","二人","不可","不能","如此"}
    # Read the novel as UTF-8 text
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        txt = f.read()
    words = jieba.lcut(txt)
    counts = {}
    for word in words:
        if len(word) == 1:
            continue  # skip single-character tokens
        # Merge different names of the same character
        elif word == "诸葛亮" or word == "孔明曰":
            rword = "孔明"
        elif word == "关公" or word == "云长":
            rword = "关羽"
        elif word == "玄德" or word == "玄德曰":
            rword = "刘备"
        elif word == "孟德" or word == "丞相":
            rword = "曹操"
        else:
            rword = word
        counts[rword] = counts.get(rword, 0) + 1
    # Drop the excluded words; pop() avoids a KeyError if a word never occurred
    for word in excludes:
        counts.pop(word, None)
    items = list(counts.items())
    items.sort(key=lambda x: x[1], reverse=True)
    # Print the 55 most frequent words (fewer if the text is short)
    for word, count in items[:55]:
        print("{0:<10}{1:>5}".format(word, count))
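    As a side note, the alias merging and exclusion steps in the script above can also be written with `collections.Counter`. This is only a sketch of an alternative: the `ALIASES` table is copied from the if/elif chain, while the function name `count_names` and its parameters are my own assumptions.

```python
from collections import Counter

# Alias table taken from the if/elif chain in the script above.
ALIASES = {
    "诸葛亮": "孔明", "孔明曰": "孔明",
    "关公": "关羽", "云长": "关羽",
    "玄德": "刘备", "玄德曰": "刘备",
    "孟德": "曹操", "丞相": "曹操",
}


def count_names(words, excludes=(), top=55):
    """Return the `top` most frequent (word, count) pairs after merging aliases."""
    # Skip single-character tokens and map every alias to one canonical name.
    counts = Counter(ALIASES.get(w, w) for w in words if len(w) > 1)
    # Remove the excluded words; pop() tolerates words that never occurred.
    for w in excludes:
        counts.pop(w, None)
    return counts.most_common(top)
```

    With jieba installed it could be called as `count_names(jieba.lcut(txt), excludes)`. Adding a new alias then only needs one more entry in the table instead of another elif branch.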

    The output of the counting script is shown below:

    Making the word cloud:

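    import jieba
    import wordcloud

    # Read the novel as UTF-8 text
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        t = f.read()
    # Segment the text and join the tokens with spaces for WordCloud
    ls = jieba.lcut(t)
    txt = " ".join(ls)
    # A font with CJK glyphs is needed to render Chinese characters
    w = wordcloud.WordCloud(
        font_path="NotoSerifCJK-Bold.ttc",
        width=1000, height=700, background_color="white",
    )

    w.generate(txt)
    w.to_file("gr.png")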
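    The script above builds the cloud from every token in the novel. If the cloud should show only characters, one option is to count a chosen set of names and pass the counts to `WordCloud.generate_from_frequencies`. The sketch below assumes this variation; the helper names `name_frequencies` and `draw_name_cloud` and the `names` set are hypothetical and not part of the original script.

```python
from collections import Counter


def name_frequencies(words, names):
    """Count only the tokens that belong to the given set of names."""
    return dict(Counter(w for w in words if w in names))


def draw_name_cloud(freqs, out_path, font_path="NotoSerifCJK-Bold.ttc"):
    """Render a word cloud whose word sizes follow the given frequencies."""
    # Imported here so that name_frequencies works without wordcloud installed.
    import wordcloud

    w = wordcloud.WordCloud(
        font_path=font_path, width=1000, height=700, background_color="white",
    )
    w.generate_from_frequencies(freqs)
    w.to_file(out_path)
```

    For example, `draw_name_cloud(name_frequencies(ls, {"曹操", "孔明", "刘备", "关羽"}), "names.png")` would draw just those four names, sized by how often they occur. Note that aliases are not merged here.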

    The result looks like this:

    Now a few words about the problems I ran into along the way:

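    At first the biggest problem was installing the various libraries. I really struggled with it and spent several days without figuring it out. Only after asking my classmates were some of the problems finally solved. (Special thanks to my classmates Li Tuo (李拓) and Chai Yichen (柴易晨)!)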
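    A quick way to see which libraries are still missing before running the scripts is to ask Python whether it can find them. This is a small sketch using only the standard library; the function name `missing_packages` is my own choice.

```python
import importlib.util


def missing_packages(names=("jieba", "wordcloud")):
    """Return the names of the packages that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

    Anything it reports can usually be installed with `python -m pip install jieba wordcloud`, run with the same interpreter that runs the scripts.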

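    If there are any other shortcomings, I would be grateful if the instructors and my classmates could point them out. Thank you all!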

  • Original article: https://www.cnblogs.com/jxt123/p/12674328.html