Development Log 1

Day one of actual development

The task is to load the contents of the Word document our teacher gave us into a database. So how do I get it in there?

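My first thought was to read the file with Java, split it up according to some rules, and write the pieces to the database. That is certainly one way to do it, and I did a similar exercise in my second year.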

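This time, though, I used Python (probably because I have been learning it lately and wanted to try it out). Python is widely used for scraping web pages and pulling content out of HTML tags. With that in mind, I first converted the Word document to HTML, inspected the page elements, and wrote the code accordingly: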

from bs4 import BeautifulSoup
import pymysql

# save() and update() each open their own connection, so no module-level
# connection is needed here.

# Read the file
def read_file(path):
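    # The encoding may need adjusting. 'ANSI' is a Windows-only alias for the
    # system code page (mbcs); on other platforms name the codec explicitly.
    with open(path, 'r', encoding='ANSI') as f:
        text = f.read()
    # Drop a byte-order mark if one survived decoding
    return text.strip().replace('\ufeff', '')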


# Parse
def parse_data(data):
    # Each chunk holds everything between one MsoToc1 marker and the next
    for str1 in data.split('class=MsoToc1')[1:]:
        bs = BeautifulSoup(str1, 'lxml')
        index = 0
        title1 = ""
        title2 = ""
        title3 = ""
        try:
            for tag in bs.select('a'):
                strs = tag.get_text().split(' ')[0].rstrip()
                # '第…篇' marks a part heading
                if '第' in strs and '篇' in strs:
                    title1 = tag.get_text().split(' ')[1].replace('.', '')
                # '第…章' marks a chapter heading
                elif '第' in strs and '章' in strs:
                    title2 = tag.get_text().split(' ')[1].replace('.', '')
                else:
                    # Anything else is a numbered section; its number is the row index
                    index = strs
                    title3 = tag.get_text().split(' ')[1].replace('.', '')
                    save(index, title1, title2, title3)
        except Exception as exc:
            print(f"Malformed data, skipping: {exc}")
    bigdiv = data.split('class=WordSection3')[1]
    for str1 in bigdiv.split('class=3132020')[1:]:
        soup = BeautifulSoup('<p class="3132020" ' + str1, 'lxml')
        content = ""
        index = int(soup.find('p', {'class': '3132020'}).get_text().split(' ')[0])
        for tag in soup.find_all('p', {'class': '4'}):
            content += tag.get_text() + '\n'
        update(index, content)
    return


# Save to the database
def save(index, title1, title2, title3):
    params = (int(index), title1, title2, title3)
    db = pymysql.connect(host='localhost', user='root', password='root', database='dazuoye', port=3306, charset='utf8')
    cursor = db.cursor()  # cursor used to run statements
    # Pass the values separately so the driver escapes quotes in the titles
    sql = "INSERT INTO info_tech VALUES (%s, %s, %s, %s, ' ', '', '')"
    print(sql, params)
    try:
        # Execute the SQL statement
        cursor.execute(sql, params)
        db.commit()
    except pymysql.MySQLError:
        # Roll back on error
        db.rollback()
    finally:
        # Close the database connection
        db.close()
    return


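# Update the database
def update(index, content):
    params = (content, int(index))
    db = pymysql.connect(host='localhost', user='root', password='root', database='dazuoye', charset='utf8')
    cursor = db.cursor()  # cursor used to run statements
    # Pass the values separately so quotes in the content cannot break the SQL
    sql = "UPDATE info_tech SET content = %s WHERE `index` = %s"
    print(sql, params)
    try:
        # Execute the SQL statement
        cursor.execute(sql, params)
        db.commit()
    except pymysql.MySQLError:
        # Roll back on error
        db.rollback()
    finally:
        # Close the database connection
        db.close()
    return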


if __name__ == '__main__':
    html = read_file('information.htm')
    parse_data(html)
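
The rule parse_data uses to sort table-of-contents entries into parts, chapters, and sections can be looked at in isolation. The following is a minimal sketch of that rule only, not the code actually run above: the function name classify_toc_entry and the sample titles are made up for illustration, and it assumes each entry looks like "<label> <title>" followed by dot leaders.

```python
def classify_toc_entry(text):
    """Classify one table-of-contents line as a part, chapter, or section.

    Hypothetical helper that mirrors the rule in parse_data: the first
    space-separated token decides the level, the second is the title.
    """
    tokens = text.split(' ')
    if len(tokens) < 2:
        raise ValueError(f"expected '<label> <title>', got {text!r}")
    label = tokens[0].rstrip()
    # Remove the dot leaders that Word puts between the title and page number
    title = tokens[1].replace('.', '')
    if '第' in label and '篇' in label:
        return ('part', label, title)
    if '第' in label and '章' in label:
        return ('chapter', label, title)
    # Anything else is treated as a numbered section
    return ('section', label, title)


if __name__ == '__main__':
    # Invented sample entries, not taken from the real document
    for line in ['第一篇 总则......', '第二章 观测......', '12 降水......']:
        print(classify_toc_entry(line))
```

Raising ValueError for an entry with no space makes the failure explicit, where parse_data would hit an IndexError and skip the rest of the chunk.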

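Text scraped from a document often contains quotation marks, which break SQL that is assembled by string formatting. Below is a small sketch of the alternative, passing values as query parameters. To keep it runnable without a MySQL server it uses the standard-library sqlite3 module, not pymysql, so the placeholder is `?` where pymysql uses `%s`. The table is a cut-down stand-in for info_tech, and the column names title1, title2, and title3 are assumptions.

```python
import sqlite3


def demo_parameterized_writes():
    """Insert and update a row whose text contains single quotes."""
    conn = sqlite3.connect(':memory:')
    # Cut-down stand-in for info_tech; column names are assumed
    conn.execute(
        'CREATE TABLE info_tech ('
        '"index" INTEGER PRIMARY KEY, title1 TEXT, title2 TEXT, '
        'title3 TEXT, content TEXT)'
    )
    # The driver escapes each value, so the apostrophe is stored as data
    conn.execute(
        'INSERT INTO info_tech VALUES (?, ?, ?, ?, ?)',
        (1, 'Part', 'Chapter', "Section's title", ''),
    )
    conn.execute(
        'UPDATE info_tech SET content = ? WHERE "index" = ?',
        ("It's raining", 1),
    )
    conn.commit()
    row = conn.execute(
        'SELECT title3, content FROM info_tech WHERE "index" = ?', (1,)
    ).fetchone()
    conn.close()
    return row


if __name__ == '__main__':
    print(demo_parameterized_writes())
```

The same pattern should carry over to pymysql by swapping the placeholders for `%s` and keeping the values in a separate tuple.
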
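Parts of this code are adapted from someone else's work, but I have forgotten the original author and the link. I am posting this only so I can look it up later. If the original author sees this, please leave a comment below and I will apologize promptly and add the link.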

Original post: https://www.cnblogs.com/lovema1210/p/10646179.html