Getting started with the pdfplumber module

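import pdfplumber
import re
def pdf_read():
    # Open the PDF; the with-block closes the file when done
    with pdfplumber.open('file path') as pdf:  # replace with the path to your PDF
        page0 = pdf.pages[11]            # pick a page (0-indexed, so this is the 12th page)
        tables = page0.extract_tables()  # all tables found on that page
        texts = page0.extract_text()     # the page's plain text
    return tables, texts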
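
For reference, extract_tables() returns a list of tables, where each table is a list of rows and each row is a list of cell values (strings, or None for empty cells). Below is a minimal sketch of tidying that structure. clean_tables is a hypothetical helper written for illustration, not part of pdfplumber, and the sample data is hand-written rather than taken from a real PDF.

```python
def clean_tables(tables):
    """Replace None cells with '' and strip surrounding whitespace in every cell."""
    return [
        [[(cell or "").strip() for cell in row] for row in table]
        for table in tables
    ]

# Hand-written sample in the same shape that extract_tables() returns
sample = [[["Name", " Qty "], ["Bolt", None]]]
print(clean_tables(sample))
```

In practice you would pass the tables value returned by pdf_read() instead of the sample.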

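By default, pdfplumber uses a table's ruling lines to tell rows and columns apart, so it cannot extract a table in the following cases:
* The table is an image. You can check by trying to select its text in a PDF viewer: if nothing can be selected, it is an image.
* The table is not separated by lines, or only partly, for example lines between columns but none between rows.
In the second case, try passing your own table settings (an empty dict just keeps the defaults):
page0.extract_tables(table_settings={})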
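
As a sketch of what such settings might look like: pdfplumber accepts "vertical_strategy" and "horizontal_strategy" keys, and the "text" strategy infers cell boundaries from the alignment of words instead of drawn lines. The two helper functions below and their parameter names are assumptions made for illustration. The right strategy depends on your PDF and usually takes some trial and error.

```python
def build_table_settings(has_vertical_lines=True, has_horizontal_lines=False):
    """Use 'lines' where rulings are drawn and 'text' where they are missing."""
    return {
        "vertical_strategy": "lines" if has_vertical_lines else "text",
        "horizontal_strategy": "lines" if has_horizontal_lines else "text",
    }

def extract_tables_with_settings(path, page_index, settings):
    """Open the PDF at `path` and extract tables from one page with custom settings."""
    import pdfplumber  # imported here so build_table_settings works without it
    with pdfplumber.open(path) as pdf:
        return pdf.pages[page_index].extract_tables(table_settings=settings)

# Example: columns are ruled, rows are not (the case described above)
settings = build_table_settings(has_vertical_lines=True, has_horizontal_lines=False)
print(settings)
```

Calling extract_tables_with_settings('file path', 11, settings) would then apply these settings to the same page as in pdf_read().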


Original article: https://www.cnblogs.com/98WDJ/p/11283012.html