• PDFBox: Parsing PDF Files Stored on a Server


    1. First, add the dependency to pom.xml

    <!-- PDF reading dependency -->
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.4</version>
    </dependency>

    2. The controller-layer code

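    // Imports used by the method below. JSONObject is assumed to be fastjson's,
    // because the original post does not show which JSON library it uses.
    import com.alibaba.fastjson.JSONObject;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;

    /**
     * Parses a PDF stored on the file server and returns its text.
     * @return the extracted text with line breaks removed, or null if the
     *         document is encrypted or could not be read
     */
    @PostMapping("/getPdf")
    public StringBuffer getPdf(@RequestBody JSONObject jsonObject) throws IOException {
        StringBuffer stringBuffer = null;

        // Build the file URL. fileServer is assumed to be a controller field
        // that holds the file server's base address; it is not shown in the post.
        String filePath = jsonObject.getString("filePath");
        filePath = fileServer + "/" + filePath;
        URL url = new URL(filePath);

        // try-with-resources closes both the stream and the document
        try (InputStream inputStream = url.openStream();
             PDDocument document = PDDocument.load(inputStream)) {
            if (!document.isEncrypted()) {
                PDFTextStripper textStripper = new PDFTextStripper();
                textStripper.setSortByPosition(true);
                String exposeContent = textStripper.getText(document);
                // Split on \n or \r\n so no stray \r is left in the output
                String[] content = exposeContent.split("\\r?\\n");
                stringBuffer = new StringBuffer();
                for (String line : content) {
                    stringBuffer.append(line);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return stringBuffer;
    }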
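
The controller does two things that are easy to get subtly wrong: it concatenates `fileServer + "/" + filePath` to build the URL, and it strips line breaks from the extracted text. The following is a minimal plain-Java sketch of both steps, with no PDFBox dependency. The class name `PdfTextHelper`, the method names, and the host `files.example.com` are hypothetical and do not come from the original post. Rejecting `..` segments is an added assumption about what the file server should allow.

```java
final class PdfTextHelper {

    // Joins the base address and the relative path with exactly one slash,
    // and rejects ".." path segments supplied by the client.
    static String buildFileUrl(String fileServer, String filePath) {
        if (filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("filePath must not be empty");
        }
        for (String segment : filePath.split("/")) {
            if (segment.equals("..")) {
                throw new IllegalArgumentException("filePath must not contain '..'");
            }
        }
        String base = fileServer.endsWith("/")
                ? fileServer.substring(0, fileServer.length() - 1)
                : fileServer;
        String relative = filePath.startsWith("/") ? filePath.substring(1) : filePath;
        return base + "/" + relative;
    }

    // Splits on any line break (\n, \r\n, \r), skips empty lines, and joins
    // the remaining lines with the given separator.
    static String joinLines(String text, String separator) {
        StringBuilder joined = new StringBuilder();
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue;
            }
            if (joined.length() > 0) {
                joined.append(separator);
            }
            joined.append(line);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // prints http://files.example.com/docs/a.pdf
        System.out.println(buildFileUrl("http://files.example.com/", "/docs/a.pdf"));
        // prints HelloWorld
        System.out.println(joinLines("Hello\r\nWorld\n", ""));
        // prints Hello World
        System.out.println(joinLines("Hello\r\nWorld\n", " "));
    }
}
```

Passing an empty separator reproduces the controller's behavior of gluing lines together. Passing a space keeps words at line ends from running into each other, which may be preferable when the text is searched or displayed afterwards.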
  • Original article: https://www.cnblogs.com/shxkey/p/12427472.html