• My Tesseract Learning Notes (Part 2)


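    Preface: I spent about three weeks reading the documentation (mostly skimming) and another two weeks setting up the environment, and finally got tesseract working. Its recognition accuracy for Simplified Chinese is quite good, above 95%. These notes briefly record the installation and recognition process.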

    1. System environment

      OS: CentOS 6.5

      Compiler: g++

      Dependencies: leptonica, opencv2.4.9, tesseract3.02

    2. Installation

    (1) leptonica

      sudo yum -y install autoconf automake libtool
      sudo yum -y install autoconf-archive
      sudo yum -y install pkgconfig
      sudo yum -y install libpng-devel
      sudo yum -y install libjpeg-turbo-devel
      sudo yum -y install libtiff-devel
      sudo yum -y install zlib-devel

      wget http://www.leptonica.org/source/leptonica-1.68.tar.gz
      tar xvzf leptonica-1.68.tar.gz
      cd leptonica-1.68/
      ./configure
      make && make install

      
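      On the Puhua (普华) operating system, make failed at this step with the following error:
      pngio.c:119: error: ‘Z_DEFAULT_COMPRESSION’ undeclared here (not in a function)
      The fix is as follows.

         A search on the wiki showed that this is a bug in pngio.c: on Mac the zlib1g package cannot be found. Edit leptonica-1.68/src/pngio.c and insert the following code right after #include "png.h".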

      C code:
      #ifdef HAVE_LIBZ
      #include "zlib.h"
      #endif
      (This fix follows the reference linked here.)
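The manual edit above can also be scripted. The following is only a sketch, not part of the original fix: `patch_pngio` is a hypothetical helper name, and it assumes the source file contains a line of the form `#include "png.h"`. It inserts the three lines after that include and does nothing if the file already mentions zlib.h.

```shell
# Hypothetical helper: insert the zlib include guard after #include "png.h".
# Usage: patch_pngio leptonica-1.68/src/pngio.c
patch_pngio() {
    src="$1"
    # Do nothing if the file was already patched.
    if grep -q '#include "zlib.h"' "$src"; then
        return 0
    fi
    # Copy every line; after the png.h include, emit the three extra lines.
    awk '
        { print }
        /^[ \t]*#[ \t]*include[ \t]*"png\.h"/ {
            print "#ifdef HAVE_LIBZ"
            print "#include \"zlib.h\""
            print "#endif"
        }
    ' "$src" > "$src.tmp" && mv "$src.tmp" "$src"
}
```

After patching, rerun make in the leptonica-1.68 directory.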

    (2) tesseract3.02

      For the tesseract installation, see here.

      Also see the official site here.

      wget https://github.com/tesseract-ocr/tesseract/archive/3.02.02.zip
      unzip 3.02.02.zip
      cd tesseract-3.02.02/
      ./autogen.sh
      ./configure --enable-debug LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include"
      make && make install
      echo "/usr/local/lib" >> /etc/ld.so.conf    # Puhua operating system
      ldconfig
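Appending to /etc/ld.so.conf with `>>` adds a duplicate entry each time the step is repeated. As a sketch only, the hypothetical helper `add_lib_dir` below appends a directory to an ld.so.conf-style file only when that exact line is not already present. The function name and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: add a library directory to an ld.so.conf-style file
# only if that exact line is not already there.
# Usage (as root): add_lib_dir /etc/ld.so.conf /usr/local/lib && ldconfig
add_lib_dir() {
    conf="$1"
    dir="$2"
    # -x matches whole lines, -F treats the directory as a fixed string.
    if ! grep -qxF -- "$dir" "$conf" 2>/dev/null; then
        printf '%s\n' "$dir" >> "$conf"
    fi
}
```

The same helper would apply to the OpenCV library directory added to /etc/ld.so.conf later in these notes.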

      Language files:

      export TESSDATA_PREFIX=/usr/local/share/

        TESSDATA_PREFIX must be set to the parent directory of the tessdata folder, not to the tessdata folder itself. The export above is the right setting when the language files are in /usr/local/share/tessdata; adjust the path to match your own installation.
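Since the parent-directory rule is easy to get wrong, here is a small sketch that derives the value from the tessdata path. `tessdata_prefix_for` is a hypothetical helper name, and /usr/local/share/tessdata is simply the example path used in these notes.

```shell
# Hypothetical helper: print the parent directory of a tessdata folder,
# with a trailing slash, for use as TESSDATA_PREFIX.
tessdata_prefix_for() {
    tessdata_dir="${1%/}"                  # drop one trailing slash, if any
    parent="$(dirname "$tessdata_dir")"
    case "$parent" in
        */) printf '%s\n' "$parent" ;;     # parent already ends in "/"
        *)  printf '%s/\n' "$parent" ;;
    esac
}

# Example: language files such as chi_sim.traineddata are assumed to be
# in /usr/local/share/tessdata.
export TESSDATA_PREFIX="$(tessdata_prefix_for /usr/local/share/tessdata)"
```

With the example path, TESSDATA_PREFIX ends up as /usr/local/share/, matching the rule stated above.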

    (3) opencv2.4.9

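      Install cmake (use the latest version).

      cmake download: the cmake 3.4.0 binary release.

      Setup (extract into /usr/local so that the PATH entry below matches):

      $ tar -xvf cmake-3.4.0-rc3-Linux-x86_64.tar.gz -C /usr/local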

      $ echo 'export PATH=$PATH:/usr/local/cmake-3.4.0-rc3-Linux-x86_64/bin' >> /etc/profile

      $ source  /etc/profile

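      Reference: http://www.cnblogs.com/Crysaty/p/6247505.html

      Important: install GTK+ 2.x first, then build and install OpenCV (on Linux, OpenCV's highgui window functions are built against GTK, so it has to be present when OpenCV is configured). For the detailed reasons and installation steps, see here.

      The installation steps can also be found here.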

      $ sudo yum -y install gtk2-devel tbb-devel libpng-devel
      $ wget http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.9/opencv-2.4.9.zip
      $ unzip opencv-2.4.9.zip
      $ cd opencv-2.4.9
      $ mkdir build
      $ cd build
      $ cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
      $ make -j2
      $ make install
      $ echo "/usr/local/opencv-2.4.9/build/lib" >> /etc/ld.so.conf    # Puhua operating system
      $ ldconfig    # Puhua operating system

      To download OpenCV 2.4.10 instead, use: wget http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.10/opencv-2.4.10.zip

    3. Using the API

    (1) Compiling

      1. Set the PKG_CONFIG_PATH environment variable so that pkg-config can find tesseract.pc:

         $ echo 'export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig' >> ~/.bashrc

         $ source ~/.bashrc
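If PKG_CONFIG_PATH is empty beforehand, the export above produces a value with a leading colon, and sourcing ~/.bashrc repeatedly keeps appending the same directory. A minimal sketch that avoids both is shown here. `add_pkg_config_dir` is a hypothetical helper, and the file name ocr.cpp in the trailing comment is only a placeholder.

```shell
# Hypothetical helper: append a directory to PKG_CONFIG_PATH once,
# without a leading colon when the variable starts out empty.
add_pkg_config_dir() {
    case ":${PKG_CONFIG_PATH:-}:" in
        *":$1:"*) ;;   # already listed, nothing to do
        *) PKG_CONFIG_PATH="${PKG_CONFIG_PATH:+$PKG_CONFIG_PATH:}$1" ;;
    esac
    export PKG_CONFIG_PATH
}

add_pkg_config_dir /usr/local/lib/pkgconfig

# With tesseract.pc and opencv.pc visible, a build line could look like this
# (ocr.cpp is a placeholder file name):
#   g++ ocr.cpp -o ocr $(pkg-config --cflags --libs tesseract opencv)
```

Running `pkg-config --exists tesseract` afterwards is a quick way to confirm that tesseract.pc is found.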

      2. If the build reports that OpenCV is missing libcufft, libnpps, libnppi, libnppc, libcudart and similar libraries, see here.

        These libraries are in cuda/lib64; create symbolic links to them:

        [root@localhost lib64]# ln -s /usr/local/cuda-6.5/lib64/libcufft.so.6.5 /usr/local/lib/libcufft.so
        [root@localhost lib64]# ln -s /usr/local/cuda-6.5/lib64/libnpps.so.6.5 /usr/local/lib/libnpps.so
        [root@localhost lib64]# ln -s /usr/local/cuda-6.5/lib64/libnppi.so.6.5 /usr/local/lib/libnppi.so
        [root@localhost lib64]# ln -s /usr/local/cuda-6.5/lib64/libnppc.so.6.5 /usr/local/lib/libnppc.so
        [root@localhost lib64]# ln -s /usr/local/cuda-6.5/lib64/libcudart.so.6.5 /usr/local/lib/libcudart.so

        At run time, the following errors appear:

        error while loading shared libraries: libcufft.so.6.5: cannot open shared object file: No such file or directory

        error while loading shared libraries: libnpps.so.6.5: cannot open shared object file: No such file or directory

        error while loading shared libraries: libnppi.so.6.5: cannot open shared object file: No such file or directory

        error while loading shared libraries: libnppc.so.6.5: cannot open shared object file: No such file or directory

        error while loading shared libraries: libcudart.so.6.5: cannot open shared object file: No such file or directory

        

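          Solution (see here). The unversioned links created earlier (libcufft.so and so on) satisfy the linker, but the runtime loader looks for the versioned names such as libcufft.so.6.5, which are not in /usr/local/lib. The reference describes the same problem:

        When I ran the test routine, I got this error: error while loading shared libraries: libcudart.so.6.5: cannot open shared object file: No such file or directory.

        The solution is to copy the respective libraries to /usr/local/lib (do the same for each library named in your own error messages):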

        sudo cp /usr/local/cuda-6.5/lib64/libcudart.so.6.5 /usr/local/lib/libcudart.so.6.5 && sudo ldconfig

        sudo cp /usr/local/cuda-6.5/lib64/libcublas.so.6.5 /usr/local/lib/libcublas.so.6.5 && sudo ldconfig

        sudo cp /usr/local/cuda-6.5/lib64/libcurand.so.6.5 /usr/local/lib/libcurand.so.6.5 && sudo ldconfig
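As an alternative to copying, the versioned names can be symlinked alongside the unversioned ones. This is a sketch under assumptions, not the method from the reference: `link_versioned_libs` is a hypothetical helper, and the directory, version and library names in the usage comment are taken from the commands and error messages in these notes.

```shell
# Hypothetical helper: for each library name, create both the versioned link
# (looked up by the runtime loader) and the unversioned link (used by -lNAME).
# Usage: link_versioned_libs SRC_DIR DST_DIR VERSION NAME...
link_versioned_libs() {
    src_dir="$1"
    dst_dir="$2"
    version="$3"
    shift 3
    for name in "$@"; do
        lib="$src_dir/lib$name.so.$version"
        if [ -e "$lib" ]; then
            ln -sf "$lib" "$dst_dir/lib$name.so.$version"
            ln -sf "$lib" "$dst_dir/lib$name.so"
        else
            echo "missing: $lib" >&2
        fi
    done
}

# Example (as root), using the paths from these notes:
#   link_versioned_libs /usr/local/cuda-6.5/lib64 /usr/local/lib 6.5 \
#       cufft npps nppi nppc cudart && ldconfig
```

Libraries that are not found in the source directory are reported on stderr and skipped.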

        

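        In the end, recognition works, but the accuracy is somewhat lower than on Windows. The only difference between the two setups is that Linux uses OpenCV 2.4.9 while Windows uses 2.4.10.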

        

  • Original article: https://www.cnblogs.com/Crysaty/p/6645001.html