• OCR 01: EasyOCR


    Catalog

    Related Links

    Installation

    Install python3 and pip3

    sudo apt install python3-pip
    

    Install EasyOCR, this will take a long time for downloading around 1GiB files

    q3w:~$ pip install easyocr
    Defaulting to user installation because normal site-packages is not writeable
    Collecting easyocr
      WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f42ce149930>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /packages/bc/7f/389e1a886ff219682b5a56ea84f91ed785999665ac9ec1f220c7fdcd150f/easyocr-1.6.2-py3-none-any.whl
      Downloading easyocr-1.6.2-py3-none-any.whl (2.9 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 481.8 kB/s eta 0:00:00
    Collecting torch
      Downloading torch-1.12.1-cp310-cp310-manylinux1_x86_64.whl (776.3 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 776.3/776.3 MB 404.3 kB/s eta 0:00:00
    Collecting scipy
      Downloading scipy-1.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.7 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33.7/33.7 MB 673.8 kB/s eta 0:00:00
    Collecting pyclipper
      Downloading pyclipper-1.3.0.post3-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (813 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 813.8/813.8 KB 650.2 kB/s eta 0:00:00
    Collecting torchvision>=0.5
      Downloading torchvision-0.13.1-cp310-cp310-manylinux1_x86_64.whl (19.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 704.0 kB/s eta 0:00:00
    Collecting ninja
      Downloading ninja-1.10.2.4-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (120 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.7/120.7 KB 667.3 kB/s eta 0:00:00
    Collecting opencv-python-headless<=4.5.4.60
      Downloading opencv_python_headless-4.5.4.60-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (47.6 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.6/47.6 MB 659.3 kB/s eta 0:00:00
    Collecting numpy
      Downloading numpy-1.23.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 686.0 kB/s eta 0:00:00
    Requirement already satisfied: PyYAML in /usr/lib/python3/dist-packages (from easyocr) (5.4.1)
    Collecting Shapely
      Downloading Shapely-1.8.5.post1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.0 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 679.9 kB/s eta 0:00:00
    Collecting scikit-image
      Downloading scikit_image-0.19.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.9 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.9/13.9 MB 708.9 kB/s eta 0:00:00
    Requirement already satisfied: Pillow in /usr/lib/python3/dist-packages (from easyocr) (9.0.1)
    Collecting python-bidi
      Downloading python_bidi-0.4.2-py2.py3-none-any.whl (30 kB)
    Collecting typing-extensions
      Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
    Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from torchvision>=0.5->easyocr) (2.25.1)
    Requirement already satisfied: six in /usr/lib/python3/dist-packages (from python-bidi->easyocr) (1.16.0)
    Collecting tifffile>=2019.7.26
      Downloading tifffile-2022.10.10-py3-none-any.whl (210 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 210.3/210.3 KB 579.4 kB/s eta 0:00:00
    Collecting PyWavelets>=1.1.1
      Downloading PyWavelets-1.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.8 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.8/6.8 MB 699.7 kB/s eta 0:00:00
    Collecting networkx>=2.2
      Downloading networkx-2.8.7-py3-none-any.whl (2.0 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 686.2 kB/s eta 0:00:00
    Collecting imageio>=2.4.1
      Downloading imageio-2.22.2-py3-none-any.whl (3.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 675.2 kB/s eta 0:00:00
    Collecting packaging>=20.0
      Downloading packaging-21.3-py3-none-any.whl (40 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.8/40.8 KB 1.1 MB/s eta 0:00:00
    Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/lib/python3/dist-packages (from packaging>=20.0->scikit-image->easyocr) (2.4.7)
    ...
    

    Usage

    It will download the trained data in the first run

    $ python3
    Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import easyocr
    >>> reader = easyocr.Reader(['ch_sim','en'])
    CUDA not available - defaulting to CPU. Note: This module is much faster with a GPU.
    Downloading detection model, please wait. This may take several minutes depending upon your network connection.
    Progress: |██████████████████████████████████████████████████| 100.0% CompleteDownloading recognition model, please wait. This may take several minutes depending upon your network connection.
    Progress: |██████████████████████████████████████████████████| 100.0% Complete>>> 
    

    Recognize

    # 带坐标
    result = reader.readtext('Documents/fp01.png')
    
    # 不带坐标, 合并相邻text box
    result = reader.readtext('Documents/tu01.jpg', detail = 0, paragraph=True)
    print(result)
    

    Performance

    • Speed is slow when using CPU
    • The correct rate is good when extracting text from e-print or screenshot pictures
    • The correct rate drops a lot when handling the photos taken by a cellphone

    Reference

  • 相关阅读:
    病毒写法,资源的释放.
    MinHook库的使用 64位下,过滤LoadLibraryExW
    系统权限远程线程注入到Explorer.exe
    【Unity】4.5 树木创建器
    【Unity】4.4 添加角色控制器
    【Unity】4.3 地形编辑器
    【Unity】4.2 提升开发效率的捷径--导入 Unity 5.3.4 自带的资源包
    【Unity】4.1 创建组件
    【Unity】4.0 第4章 创建基本的游戏场景
    【Unity】3.6 导入图片资源
  • 原文地址:https://www.cnblogs.com/milton/p/16844171.html
Copyright © 2020-2023  润新知