• Image Processing 03


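    I talked things over with my supervisor today, and the workflow is basically settled. What's left is to put together a quick implementation and see whether there are any big-picture problems.

    First, wrap the OpenCV code I tested earlier into a function so it can be called.

    Then find a reliable OCR library.

    And that is exactly where I got stuck...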

    I used Google's tesseract library here. Installation steps:

    1. Download the installer: https://github.com/tesseract-ocr/tesseract/wiki/Downloads

    2. Install the Python library: pip install pytesseract

    Then you can get started.

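    Two things to watch out for:

    First, the Python code needs a line that points to where tesseract is installed:

    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

    Second, add a TESSDATA_PREFIX entry to the system environment variables: C:\Program Files (x86)\Tesseract-OCR\tessdata

    Finally, remember to restart PyCharm after changing the environment variables...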
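    The two settings above can also be applied from inside the script. This is a minimal sketch, not the setup used in this post: the helper names tesseract_paths and configure_tesseract are hypothetical, and it assumes the Windows install directory shown above. An environment variable set through os.environ only lasts for the current process, but child processes started afterwards should inherit it, so no IDE restart is needed for that run.

```python
import ntpath
import os


def tesseract_paths(install_dir):
    """Build the executable and tessdata paths for a Windows install directory.

    ntpath is used so the backslash-joined result is the same on any OS.
    """
    exe = ntpath.join(install_dir, "tesseract.exe")
    tessdata = ntpath.join(install_dir, "tessdata")
    return exe, tessdata


def configure_tesseract(install_dir):
    """Set TESSDATA_PREFIX for this process and point pytesseract at the exe.

    Hypothetical helper: the post sets both values by hand instead.
    """
    exe, tessdata = tesseract_paths(install_dir)
    # Affects only the current process and the child processes it starts.
    os.environ["TESSDATA_PREFIX"] = tessdata
    try:
        import pytesseract
    except ImportError:
        # pytesseract is not installed; the computed paths are still returned.
        return exe, tessdata
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe, tessdata
```

    Calling configure_tesseract(r'C:\Program Files (x86)\Tesseract-OCR') before the first OCR call would replace the hand-written assignment. Using a raw string (the r prefix) matters here, because otherwise the \t in \tesseract.exe is read as a tab character.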

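    One more point: OpenCV images are np.ndarray objects, and they have to be converted into the format tesseract expects (pytesseract seems to handle this through PIL).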
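    The conversion mentioned above can be sketched with NumPy and PIL alone. The helper name bgr_to_pil is hypothetical. It assumes the array is uint8 and, for colour images, in the BGR channel order that cv2.imread produces, so the channels are reversed to get the RGB order PIL expects.

```python
import numpy as np
from PIL import Image


def bgr_to_pil(bgr):
    """Convert an OpenCV-style uint8 array to a PIL image.

    Hypothetical helper. 2-D arrays are treated as grayscale;
    3-channel arrays are assumed to be BGR and are flipped to RGB.
    """
    if bgr.ndim == 2:
        return Image.fromarray(bgr)
    # Reverse the last axis (BGR -> RGB) and make the memory contiguous.
    rgb = np.ascontiguousarray(bgr[:, :, ::-1])
    return Image.fromarray(rgb)
```

    The resulting image can be passed to pytesseract.image_to_string. Channel order probably matters little for OCR itself, but keeping it correct avoids surprises when the same image is saved or displayed through PIL.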

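    Then it was time to try it out.

    The results turned out to be really poor...

    Not happy...

    It looks like I at least need to find something passable to make do with for now... I've decided to look at how a senior labmate's code handles this, probably...

    Today's code: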

    import sys
    import os
    
    import cv2
    import numpy as np
    from PIL import Image
    import pytesseract
    
    
    def td_test(img):
        print("test")
        print('\ntextdetection.py')
        print('       A demo script of the Extremal Region Filter algorithm described in:')
        print('       Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012\n')
    
        # if (len(sys.argv) < 2):
        #   print(' (ERROR) You must call this script with an argument (path_to_image_to_be_processed)\n')
        #   quit()
    
        # pathname = os.path.dirname(sys.argv[0])
        # pathname = os.path.dirname('D:/MyProject/PyCharm/testcode')
    
        # img      = cv2.imread(str(sys.argv[1]))
        # img = cv2.imread('test.jpg')
        # for visualization
        # vis = img.copy()
    
        # Extract channels to be processed individually
        # list() keeps the append below working if the binding returns a tuple
        channels = list(cv2.text.computeNMChannels(img))
        # Append negative channels to detect ER- (bright regions over dark background)
        cn = len(channels) - 1
        for c in range(0, cn):
            channels.append((255 - channels[c]))
    
        # Apply the default cascade classifier to each independent channel (could be done in parallel)
        print("Extracting Class Specific Extremal Regions from " + str(len(channels)) + " channels ...")
        print("    (...) this may take a while (...)")
        answer = []
        for channel in channels:
            # erc1 = cv2.text.loadClassifierNM1(pathname+'/trained_classifierNM1.xml')
            erc1 = cv2.text.loadClassifierNM1('trained_classifierNM1.xml')
            er1 = cv2.text.createERFilterNM1(erc1, 16, 0.00015, 0.13, 0.2, True, 0.1)
    
            # erc2 = cv2.text.loadClassifierNM2(pathname+'/trained_classifierNM2.xml')
            erc2 = cv2.text.loadClassifierNM2('trained_classifierNM2.xml')
            er2 = cv2.text.createERFilterNM2(erc2, 0.5)
    
            regions = cv2.text.detectRegions(channel, er1, er2)
    
            rects = cv2.text.erGrouping(img, channel, [r.tolist() for r in regions])
            # rects = cv2.text.erGrouping(img,channel,[x.tolist() for x in regions], cv2.text.ERGROUPING_ORIENTATION_ANY,'../../GSoC2014/opencv_contrib/modules/text/samples/trained_classifier_erGrouping.xml',0.5)
    
            # print(rects);
            # print(np.shape(rects)[0]);
    
            # Visualization
            for r in range(0, np.shape(rects)[0]):
                rect = rects[r]
                answer.append(rect)
                # print(rect)
                # cv2.rectangle(vis, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (0, 0, 0), 2)
                # cv2.rectangle(vis, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (255, 255, 255), 1)
    
        # Visualization
        # cv2.imshow("Text detection result", vis)
        # cv2.waitKey(0)
    
        return answer
    
    if __name__ == '__main__':
        img = cv2.imread('test.jpg')
        vis = img.copy()
        answer = td_test(img)
        pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
        for rect in answer:
            print(rect)
            cv2.rectangle(vis, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (0, 0, 0), 2)
            cv2.rectangle(vis, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (255, 255, 255), 1)
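            # Crop from the untouched image so the rectangles drawn on vis are not fed to the OCR
            img1 = img[rect[1]:rect[1] + rect[3], rect[0]:rect[0] + rect[2]]
            # (thresh, img1) = cv2.threshold(img1, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
            # OpenCV stores pixels as BGR; PIL expects RGB
            img2 = Image.fromarray(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB))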
            txt = pytesseract.image_to_string(img2)
            print(txt)
            # cv2.imshow("test", img1)
            # cv2.waitKey(0)
        cv2.imshow("Text detection result", vis)
        cv2.waitKey(0)
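    The cropping step in the loop above can be pulled into a small helper. This is only a sketch: crop_rect and its pad argument are hypothetical names, and it assumes each rect is (x, y, width, height), which is how the listing indexes it. Clipping to the image bounds keeps a rectangle that touches the border from producing an empty or wrapped slice.

```python
import numpy as np


def crop_rect(img, rect, pad=0):
    """Return a copy of the region rect = (x, y, w, h), clipped to the image.

    `pad` is a hypothetical extra margin in pixels around the rectangle;
    it is not part of the original listing.
    """
    x, y, w, h = (int(v) for v in rect)
    height, width = img.shape[:2]
    x0 = max(x - pad, 0)
    y0 = max(y - pad, 0)
    x1 = min(x + w + pad, width)
    y1 = min(y + h + pad, height)
    # Rows are indexed by y and columns by x, as in the listing above.
    return img[y0:y1, x0:x1].copy()
```

    Whether a few pixels of padding actually help tesseract on these crops is untested here; it is simply an easy knob to try.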

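    Today's work drew on the following pages:

    https://pypi.python.org/pypi/pytesseract

    https://stackoverflow.com/questions/30404756/how-to-pass-opencv-image-to-tesseract-in-python (how to convert OpenCV's np.ndarray into the image format tesseract needs)

    https://testerhome.com/topics/4615

    https://github.com/upupnaway/digital-display-character-rec (a text extraction project built on OpenCV and tesseract, but it locates the text with erosion and dilation... I have a hunch its approach may work better than the one I'm using now... I'll try it tomorrow)

    http://www.cnblogs.com/syqlp/p/5462459.html (a tesseract example; I feel my method for locating text is really unreliable... but is there a reliable one at all? So frustrating)

    https://stackoverflow.com/questions/14800730/tesseract-running-error (a tesseract runtime error, though not the cause of the error I ran into)

    http://blog.csdn.net/liqiancao/article/details/55670749 (I used this as a reference for cropping the image)

    http://www.cnblogs.com/hupeng1234/p/7136442.html (this one explains tesseract errors in a lot of detail)

    That's about it.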

  • Original post: https://www.cnblogs.com/wangzhao765/p/7764268.html