Installing Tesseract on Windows
Environment:
win7 x64
python-2.7.12
tesseract-ocr-setup-3.02.02
Official website:
Third-party Windows release: https://github.com/UB-Mannheim/tesseract/wiki
pip install Pillow pytesseract
Command-line recognition test
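The exact command used for this test is not recorded here. As a rough sketch, assuming the usual `tesseract <image> <outputbase> -l <lang>` invocation, the same test can be driven from Python. The file names `captcha.png` and `result` are placeholders for illustration, not names from the original setup.

```python
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_tesseract(image_path, output_base, lang="eng"):
    """Run the tesseract CLI and return the recognized text.

    tesseract writes its result to <output_base>.txt, so the text is
    read back from that file after the command finishes.
    """
    command = build_tesseract_command(image_path, output_base, lang)
    subprocess.check_call(command)  # raises if tesseract exits with an error
    with open(output_base + ".txt") as result_file:
        return result_file.read().strip()


# Example call (placeholder file names):
# print(run_tesseract("captcha.png", "result"))
```

If the command cannot be found at all, the Tesseract directory is probably not on PATH yet.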
Python example for CAPTCHA recognition:
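#!/usr/bin/env python
# -*- coding: utf-8 -*-
from PIL import Image
import pytesseract
import urllib2
import ssl

# URL of the CAPTCHA image to recognize
picture = 'https://vcs.suning.com/vcs/imageCode.htm?uuid=1e68d06a-1134-410b-9606-f0eb4ae23bbe'

# Disable HTTPS certificate verification globally (acceptable for a quick test only)
ssl._create_default_https_context = ssl._create_unverified_context

# Download the image and open it with Pillow
image = Image.open(urllib2.urlopen(picture))
image.show()

# Run OCR on the image and print the recognized text
captcha = pytesseract.image_to_string(image)
print(captcha)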
Error 1:
Solution:
Add the Tesseract installation directory to the PATH environment variable.
(Steps omitted.)
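Where changing the system-wide PATH is inconvenient, a minimal sketch of the same idea for the current Python process only is shown below. The directory in `TESSERACT_DIR` assumes the default install location and should be adjusted; the helper `prepend_to_path` is a hypothetical name introduced for this example.

```python
import os

# Assumed default install location; adjust to the actual Tesseract directory.
TESSERACT_DIR = r"C:\Program Files (x86)\Tesseract-OCR"


def prepend_to_path(directory, environ=None):
    """Prepend directory to PATH in environ (os.environ by default) if absent."""
    if environ is None:
        environ = os.environ
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


# Call this before using pytesseract; it affects this process only:
# prepend_to_path(TESSERACT_DIR)
```

pytesseract also exposes `pytesseract.pytesseract.tesseract_cmd`, which can be set to the full path of `tesseract.exe` instead of modifying PATH.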
Error 2:
raise TesseractError(status, errors)
TesseractError: (1, 'Error opening data file ./tessdata/eng.traineddata')
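Solution:
Download the data files that match your Tesseract version and extract them into C:\Program Files (x86)\Tesseract-OCR\tessdata, overwriting any existing files with the same name.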
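To confirm that the language data really landed in the expected folder, a small check along these lines may help. It is only a sketch: `missing_language_files` is a hypothetical helper, and the directory passed to it is assumed to be the tessdata folder of your own installation.

```python
import os


def missing_language_files(tessdata_dir, languages=("eng",)):
    """Return the .traineddata file names that are absent from tessdata_dir."""
    missing = []
    for lang in languages:
        file_name = lang + ".traineddata"
        if not os.path.isfile(os.path.join(tessdata_dir, file_name)):
            missing.append(file_name)
    return missing


# Example call (assumed default install location):
# print(missing_language_files(r"C:\Program Files (x86)\Tesseract-OCR\tessdata"))
```

An empty list means every requested language file is present; any name in the list still needs to be downloaded.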
In my own test this ran without errors, but the recognition result was empty; this needs further investigation.
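One possible direction for that investigation, offered as an assumption and not as a confirmed cause, is that the raw CAPTCHA is too noisy or too low in contrast for Tesseract. A minimal sketch of converting the image to black and white before OCR follows; the threshold of 140 is an arbitrary starting value and both helper names are hypothetical.

```python
def threshold_table(threshold):
    """Build a 256-entry lookup table: values below threshold -> 0, others -> 255."""
    return [0 if value < threshold else 255 for value in range(256)]


def binarize(image, threshold=140):
    """Return a black-and-white copy of a Pillow image.

    image is expected to be a PIL.Image.Image; it is converted to
    grayscale ("L") and every pixel is mapped through the lookup table.
    """
    return image.convert("L").point(threshold_table(threshold))
```

The result can then be passed to OCR in place of the original image, for example `pytesseract.image_to_string(binarize(image))`, trying a few threshold values to see whether any text is recognized.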
For installation from source, see