• A synthetic data generator for text recognition


    https://github.com/Belval/TextRecognitionDataGenerator

    A synthetic data generator for text recognition

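    Notes:

    Its functionality is similar to the text image generation introduced in the previous blog post.

    After installing the dependencies, you can run the demo by following the instructions.

    You can generate text from your own corpus, and you can also add your own backgrounds.

    For example, for train ticket information, you can put all possible station names, train numbers, and other fixed fields into the input file, then randomly generate the sample data you need.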

    python run.py -l cn --output_dir MY_samples -i texts/city.txt -c 1000 -b 3 -w 5
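
    A minimal sketch of how such an input file might be assembled, assuming the file passed with -i holds one text sample per line (an assumption; check python run.py -h). The build_ticket_lines and write_corpus helpers and any station or train values passed to them are hypothetical.

```python
import random


def build_ticket_lines(stations, train_numbers, count, seed=None):
    """Return `count` lines, each with a train number and two distinct stations."""
    rng = random.Random(seed)
    lines = []
    for _ in range(count):
        origin, destination = rng.sample(stations, 2)  # two different stations
        train = rng.choice(train_numbers)
        lines.append(f"{train} {origin} {destination}")
    return lines


def write_corpus(path, lines):
    """Write one sample per line, UTF-8 encoded so Chinese text survives."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

    Calling write_corpus("texts/city.txt", build_ticket_lines(stations, trains, 1000)) would produce a file that can then be passed to the command above.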

    Also, I am still working out how to change the Chinese font (Heiti, Songti, ...).

    The generated samples look like this:

    TextRecognitionDataGenerator

    A synthetic data generator for text recognition

    What is it for?

    Generating text image samples to train OCR software. Now supporting non-Latin text!

    What do I need to make it work?

    I use Arch Linux, so I cannot tell whether it works on Windows yet.

    Python 3.X
    OpenCV 3.2 (It probably works with 2.4)
    Pillow
    Numpy
    Requests
    BeautifulSoup
    tqdm
    

    Alternatively, simply run pip install -r requirements.txt.

    New

    • Specify text color range using -tc min,max
    • Explicit alignment when using -al with fixed width (0: Left, 1: Center, 2: Right)
    • Fixed width using -wd
    • Generate random strings with letters, numbers and symbols (Thank you @FHainzl)
    • Save the labels in a file instead of in the file name (Thank you @FHainzl)
    • Add support for Simplified and Traditional Chinese

    How does it work?

    python run.py -w 5 -f 64

    You get 1000 randomly generated images with random text on them like:

    [sample images]

    What if you want random skewing? Add -k and -rk (python run.py -w 5 -f 64 -k 5 -rk)

    [sample images]
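
    One plausible reading of these two flags, sketched in Python: -k gives an angle, and -rk makes the tool draw a random angle no larger than that value for each image. This is an assumption about the behaviour, not code taken from run.py, and pick_skew_angle is a hypothetical name.

```python
import random


def pick_skew_angle(skew_angle, random_skew, rng=random):
    """Return the rotation angle in degrees for one image.

    Assumed behaviour: with random skew enabled, the angle is drawn
    uniformly from the integers in [-skew_angle, skew_angle];
    otherwise the fixed angle is used as given.
    """
    if random_skew:
        return rng.randint(-skew_angle, skew_angle)
    return skew_angle
```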

    But scanned documents usually aren't that clear, are they? Add -bl and -rbl to get Gaussian blur on the generated image with a user-defined radius (here 0, 1, 2, 4):

    [sample images]

    Maybe you want another background? Add -b to select one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2), or picture (3).

    [sample images]

    When using the picture background (3), a picture from the pictures/ folder is selected at random and the text is written on it.
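
    The random selection can be pictured as follows. This is a sketch, not the tool's own code: pick_background, the extension list, and the error handling are assumptions made for illustration.

```python
import os
import random

# Assumed set of image extensions; the tool may accept others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def pick_background(folder="pictures", rng=random):
    """Return the path of a randomly chosen image file in `folder`."""
    candidates = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    if not candidates:
        raise FileNotFoundError(f"no image files found in {folder!r}")
    return os.path.join(folder, rng.choice(candidates))
```

    Sorting the candidates first keeps the choice reproducible when a seeded random.Random instance is passed as rng.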

    Or maybe you are working on an OCR for handwritten text? Add -hw! (Experimental)

    [sample images]

    It uses a TensorFlow model trained using this excellent project by Grzego.

    The project does not require TensorFlow to run if you aren't using this feature.

    You can also add distortion to the generated text with -d and -do

    [sample images]

    The text is chosen at random in a dictionary file (that can be found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
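
    Because the label is stored in the file name, a training script has to recover it. A minimal sketch, assuming the [text]_[index].jpg pattern described above; parse_sample_name is a hypothetical helper, not part of the tool.

```python
import os


def parse_sample_name(path):
    """Split '[text]_[index].jpg' into (text, index)."""
    stem, _extension = os.path.splitext(os.path.basename(path))
    # Split on the LAST underscore only, since the text may contain underscores.
    text, _, index = stem.rpartition("_")
    if not text or not index.isdigit():
        raise ValueError(f"unexpected file name: {path!r}")
    return text, int(index)
```

    For example, parse_sample_name("out/hello world_12.jpg") returns ("hello world", 12).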

    There are a lot of parameters that you can tune to get the results you want, so I recommend checking out python run.py -h for more information.
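
    When many flag combinations have to be tried, it can help to assemble the command line programmatically. The sketch below only passes flags through and makes no claim about what each one means; build_command is a hypothetical helper, and the script name run.py is taken from the examples above.

```python
import sys


def build_command(options, script="run.py"):
    """Turn {"-w": 5, "-rk": True} into an argument list for subprocess.run."""
    command = [sys.executable, script]
    for flag, value in options.items():
        if value is False or value is None:
            continue  # switch turned off: omit it entirely
        command.append(flag)
        if value is not True:  # value-taking flag, e.g. -w 5 or -b 0
            command.append(str(value))
    return command
```

    The resulting list can be handed to subprocess.run(..., check=True) from the repository directory, for example with {"-w": 5, "-f": 64, "-k": 5, "-rk": True} to reproduce the skewing command shown earlier.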

    How to create images with Chinese (both simplified and traditional) text

    It is simple! Just do python run.py -l cn -c 1000 -w 5!

    Unfortunately, I do not speak Chinese, so you may have to edit texts/cn.txt to include some meaningful words instead of random glyphs.

    Here are examples of what I could make with it:

    Traditional:

    [sample image]

    Simplified:

    [sample image]

    Can I add my own font?

    Yes, the script picks a font at random from the fonts directory.

      
    fonts/latin    English, French, Spanish, German
    fonts/cn       Chinese
       

    Simply add / remove fonts until you get the desired output.

    If you want to add a new non-Latin language, the amount of work is minimal.

    1. Create a new folder named with your language's two-letter code
    2. Add a .ttf font in it
    3. Edit run.py to add an if statement in load_fonts()
    4. Add a text file in dicts with the same two-letter code
    5. Run the tool as you normally would but add -l with your two-letter code

    It only supports .ttf for now.
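
    Step 3 could look roughly like the following. The real load_fonts() in run.py may have a different signature and body; this version, the fonts_root parameter, and the "xx" language code are hypothetical and only illustrate where the extra branch would go.

```python
import os


def load_fonts(lang, fonts_root="fonts"):
    """Return the .ttf files for `lang`, falling back to the latin folder."""
    folder = "latin"
    if lang == "cn":
        folder = "cn"
    elif lang == "xx":  # hypothetical two-letter code for the new language
        folder = "xx"
    directory = os.path.join(fonts_root, folder)
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if name.lower().endswith(".ttf")  # only .ttf is supported
    ]
```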

    Benchmarks

    • Intel Core i7-4710HQ @ 2.50GHz + SSD (-c 1000 -w 1)
      • -t 1 : 363 img/s
      • -t 2 : 694 img/s
      • -t 4 : 1300 img/s
      • -t 8 : 1500 img/s
    • AMD Ryzen 7 1700 @ 4.0GHz + SSD (-c 1000 -w 1)
      • -t 1 : 558 img/s
      • -t 2 : 1045 img/s
      • -t 4 : 2107 img/s
      • -t 8 : 3297 img/s
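
    To see how well the thread count scales, the figures above can be converted into speedup and per-thread efficiency relative to -t 1. The scaling helper is hypothetical; the numbers are copied from the list above.

```python
def scaling(throughput_by_threads):
    """Map thread count -> (speedup, efficiency) relative to one thread."""
    base = throughput_by_threads[1]
    return {
        threads: (rate / base, rate / base / threads)
        for threads, rate in sorted(throughput_by_threads.items())
    }


# Throughput in img/s, keyed by the value passed to -t.
intel = {1: 363, 2: 694, 4: 1300, 8: 1500}
amd = {1: 558, 2: 1045, 4: 2107, 8: 3297}
```

    With these figures, eight threads give roughly a 4.1x speedup on the Intel machine and roughly 5.9x on the AMD machine, so the gain per added thread drops off noticeably beyond four threads on the Intel CPU.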

    Contributing

    1. Create an issue describing the feature you'll be working on
    2. Code said feature
    3. Create a pull request

    Feature request & issues

    If anything is missing, unclear, or simply not working, open an issue on the repository.

    What is left to do?

    • Better background generation
    • Better handwritten text generation
    • More customization parameters (mostly regarding background)
  • Original post: https://www.cnblogs.com/Allen-rg/p/9774080.html