OCR Server: Detect text from images

DMI developed a simple Flask-based API that runs pretrained Optical Character Recognition (OCR) models on provided images and returns the detected text in location-based groups.

Quick Setup

The OCR Server runs in a Docker container.

Install Docker Desktop, and start it.
Clone the OCR Server repository (e.g. git clone https://github.com/digitalmethodsinitiative/ocr_server.git)
(Optional) Update or change any settings in the config.yml file
In a terminal/command prompt, navigate to the folder into which you just cloned the OCR Server (the folder that contains the config.yml file)
Run docker build -t ocr_server .

This creates a Docker image called ocr_server; it may take a while to download and install the necessary packages

Run docker run --publish 4000:80 --name ocr_server --detach ocr_server

This creates a running container of the ocr_server image
--publish 4000:80 opens port 4000 on your machine and connects it to port 80 in the container; you may update 4000 to any port you wish
Add a restart policy such as --restart unless-stopped and the OCR container will restart automatically if the host server is rebooted, Docker crashes, etc.

Usage

DMI primarily designed the OCR Server to work as a processor with 4CAT. Add the hosted server address (http://wherever:4000/api/detect_text) to 4CAT Settings in the "OCR: Text from images" section and the processor should appear for any dataset of images.
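The address entered in 4CAT follows directly from the host name and the port chosen with --publish. A minimal sketch of how that URL is assembled, assuming the default mapping of port 4000 shown above; the helper name detect_text_url is hypothetical and not part of the OCR Server:

```python
from urllib.parse import urlunsplit


def detect_text_url(host: str, port: int = 4000, scheme: str = "http") -> str:
    """Build the detect_text endpoint URL for a host and published port.

    Hypothetical helper for illustration; `port` must match the left-hand
    side of the --publish mapping used in `docker run`.
    """
    return urlunsplit((scheme, f"{host}:{port}", "/api/detect_text", "", ""))


# The value to paste into the 4CAT "OCR: Text from images" setting
print(detect_text_url("localhost"))
```

If you published a different port (e.g. --publish 8080:80), pass that port instead of the default.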

The OCR Server can also be used independently. It is essentially an HTTP API that can be accessed with Python's requests library, curl, or any other HTTP client.

For example, in Python:

import requests
server = 'http://localhost:4000/'
filename = 'any/dir/to/image.jpg'

with open(filename, "rb") as infile:
    api_response = requests.post(server + 'api/detect_text', files={'image': infile})
    # To specify a model type, you can use `paddle_ocr` or `keras_ocr` like so
    #api_response = requests.post(server + 'api/detect_text', files={'image': infile}, data={'model_type': 'paddle_ocr'})

The api_response should have a 200 status code and a JSON body containing the filename and simplified_text. The simplified_text consists of a collection of groupings as well as the raw_text on its own.
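A response like the one described above could be unpacked as follows. This is only a sketch: the key names filename, simplified_text, and raw_text come from the description, but the groupings key and the exact nesting are assumptions, so inspect a real response before relying on them. The function name and the sample payload are made up for illustration.

```python
def summarize_response(status_code: int, payload: dict) -> dict:
    """Pull the filename, raw text, and groupings out of a parsed response.

    Assumes `simplified_text` is a dict holding `raw_text` and `groupings`;
    that layout is a guess based on the README, not a documented schema.
    """
    if status_code != 200:
        raise RuntimeError(f"OCR Server returned status {status_code}")
    simplified = payload.get("simplified_text") or {}
    return {
        "filename": payload.get("filename"),
        "raw_text": simplified.get("raw_text", ""),
        "groupings": simplified.get("groupings", []),
    }


# Invented payload, used only to show the call; real output may differ
example = {
    "filename": "image.jpg",
    "simplified_text": {
        "groupings": [["hello", "world"]],
        "raw_text": "hello world",
    },
}
print(summarize_response(200, example)["raw_text"])
```

With a live server, the call would be summarize_response(api_response.status_code, api_response.json()).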

Available OCR models

The OCR Server currently offers two models to choose from.

PaddleOCR: The PaddleOCR package provides access to a number of different OCR models. We currently support only the English PP-OCRv3 model, but support for other languages could be added if there is demand and suitable models exist.
Keras-OCR: The keras-ocr package (Keras OCR Documentation) first detects areas of an image that may contain text using the pretrained Character-Region Awareness For Text (CRAFT) text detection model, and then attempts to predict the text inside each area using Keras' implementation of a Convolutional Recurrent Neural Network (CRNN) model for text recognition. Once words are predicted, an algorithm we developed attempts to sort the text into likely groupings based on their locations within the original image.
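To give a sense of what location-based grouping can look like, here is a deliberately simple sketch that clusters words into lines by vertical position and then orders each line left to right. It is not the algorithm used by the OCR Server; the Word tuple layout, the function name, and the tolerance parameter are all assumptions made for this example.

```python
from typing import List, Tuple

# (text, x_left, y_center, height) in image pixels; layout assumed for this sketch
Word = Tuple[str, float, float, float]


def group_into_lines(words: List[Word], tolerance: float = 0.5) -> List[List[str]]:
    """Group words whose vertical centers are close, then sort each group by x."""
    lines: List[List[Word]] = []
    # Walk through the words from top to bottom
    for word in sorted(words, key=lambda w: w[2]):
        _, _, y, height = word
        if lines:
            last = lines[-1][-1]
            # Same line if the centers differ by less than a fraction of the height
            if abs(y - last[2]) <= tolerance * max(height, last[3]):
                lines[-1].append(word)
                continue
        lines.append([word])
    # Within each line, read left to right
    return [[w[0] for w in sorted(line, key=lambda w: w[1])] for line in lines]


words = [("world", 60, 10, 12), ("hello", 5, 11, 12), ("bye", 5, 40, 12)]
print(group_into_lines(words))  # [['hello', 'world'], ['bye']]
```

A real implementation would likely also consider horizontal gaps and box overlap, so that separate columns or captions on the same row are not merged.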

Helpful Docker commands

View container logs: docker container logs container_name
Stop running container: docker stop container_name
Start stopped container: docker start container_name
Connect to container command line: docker exec -it container_name /bin/bash
Remove container: docker container rm container_name
- Useful to remove then recreate with new parameters (e.g. port mappings)
Remove image: docker image rm image_name:image_tag
- Useful if you need to change Dockerfile and rebuild
- Note: must also remove any containers dependent on image; you could alternately create a new image with a different name:tag
Copy files into container: docker cp path/to/file container_name:/app/path/to/desired/directory/
- Can update and change files (e.g. config.py or other configuration files)
- Note: may require restarting the container to take effect
