OCR stands for optical character recognition, a method of converting images of text into machine-readable text that can be searched and analyzed.

Description:-

The advancement in the field of technology never ceases to amaze us. Not long ago, none of us could have predicted that machines like computers, tablets, and mobile phones would read text for us. Technologies like Artificial Intelligence and Machine Learning have made it possible to train computers to read text accurately.

Our mobile phones are able to read text through a process called optical character recognition (OCR). This technology uses a camera to scan text and machine learning algorithms to recognize characters and words. The text can be recognized with high accuracy, and the recognized text can be converted into an editable format such as a Word document or a text file.

OCR technology is used in many applications such as passport recognition, bank check processing, automated data entry, scanning and indexing of documents, and much more.

This step-by-step guide shows how to use the Tesseract OCR library and the pytesseract wrapper to convert text in images into digital text in Python.

The Tesseract library contains an OCR engine and a command-line program. It is a standalone tool that is independent of Python and must be installed separately.

Note: Please follow the official Tesseract installation guide, as Tesseract is a required tool for this tutorial.

We will be using the pytesseract module, a Python wrapper for the Tesseract OCR engine, so that we can access the engine from Python.

Applications of Optical Character Recognition (OCR)

OCR has plenty of applications in today’s business. Some of them are listed below:

  • Automated data entry
  • Passport recognition at airports
  • Document archiving
  • License plate recognition
  • Extracting business card information into a contact list
  • Converting handwritten documents into electronic text
  • Creating searchable PDFs
  • Language translation
  • Creating audio files (text to audio)

How to use Tesseract OCR library and pytesseract wrapper for optical character recognition (OCR)

Step 1. Install the pytesseract library by running pip install pytesseract in the terminal of your Python IDE (Integrated Development Environment).

Step 2. Make a new Python file and import the necessary libraries.
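
Before importing anything, it can help to confirm that the pieces this tutorial depends on are actually present. The sketch below is not part of the original tutorial: the function name check_ocr_setup is hypothetical, and the use of OpenCV (cv2) is an assumption based on the later reference to image.shape. It only looks the modules up and does not import them.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which parts of the OCR toolchain can be located."""
    return {
        # The Python wrapper installed in Step 1.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # OpenCV is an assumption of this sketch (image loading and drawing).
        "cv2": importlib.util.find_spec("cv2") is not None,
        # The Tesseract command-line program, which is not a Python package.
        "tesseract": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, found in check_ocr_setup().items():
        print(f"{name}: {'found' if found else 'missing'}")

# If everything is found, the imports for the remaining steps would be:
#   import cv2
#   import pytesseract
```

If tesseract is reported as missing even though it is installed, the executable is probably not on your PATH, and its location has to be given to pytesseract explicitly.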

What is Tesseract Model?

Tesseract is an optical character recognition engine for various operating systems. It is free software released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development has been sponsored by Google since 2006.

Step 3. Download Tesseract from the link below.

https://tesseract-ocr.github.io/tessdoc/Downloads.html

Step 4. Now, tell pytesseract where the Tesseract executable is located by setting pytesseract.pytesseract.tesseract_cmd.
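
One way to do this is sketched below. The candidate paths are hypothetical examples of common install locations, not values taken from this tutorial, and the helper names find_tesseract and configure_pytesseract are made up for illustration. The only pytesseract feature used is the tesseract_cmd setting.

```python
import os
import shutil

# Hypothetical install locations; adjust them to match your machine.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/usr/local/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
]


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Return a path to the Tesseract executable, or None if none is found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(path):
    """Point pytesseract at the given Tesseract executable."""
    # Imported here so find_tesseract() works before pytesseract is installed.
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = path


# Typical use:
#   path = find_tesseract()
#   if path is not None:
#       configure_pytesseract(path)
```

If Tesseract is already on your PATH, pytesseract can usually find it without this setting, so the explicit path mostly matters on Windows or for non-standard installs.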

Step 5. If you Ctrl + left-click on pytesseract in the import statement, your IDE opens the module, where you can see the available functions for converting an image into text.

Step 6. First, we will see how to convert an image into a string with pytesseract's image_to_string function.
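
A minimal sketch of this step follows. It assumes the image is opened with Pillow (which pytesseract itself depends on). The helper names image_to_text and clean_ocr_text are hypothetical. The cleanup step is an addition of this sketch: some Tesseract versions append a form feed character and blank lines to the output.

```python
def clean_ocr_text(raw):
    """Remove form feeds, trailing spaces, and blank lines from OCR output."""
    lines = [line.rstrip() for line in raw.replace("\x0c", "").splitlines()]
    return "\n".join(line for line in lines if line)


def image_to_text(image_path):
    """Run Tesseract on an image file and return the cleaned text."""
    # Imported lazily so clean_ocr_text() can be used without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        raw = pytesseract.image_to_string(img)
    return clean_ocr_text(raw)


# The cleanup on its own, using a made-up raw string:
print(clean_ocr_text("Hello \n\nWorld\n\x0c"))
```

Calling image_to_text with the path to one of your own images should then return the recognized text as a single string.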

Step 7. Now, find the location of the text in the image.

Note: For a color image, image.shape holds three values: height, width, and the number of color channels.
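
The sketch below shows one plausible way to carry out this step. It assumes the image is loaded with OpenCV, so that it has a shape attribute, and that the locations come from pytesseract's image_to_boxes function. The helper names are hypothetical. A grayscale image has only two shape values, so that case is handled separately.

```python
def image_dimensions(shape):
    """Return (height, width, channels) from an image's shape tuple."""
    if len(shape) == 2:
        # Grayscale images have no channel axis.
        height, width = shape
        return height, width, 1
    height, width, channels = shape
    return height, width, channels


def locate_characters(image_path):
    """Return Tesseract's raw character boxes and the image height."""
    # Imported lazily so image_dimensions() works without OpenCV installed.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # OpenCV loads images as BGR, so convert to RGB before running OCR.
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    height, _width, _channels = image_dimensions(img.shape)
    return pytesseract.image_to_boxes(rgb), height


print(image_dimensions((480, 640, 3)))
```

The image height is returned together with the boxes because it is needed later to convert Tesseract's coordinates for drawing.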

Step 8. After this, iterate over the boxes with a for loop to find each character’s location in the image.
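
Assuming the boxes come from image_to_boxes, each line of the result describes one character as: the character, then its left, bottom, right, and top coordinates, then a page number. The loop can be sketched as below. The function name parse_boxes and the sample values are made up for illustration.

```python
def parse_boxes(box_text):
    """Parse box lines into (char, left, bottom, right, top) tuples."""
    boxes = []
    for line in box_text.splitlines():
        # Split from the right so the character itself is kept intact.
        parts = line.rsplit(" ", 5)
        if len(parts) != 6:
            continue  # Skip empty or malformed lines.
        char, left, bottom, right, top, _page = parts
        boxes.append((char, int(left), int(bottom), int(right), int(top)))
    return boxes


# Illustrative input in the box format, not real OCR output.
sample = "H 68 206 91 235 0\ni 96 206 101 235 0\n"
for box in parse_boxes(sample):
    print(box)
```

Converting the coordinates to integers at this point keeps the later drawing code simple.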

Step 9. Define the x and y position along with the width and height to draw a rectangle around each character.
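
One detail matters here. Tesseract's character boxes measure y from the bottom of the image, while OpenCV measures y from the top, so the y values have to be flipped using the image height. The sketch below assumes each box is a (char, left, bottom, right, top) tuple and that OpenCV is used for drawing. The helper names are hypothetical.

```python
def to_top_left_rect(box, image_height):
    """Convert a bottom-origin box to top-origin corner points."""
    _char, left, bottom, right, top = box
    top_left = (left, image_height - top)
    bottom_right = (right, image_height - bottom)
    return top_left, bottom_right


def draw_character_boxes(img, boxes):
    """Draw a thin red rectangle around every character box on img."""
    # Imported lazily so to_top_left_rect() works without OpenCV installed.
    import cv2

    image_height = img.shape[0]
    for box in boxes:
        top_left, bottom_right = to_top_left_rect(box, image_height)
        cv2.rectangle(img, top_left, bottom_right, (0, 0, 255), 1)
    return img


# A made-up box in an image that is 300 pixels tall.
print(to_top_left_rect(("H", 68, 206, 91, 235), 300))
```

The color (0, 0, 255) is red in OpenCV's BGR ordering, and the final argument is the line thickness in pixels.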

Step 10. Now, convert the image to data to find each word in the image. Also, append each recognized word to a “text” list.
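
A minimal sketch of this step, assuming pytesseract's image_to_data function is called with dictionary output. The result then contains parallel lists, including "text" and "conf" (confidence). The confidence threshold and the helper names are additions of this sketch, and the sample dictionary is made up.

```python
def extract_words(data, min_conf=0.0):
    """Collect non-empty words whose confidence is at least min_conf."""
    text = []
    for word, conf in zip(data["text"], data["conf"]):
        # Rows that are not words have empty text and a confidence of -1.
        if word.strip() and float(conf) >= min_conf:
            text.append(word)
    return text


def words_in_image(image_path, min_conf=0.0):
    """Run Tesseract on an image file and return the recognized words."""
    # Imported lazily so extract_words() can be used without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        data = pytesseract.image_to_data(
            img, output_type=pytesseract.Output.DICT
        )
    return extract_words(data, min_conf)


# Illustrative data in the same shape as the dictionary output.
sample = {"text": ["", "Hello", " ", "World"], "conf": ["-1", "96.5", "-1", 42]}
print(extract_words(sample))
```

The same dictionary also holds left, top, width, and height for every word, so the words can be boxed in the same way as the characters.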

Note: You can also convert these outputs into other languages for different countries.

Conclusion:-

With this, we have come to the end of this tutorial on Optical Character Recognition (OCR) in Python. The success of Optical Character Recognition in Python depends on a variety of factors including the quality of the input image, the accuracy of the recognition algorithm, and the complexity of the text being recognized. OCR in Python can be a powerful tool for automating text recognition tasks, and it can be used for a variety of applications. With the right tools, the right approach, and the right data, OCR in Python can be a useful tool for improving data accuracy and reducing the time needed for manual text recognition. Hope you were able to understand the tutorial.
