Image to text python

Question

asked Feb 4, 2021 in Python by laddulakshana (16.4k points)

I'm using Python 3.x and the following code to convert an image to text:

from PIL import Image
from pytesseract import image_to_string
image = Image.open('image.png', mode='r')
print(image_to_string(image))

I'm getting this error:

Traceback (most recent call last):
File "C:/Users/hp/Desktop/GII/Image_to_text.py", line 12, in <module>
print(image_to_string(image))
File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
config=config)
File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
stderr=subprocess.PIPE)
File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
restore_signals, start_new_session)
File "C:\Users\hp\Downloads\WinPython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Please note that I have placed the image in the same directory as my Python file. Also, the error is not raised on image = Image.open('image.png', mode='r'); it is raised on the line print(image_to_string(image)).

Can anyone help me find where I went wrong?

1 Answer

hari_sh · Answer 1 · 2021-02-04T10:09:38+0000

You need to have tesseract installed and accessible on your PATH.

pytesseract is simply a wrapper around subprocess.Popen, with the tesseract binary as the program to run. It does not perform any OCR itself, so if the binary cannot be found, Popen raises the FileNotFoundError shown in the traceback.

Relevant part of the source:
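To see why a missing binary surfaces as FileNotFoundError, here is a minimal sketch that uses only the standard library. The helper name try_launch and the made-up program name are assumptions for illustration; they are not part of pytesseract.

```python
import subprocess
import sys


def try_launch(command):
    """Return True if the program in command[0] could be started,
    False if the operating system could not find it."""
    try:
        proc = subprocess.Popen(command,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except FileNotFoundError:
        # Same exception type as in the question: the executable itself
        # is missing, not the image file passed to it.
        return False
    proc.communicate()  # wait for the process and drain its pipes
    return True


# A program name that should not exist on any machine (hypothetical).
print(try_launch(["no-such-binary-xyz-123", "--version"]))
# The running Python interpreter is known to exist, so this one starts.
print(try_launch([sys.executable, "-c", "pass"]))
```

If the first call reports that the program could not be started, that is the same situation pytesseract is in when tesseract is not installed or not on PATH.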

def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`
    returns the exit status of tesseract, as well as tesseract's stderr output
    '''
    command = [tesseract_cmd, input_filename, output_filename_base]
    if lang is not None:
        command += ['-l', lang]
    if boxes:
        command += ['batch.nochop', 'makebox']
    if config:
        command += shlex.split(config)
    proc = subprocess.Popen(command,
                            stderr=subprocess.PIPE)
    return (proc.wait(), proc.stderr.read())

Quoting another part of the source:

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'

So a quick way to change the tesseract path would be:

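import pytesseract
# tesseract_cmd is a global of the pytesseract.pytesseract submodule (pytesseract.py in the traceback),
# so it has to be set there for run_tesseract to see it
pytesseract.pytesseract.tesseract_cmd = "/absolute/path/to/tesseract"  # this only needs to be done once
pytesseract.image_to_string(image)  # image is the PIL image opened in the question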
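Before hard-coding a path, it may help to check whether the binary can be found at all. The sketch below uses only the standard library. The helper name resolve_tesseract and the fallback location are assumptions for illustration (the Windows path is a common install location, not something stated in the question), so adjust them to your machine.

```python
import os
import shutil


def resolve_tesseract(name="tesseract", fallbacks=()):
    """Return a usable path to the binary, or None if it cannot be found.

    PATH is searched first (what Popen relies on for a bare command name),
    then each explicit fallback location is tried in order.
    """
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical install location; replace with wherever tesseract lives on your system.
FALLBACKS = [r"C:\Program Files\Tesseract-OCR\tesseract.exe"]

path = resolve_tesseract(fallbacks=FALLBACKS)
if path is None:
    print("tesseract not found: install it or add its folder to PATH")
else:
    print("tesseract found at", path)
```

If a path is returned, it can be assigned to pytesseract.pytesseract.tesseract_cmd before calling image_to_string. If None is returned, tesseract most likely still needs to be installed.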
