OCR Scanner Text from Image in python

introduction

Optical Character Recognition (OCR) is a powerful technology that enables computers to recognize and extract text from images, scanned documents, or handwritten notes. This technology is widely used in digitizing printed or handwritten documents, automating data entry processes, reading license plates, assisting the visually impaired, and even translating text from photographs.

In this project, we build an interactive OCR Scanner Text from Image in python with a Graphical User Interface (GUI) using Python’s built-in Tkinter library. The core of the OCR functionality is powered by Tesseract OCR, an open-source OCR engine developed by Google. We use the pytesseract Python wrapper to interface with the Tesseract engine seamlessly.

The GUI allows users to select an image from their computer using a file dialog, displays the selected image within the window, and automatically extracts and displays the text content in a scrollable textbox. This makes the tool highly user-friendly and interactive, especially for non-technical users who prefer graphical tools over command-line scripts.

Unlike console applications or web-based tools that require complex setups, this GUI-based OCR tool is lightweight, runs locally, and doesn’t require an internet connection once dependencies are installed. It’s particularly useful for educators, researchers, students, and professionals who need to extract text from screenshots, scanned notes, book pages, or forms.

Project Objectives:

To create a simple, intuitive GUI application for OCR.
To allow users to select and preview images easily.
To extract text content from the image using Tesseract OCR.
To display extracted text in a readable, editable textbox.

Technologies Used:

Library	Purpose
`Tkinter`	GUI development (native Python library)
`Pillow (PIL)`	Handling and resizing image files
`pytesseract`	Python wrapper for Tesseract OCR engine
`Tesseract OCR`	Actual OCR engine for extracting text

Key Features:

Image selection via file dialog
Image preview inside the app window
Real-time text extraction using OCR
Editable text output in a scrollable textbox
Error handling for invalid or unreadable images

Target Audience:

Students and researchers who digitize handwritten or printed notes.
Developers learning GUI and OCR integration in Python.
Office professionals needing quick text extraction tools.
Anyone looking for a desktop-based image-to-text converter.

Learning Outcomes:

How to build GUI applications using Tkinter
How to use pytesseract for text recognition from images
How to integrate image selection, display, and result output in a GUI
Understanding the basics of OCR and its use cases

steps to create OCR scanner text from image in python

Step 1: Install Required Libraries

You need to install the following Python libraries if they are not already installed:

!pip install pytesseract pillow

Also, make sure Tesseract OCR engine is installed on your system:

Windows: Download and install from https://github.com/tesseract-ocr/tesseract
Linux/macOS: Install using sudo apt install tesseract-ocr

Optionally, set the Tesseract executable path in your code:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

Step 2: Import Required Modules

Import the necessary Python libraries:

import tkinter as tk
from tkinter import filedialog, messagebox
from PIL import Image, ImageTk
import pytesseract

Step 3: Define the Image Selection and OCR Function

Create a function to select an image, display it, and extract text using pytesseract:

def select_image():
    filepath = filedialog.askopenfilename(filetypes=[("Image Files", "*.png *.jpg *.jpeg *.bmp")])
    if filepath:
        img = Image.open(filepath)
        img.thumbnail((300, 300))
        img_tk = ImageTk.PhotoImage(img)
        panel.config(image=img_tk)
        panel.image = img_tk
        result.delete("1.0", tk.END)
        try:
            text = pytesseract.image_to_string(img)
            result.insert(tk.END, text)
        except:
            messagebox.showerror("OCR Failed", "Text could not be extracted.")

Step 4: Create the Main GUI Window

Set up the root window for your application:

root = tk.Tk()
root.title("OCR Scanner (Image to Text)")
root.geometry("400x500")
root.resizable(False, False)

Step 5: Add GUI Widgets

Add a button to trigger the image selection, a label to display the image, and a text box to show the extracted text:

btn = tk.Button(root, text="Select Image", command=select_image, bg="#4CAF50", fg="white", padx=10, pady=5)
btn.pack(pady=10)

panel = tk.Label(root)
panel.pack()

result = tk.Text(root, height=10, wrap=tk.WORD, padx=10, pady=10)
result.pack(pady=10, padx=10, fill=tk.BOTH)

Step 6: Run the Application

Start the Tkinter event loop to make the GUI responsive:

root.mainloop()

code explanation

This Python script builds a Graphical User Interface (GUI) that:

Allows the user to select an image file (JPG, PNG, etc.)
Uses Tesseract OCR (via pytesseract) to extract text from the image
Displays the extracted text inside the GUI

1. Install the Required Package

!pip install pytesseract

Installs pytesseract, the Python wrapper for Google’s Tesseract OCR engine.
This only needs to be run once (in Jupyter).

2. Import Required Libraries

import tkinter as tk
from tkinter import filedialog, messagebox
from PIL import Image, ImageTk
import pytesseract

tkinter is used to build the GUI.
filedialog lets users select files from their system.
messagebox shows pop-up error alerts.
PIL.Image opens image files.
ImageTk converts PIL images to a Tkinter-compatible format.
pytesseract performs Optical Character Recognition on images.

3. (Optional) Link Tesseract Path

# pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

On Windows, Tesseract needs to be manually installed.
This line tells pytesseract where to find tesseract.exe.

4. Function to Select Image and Extract Text

def select_image():
    filepath = filedialog.askopenfilename(filetypes=[("Image Files", "*.png *.jpg *.jpeg *.bmp")])

Opens a dialog to select image files only.

    if filepath:
        img = Image.open(filepath)             # Open the selected image
        img.thumbnail((300, 300))              # Resize to fit GUI
        img_tk = ImageTk.PhotoImage(img)       # Convert to Tkinter image format
        panel.config(image=img_tk)             # Display image in label
        panel.image = img_tk                   # Prevent image from being garbage-collected

        result.delete("1.0", tk.END)           # Clear previous text

        try:
            text = pytesseract.image_to_string(img)  # Perform OCR
            result.insert(tk.END, text)              # Show text in text box

        except:
            messagebox.showerror("OCR Failed", "Text could not be extracted.")

If OCR fails (e.g., blurry image or corrupted file), show an error message.

5. Create GUI Main Window

root = tk.Tk()
root.title("OCR Scanner (Image to Text)")
root.geometry("400x500")
root.resizable(False, False)

Initializes the main window.
Sets window title, fixed size (400×500 pixels), and disables resizing.

6. Add GUI Components

btn = tk.Button(root, text="Select Image", command=select_image, bg="#4CAF50", fg="white", padx=10, pady=5)
btn.pack(pady=10)

Button to trigger file selection.
Styled with green background (#4CAF50) and padding.

panel = tk.Label(root)
panel.pack()

Empty label to display the selected image.

result = tk.Text(root, height=10, wrap=tk.WORD, padx=10, pady=10)
result.pack(pady=10, padx=10, fill=tk.BOTH)

A multi-line textbox to show the extracted text.
Supports word-wrapping and fills available space.

7. Run the App

root.mainloop()

Starts the GUI event loop.
Keeps the window running and responsive until closed.

source code

				
					# First, install the required package
!pip install pytesseract

import tkinter as tk
from tkinter import filedialog, messagebox
from PIL import Image, ImageTk
import pytesseract

# (Optional) Windows path for tesseract executable
# pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

def select_image():
    filepath = filedialog.askopenfilename(filetypes=[("Image Files", "*.png *.jpg *.jpeg *.bmp")])
    if filepath:
        img = Image.open(filepath)
        img.thumbnail((300, 300))
        img_tk = ImageTk.PhotoImage(img)
        panel.config(image=img_tk)
        panel.image = img_tk  # Keep reference
        result.delete("1.0", tk.END)
        try:
            text = pytesseract.image_to_string(img)
            result.insert(tk.END, text)
        except:
            messagebox.showerror("OCR Failed", "Text could not be extracted.")

# Create GUI Window
root = tk.Tk()
root.title("OCR Scanner (Image to Text)")
root.geometry("400x500")
root.resizable(False, False)

# GUI Components
btn = tk.Button(root, text="Select Image", command=select_image, bg="#4CAF50", fg="white", padx=10, pady=5)
btn.pack(pady=10)

panel = tk.Label(root)
panel.pack()

result = tk.Text(root, height=10, wrap=tk.WORD, padx=10, pady=10)
result.pack(pady=10, padx=10, fill=tk.BOTH)

root.mainloop()