Paragraph Detection in an Image with Python and OpenCV

This is a classic use for cv2.dilate(). When you want to connect nearby items, you can dilate them so that several separate items merge into a single one. Here’s a simple approach:

  • Convert the image to grayscale and apply a Gaussian blur
  • Apply Otsu's threshold (inverted, so the text becomes white on black)
  • Dilate to connect adjacent words together
  • Find contours and draw a bounding box around each one
    import cv2
    import numpy as np

    # Load image, convert to grayscale, blur, and apply Otsu's threshold
    image = cv2.imread('test.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (7,7), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Dilate with a rectangular kernel to merge adjacent words into blobs
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
    dilate = cv2.dilate(thresh, kernel, iterations=4)

    # Find external contours (the return value differs between OpenCV versions)
    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]

    # Draw a bounding box around each merged blob
    for c in cnts:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)

    cv2.imshow('th', thresh)
    cv2.imshow('dilated', dilate)
    cv2.imshow('image', image)
    cv2.waitKey()

[Image: Otsu's threshold]

Here’s where the magic happens. We can assume that a paragraph is a group of words that sit close together, so we dilate the thresholded image until adjacent words merge into a single blob.

[Image: dilated image]

Result

[Image: result with bounding boxes drawn]
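If you want to do something with the detected regions afterwards, note that the contours do not necessarily come back in reading order, and small specks can produce tiny boxes. A possible post-processing sketch follows. The function name order_paragraph_boxes and the min_area value are assumptions for illustration, not part of OpenCV; the input is a list of (x, y, w, h) tuples as returned by cv2.boundingRect.

```python
def order_paragraph_boxes(boxes, min_area=500):
    """Drop tiny boxes and sort the rest top-to-bottom, then left-to-right.

    boxes: iterable of (x, y, w, h) tuples, as returned by cv2.boundingRect.
    min_area: hypothetical noise threshold in pixels; tune it for your images.
    """
    kept = [b for b in boxes if b[2] * b[3] >= min_area]
    return sorted(kept, key=lambda b: (b[1], b[0]))

# Made-up boxes: two paragraph-sized regions and one speck of noise.
boxes = [(20, 300, 400, 120), (20, 40, 400, 200), (5, 5, 4, 3)]
ordered = order_paragraph_boxes(boxes)
print(ordered)  # [(20, 40, 400, 200), (20, 300, 400, 120)]
```

Sorting by the top-left corner is only reasonable for single-column layouts; multi-column pages would need a different ordering rule.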