OCR Preprocessing – Finding ROI

OCR is short for Optical Character Recognition, the technology used to identify text in images. As with any other computer vision project, the first thing you need in order to carry out OCR is a clean image of the text document. The lighting conditions and the resolution of the text in the image play a big role in carrying out OCR successfully. Once you have that, you need to break down the complete process into multiple parts and then solve them independently. I have described below the steps needed to carry out OCR successfully.

Steps to be done for OCR

  1. Finding Region of Interest (ROI)
  2. Cleanup / rotate / transform the located document
  3. Segmentation of relevant area of image
  4. Carry out OCR

Object (Hands) detection and tracking in video – Multiple approach comparison

There are various uses for being able to identify and locate an object (here, hands) in an image. For example, if we can reliably detect and localize hands in an image (and in video), we can use this for gesture recognition and trigger different operations based on the recognized gestures. Some of the oldest working applications of this kind of technology that I can recall are PS3 and MS Kinect based games. The PS3 used a camera and movement controllers, whereas Kinect did not use any movement controller and instead carried out skeletal tracking of the body itself.

Although we can apply an object detection algorithm to still images, object recognition is only really useful if it is fast enough to work on real-time video input. Besides being fast, the algorithm needs to work for different users, different locations and different lighting conditions. In the section that follows I will discuss the different options available and which of them are useful based on the criteria defined above.

Detecting lines in image with OpenCV – Hough Line Transform

There are times when you need to find straight lines in an image. This can be done using OpenCV’s built-in implementation of the Hough Line Transform.

It is very interesting to see how the Hough Line Transform actually works. Assume (as displayed in the figure below) we have a line segment AB. In the Cartesian coordinate system, the line can be represented as y = mx + c. In the polar coordinate system, the same line is described by r = x cosθ + y sinθ, where r is the perpendicular distance from the origin to the line and θ is the angle of that perpendicular. Solving for y gives y = (−cosθ/sinθ)x + (r/sinθ).
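To make the relationship between the two representations concrete, here is a minimal sketch in plain Python (no OpenCV needed). It assumes the normal form r = x cosθ + y sinθ, and the function names `points_to_polar` and `polar_to_slope_intercept` are hypothetical helpers made up for this illustration.

```python
import math


def points_to_polar(p1, p2):
    """Return (r, theta) of the infinite line through two distinct points.

    Assumes the normal form r = x*cos(theta) + y*sin(theta).
    """
    (x1, y1), (x2, y2) = p1, p2
    # The normal of the line is perpendicular to the direction (dx, dy).
    theta = math.atan2(x2 - x1, -(y2 - y1))
    r = x1 * math.cos(theta) + y1 * math.sin(theta)
    if r < 0:  # keep r non-negative by flipping the normal
        r, theta = -r, theta + math.pi
    return r, theta


def polar_to_slope_intercept(r, theta):
    """Convert (r, theta) to (m, c) so that y = m*x + c.

    Vertical lines have sin(theta) == 0 and no finite slope, so they are
    rejected here. This is one reason the polar form is convenient.
    """
    sin_t = math.sin(theta)
    if abs(sin_t) < 1e-12:
        raise ValueError("vertical line: slope is undefined")
    return -math.cos(theta) / sin_t, r / sin_t


# Segment AB from (0, 2) to (2, 0) lies on the line y = -x + 2.
r, theta = points_to_polar((0, 2), (2, 0))
m, c = polar_to_slope_intercept(r, theta)
```

For this segment the sketch gives θ = π/4 and r = √2, which converts back to a slope of −1 and an intercept of 2. OpenCV's `cv2.HoughLines` reports each detected line as such a (rho, theta) pair.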

Displaying progress bar when playing video with OpenCV

When working with videos in OpenCV, there is no easy way to tell how long the video is going to be while it plays. Likewise, if you are planning to post the video on social media such as Instagram, which does not show a video progress bar, you may want to embed a progress bar in the video itself. There is no built-in feature for this in OpenCV, so we shall write a function that achieves the same.
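One possible shape for such a function is sketched below. This is not the post's own implementation: the helper names `progress_bar_corners` and `draw_progress_bar`, the bar height, and the colour are all assumptions chosen for illustration. The geometry is kept separate from the drawing so it can be checked without OpenCV installed.

```python
def progress_bar_corners(frame_idx, total_frames, frame_width, frame_height,
                         bar_height=10):
    """Return (top_left, bottom_right) pixel corners of the filled bar.

    The bar sits along the bottom edge of the frame and its width grows
    linearly with the fraction of the video already played.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    fraction = min(max(frame_idx / total_frames, 0.0), 1.0)
    bar_width = int(frame_width * fraction)
    top_left = (0, frame_height - bar_height)
    bottom_right = (bar_width, frame_height - 1)
    return top_left, bottom_right


def draw_progress_bar(frame, frame_idx, total_frames, color=(0, 255, 0)):
    """Draw the bar onto a BGR frame in place and return the frame."""
    import cv2  # imported here so the geometry helper works without OpenCV

    height, width = frame.shape[:2]
    top_left, bottom_right = progress_bar_corners(
        frame_idx, total_frames, width, height)
    # A thickness of -1 asks cv2.rectangle for a filled rectangle.
    cv2.rectangle(frame, top_left, bottom_right, color, -1)
    return frame


# Halfway through a 100-frame, 640x480 video the bar spans half the width.
corners = progress_bar_corners(50, 100, 640, 480)
```

In a playback loop, `total_frames` could be read once with `cap.get(cv2.CAP_PROP_FRAME_COUNT)`, assuming the container reports an accurate frame count, and `draw_progress_bar` called on each frame before it is shown or written.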

Save Video with OpenCV Python

OpenCV provides a lot of useful functions for working with images and video files. One such operation is saving video. Although saving video with OpenCV is a straightforward operation, it can become very time-consuming to debug because of some silly mistakes. When something is wrong, the frames do not get added to the video file, and the resulting file does not play. For example, if the extension of the saved video file does not match the encoding we are using, the video file does not get written. Similarly, if there is an issue with the fps or the frame size, the frames are not written and the file does not work.
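A minimal sketch of a defensive writer is shown below. One common source of the frame-size mistake is that a NumPy image shape is ordered (height, width, channels), while `cv2.VideoWriter` expects (width, height). The helper names `writer_frame_size` and `save_video` are hypothetical, and the pairing of the "mp4v" codec with a .mp4 file is an assumption that depends on the local OpenCV build.

```python
def writer_frame_size(frame_shape):
    """Convert a NumPy image shape (height, width[, channels]) to the
    (width, height) order that cv2.VideoWriter expects."""
    height, width = frame_shape[:2]
    return (width, height)


def save_video(frames, path, fps=30.0, codec="mp4v"):
    """Write an iterable of equally sized BGR frames to `path`.

    Assumption: the codec/extension pair (for example "mp4v" with .mp4)
    is supported by the local OpenCV build; this varies by platform.
    """
    import cv2  # imported here so the shape helper works without OpenCV

    writer = None
    size = None
    try:
        for frame in frames:
            if writer is None:
                size = writer_frame_size(frame.shape)
                fourcc = cv2.VideoWriter_fourcc(*codec)
                writer = cv2.VideoWriter(path, fourcc, fps, size)
                if not writer.isOpened():
                    raise IOError(f"could not open {path!r} with codec {codec!r}")
            # Fail loudly instead of producing a file that will not play.
            if writer_frame_size(frame.shape) != size:
                raise ValueError("frame size differs from the writer's frame size")
            writer.write(frame)
    finally:
        if writer is not None:
            writer.release()


# A 480-row by 640-column frame needs a writer created with (640, 480).
size_for_writer = writer_frame_size((480, 640, 3))
```

Checking `isOpened()` and the frame size up front turns a silently broken output file into an immediate error, which is usually easier to debug.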

Finding difference between multiple images using OpenCV and Python

OpenCV is a very powerful open-source image processing library. It can be used to carry out various operations on images. Today we are going to explore how we can use OpenCV to find and highlight the differences between two images.

This is a very powerful technique with many uses, including security surveillance (finding the difference between subsequent frames from a security camera), hardware monitoring and maintenance forecasting, and software testing (taking snapshots of pages and comparing them for changes over time or after a new release). Its applicability is limited only by your imagination.
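The core idea can be sketched in plain Python on tiny grayscale images stored as lists of rows. This is only an illustration of the principle, not the approach the full post necessarily takes: the function names and the threshold of 30 are assumptions. On real images, the corresponding OpenCV calls would roughly be `cv2.absdiff`, `cv2.threshold` and `cv2.boundingRect`.

```python
def diff_mask(image_a, image_b, threshold=30):
    """Return a 0/1 mask marking pixels whose absolute difference
    exceeds `threshold`. Both images are lists of rows of ints."""
    same_size = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b))
    if not same_size:
        raise ValueError("images must have the same size")
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]


def changed_region(mask):
    """Return the bounding box (x_min, y_min, x_max, y_max) of all
    changed pixels, or None if nothing changed."""
    coords = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


# Two 4x4 images that differ in two pixels.
before = [[0] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[1][2] = 200
after[2][3] = 200

mask = diff_mask(before, after)
box = changed_region(mask)
```

The bounding box is what would be drawn on the image to highlight the difference. The threshold keeps small changes, such as sensor noise or compression artifacts, from being reported.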

Video basics with OpenCV

Since we will be working a lot with videos in computer vision, it makes sense to understand some video basics.

Some things to cover:
1. What videos are
2. How to load and play videos
3. How to find the fps, the number of frames and the duration of a video
4. How to put text on videos
5. How to draw shapes on videos
6. How to start a video at a particular position (time)

There is more to know about videos, but for now we can start with these.
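As a small taste of items 3 and 6, the arithmetic behind duration and seeking can be sketched as follows. The helper names `duration_seconds`, `frame_index_at` and `open_at` are hypothetical. The OpenCV properties used (`cv2.CAP_PROP_FPS`, `cv2.CAP_PROP_POS_FRAMES`) are real, but how accurately a given file reports or honours them is an assumption.

```python
def duration_seconds(frame_count, fps):
    """Duration of a video in seconds: number of frames divided by fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def frame_index_at(seconds, fps):
    """Index of the frame shown at a given time offset."""
    return int(seconds * fps)


def open_at(path, seconds):
    """Open a video and position it at `seconds` from the start."""
    import cv2  # imported here so the arithmetic helpers work without OpenCV

    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        raise IOError(f"could not open {path!r}")
    fps = cap.get(cv2.CAP_PROP_FPS)
    # Seeking by frame index; the next cap.read() returns that frame.
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index_at(seconds, fps))
    return cap


# A 300-frame clip at 30 fps lasts 10 seconds; 2.5 s in is frame 75.
clip_length = duration_seconds(300, 30)
start_frame = frame_index_at(2.5, 30)
```

With a real file, the frame count would come from `cap.get(cv2.CAP_PROP_FRAME_COUNT)`.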
