Text Extraction from Book Cover Using MSER

7 Pages. Posted: 14 Jun 2019

Kushan Mehta

CSE, DEPSTAR, CHARUSAT, CHANGA, ANAND

Jay Patel

CSE, DEPSTAR, CHARUSAT, CHANGA, ANAND

Nilesh Dubey

CSE, DEPSTAR, CHARUSAT, CHANGA, ANAND

Date Written: February 24, 2019

Abstract

Detecting text in natural images is an active area of research. In this paper, we propose a text detection and extraction pipeline that uses computer vision to obtain information about a particular book. Features of the book, such as its reviews, rating, and price, can then be displayed to the end user, helping people make an informed decision about a book before they spend time reading it. The text detection algorithm uses edge-enhanced Maximally Stable Extremal Regions (MSER) to identify text-blob segments, followed by several non-text filtering algorithms to find the bounding boxes. These bounding boxes are then chained together and passed to the Tesseract engine for OCR. The extracted text is further improved by NLP post-processing techniques such as domain-based OCR and typo correction. The proposed method extends to other use cases that involve detecting text in natural images.
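
Two of the later pipeline stages, chaining bounding boxes and correcting typos in the OCR output, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `chain_boxes` and `correct_tokens`, the `(x, y, width, height)` box format, the gap, overlap, and similarity thresholds, and the use of Python's `difflib` for fuzzy matching against a small domain vocabulary are all assumptions made for illustration. MSER detection and the Tesseract call are deliberately left out.

```python
from difflib import get_close_matches
from typing import List, Sequence, Tuple

# Assumed box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def chain_boxes(boxes: Sequence[Box], max_gap: int = 10,
                min_overlap: float = 0.5) -> List[Box]:
    """Merge character-level boxes into line-level boxes, left to right.

    A box joins a chain when the horizontal gap to the chain is at most
    `max_gap` pixels and the vertical overlap covers at least `min_overlap`
    of the shorter height. Both thresholds are illustrative assumptions.
    """
    chains: List[Box] = []
    for x, y, w, h in sorted(boxes):
        for i, (cx, cy, cw, ch) in enumerate(chains):
            gap = x - (cx + cw)
            overlap = min(y + h, cy + ch) - max(y, cy)
            if gap <= max_gap and overlap >= min_overlap * min(h, ch):
                # Grow the chain to the union of both rectangles.
                top = min(y, cy)
                bottom = max(y + h, cy + ch)
                right = max(x + w, cx + cw)
                chains[i] = (cx, top, right - cx, bottom - top)
                break
        else:
            # No existing chain is close enough, so start a new one.
            chains.append((x, y, w, h))
    return chains


def correct_tokens(tokens: Sequence[str], vocabulary: Sequence[str],
                   cutoff: float = 0.8) -> List[str]:
    """Replace each OCR token with its closest vocabulary word, if any.

    Matching is case-insensitive. Tokens with no vocabulary word at or above
    the `cutoff` similarity are returned unchanged.
    """
    lowered = {word.lower(): word for word in vocabulary}
    fixed: List[str] = []
    for token in tokens:
        match = get_close_matches(token.lower(), list(lowered), n=1, cutoff=cutoff)
        fixed.append(lowered[match[0]] if match else token)
    return fixed
```

In a full pipeline, the boxes would come from the region detector after non-text filtering, each chained box would be cropped and sent to the OCR engine, and the vocabulary would be drawn from the target domain, for example a catalogue of known book titles and author names.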

Suggested Citation

Mehta, Kushan and Patel, Jay and Dubey, Nilesh, Text Extraction from Book Cover Using MSER (February 24, 2019). Proceedings of International Conference on Sustainable Computing in Science, Technology and Management (SUSCOM), Amity University Rajasthan, Jaipur - India, February 26-28, 2019, Available at SSRN: https://ssrn.com/abstract=3358207 or http://dx.doi.org/10.2139/ssrn.3358207

Kushan Mehta (Contact Author)

CSE, DEPSTAR, CHARUSAT, CHANGA, ANAND
