The objective of this project is to develop OCR technology for Indian languages that enables new applications and opens up opportunities built on it.
Key Objectives of the Project:
- For printed OCR: Develop robust recognizers that can recognize printed text in scanned documents, meeting the accuracy goals for the following languages: Sanskrit, Marathi, Tamil, Telugu, Kannada, Malayalam.
- For handwritten OCR: Develop robust recognizers that can recognize handwritten text (offline), meeting the accuracy goals for the following languages: Sanskrit, Hindi, Marathi, Tamil, Telugu, Kannada, Malayalam.
- Develop an application that demonstrates the utility and performance of the OCR for the above-mentioned languages. This will leverage and be built around OpenOCRCorrect.
- Develop an application that healthcare institutions can use to digitise their printed patient medical records. This will support the ongoing Ayushman Bharat Digital Mission.
OCR for Indian languages is quite challenging because of their richness in inflections and because of variations such as differences in scripts and the many combinations of conjunct characters for similar-sounding words. Even with good OCR accuracy, the detected text will need considerable improvement, so the second step in digitizing such texts is spelling and error correction. The end goal is to bring the generated OCR text into agreement with the scanned images of the printed documents. We will use state-of-the-art (SOTA) deep neural network (DNN) object detection models such as Faster R-CNN and RetinaNet to train on and detect the page layout. With a human in the loop, we will provide tools and methods that correct erroneous OCR characters automatically and that let users correct them manually.

Example: OpenOCRCorrect tool
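To make the automatic-correction step a little more concrete, here is a minimal sketch of one common approach: suggesting replacements for an OCR word by edit distance against a known vocabulary. This is an illustration only and is not how OpenOCRCorrect works internally, which this page does not describe. The function names `edit_distance` and `suggest`, the `max_distance` threshold, and the sample vocabulary are all assumptions made for the example. Distances are computed over Unicode code points after NFC normalization, which is a simplification for Indic scripts, where a conjunct spans several code points.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def suggest(word: str, vocabulary: list[str], max_distance: int = 2) -> list[str]:
    """Return vocabulary entries within max_distance of word, closest first.

    Hypothetical helper: the threshold and ranking are illustrative choices.
    """
    word = unicodedata.normalize("NFC", word)
    scored = []
    for candidate in vocabulary:
        candidate = unicodedata.normalize("NFC", candidate)
        distance = edit_distance(word, candidate)
        if distance <= max_distance:
            scored.append((distance, candidate))
    scored.sort()
    return [candidate for _, candidate in scored]


if __name__ == "__main__":
    # Illustrative vocabulary; a real system would use a large lexicon.
    vocabulary = ["भारत", "भारती", "नगर"]
    print(suggest("भरत", vocabulary))
```

In a human-in-the-loop setting, a ranked list like this would typically be shown to the user as suggestions rather than applied blindly, since the closest vocabulary word is not always the word on the page.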
Team:
- Prof Ganesh Ramakrishnan, Principal Investigator
- Prof Parag Chaudhuri, Principal Investigator
- Dr Venkatapathy Subramanian, Principal Scientist
Students:
We're Hiring! Check HERE
Resources:
Presentation about the Project
Contact:
For any queries about the project:
- Mail to: venkat.s.iyer@gmail.com