Abstract:
Name boards are the most common visual aids on roadways, and they help with location identification. Name boards can be defaced by damage of various kinds, and when they are severely damaged, it can be difficult for a visitor to recognize them. This research aims to identify, locate, and recognize characters on name boards that use various font styles and contain partially visible letters. Because no dataset of name board images with partially visible letters was available, images of name boards in three languages (Tamil, English, and Sinhala) were collected around Jaffna. Only the English text displayed on the name boards is used to predict partial and whole characters in this work. First, the image is preprocessed with grayscale transformation and thresholding, followed by morphological transformations to localize the text sections. A connected component analysis then scans the binarized image and groups its pixels into components based on pixel connectivity, and each pixel is labeled with the value of the component it belongs to. In the next phase, the text lines are segmented using the skeleton analysis method. Each segmented character image is then converted into a string using the correlation coefficient and structural similarity index methods. A character recognizer is also built and trained on image samples of uppercase and lowercase letters. As a result, disfigured characters are identified by their most similar character image. Whereas other strategies, such as the automated text detection algorithm and the maximally stable extremal regions (MSER) feature detector, failed to predict missing characters on the name boards, the proposed method achieved an accuracy of 81.7%.
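To make the two central steps concrete, the following is a minimal sketch, not the implementation used in this work, of connected component labeling on a binarized image and of matching a partially visible character against reference images by the correlation coefficient. The function names (`label_components`, `correlation`, `best_match`), the use of 4-connectivity, and the tiny 5x3 glyph templates are all assumptions made for illustration; the structural similarity index is omitted for brevity.

```python
from collections import deque
from math import sqrt


def label_components(binary):
    """Label 4-connected foreground (1) regions of a 0/1 grid.

    Returns (labels, count), where labels holds 0 for background and
    1..count for each component. 4-connectivity is an assumption here.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def correlation(a, b):
    """Pearson correlation coefficient between two equally sized grids."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:  # a constant image carries no shape information
        return 0.0
    return cov / sqrt(vx * vy)


def best_match(glyph, templates):
    """Return the template name whose image correlates best with glyph."""
    return max(templates, key=lambda name: correlation(glyph, templates[name]))


# Hypothetical 5x3 reference glyphs (illustrative only).
TEMPLATES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
}

# An "L" with two pixels missing, standing in for a damaged letter.
DAMAGED_L = [[1, 0, 0], [0, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0]]

# A small binarized image containing two separate blobs.
BLOBS = [[1, 1, 0, 0, 1],
         [0, 1, 0, 0, 1],
         [0, 0, 0, 0, 0]]

if __name__ == "__main__":
    print(label_components(BLOBS)[1])        # 2
    print(best_match(DAMAGED_L, TEMPLATES))  # L
```

In this toy case the damaged glyph still correlates positively with the "L" template and negatively with the other two, which illustrates why a similarity score against whole-character references can recover a letter that is only partially visible.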