Visual Speech recognition for Sinhala language using CNN

dc.contributor.author Jayarathne, W.M.U.
dc.contributor.author Perera, W.A.S.C.
dc.contributor.author Ketheesan, T.
dc.date.accessioned 2022-05-17T06:57:14Z
dc.date.available 2022-05-17T06:57:14Z
dc.date.issued 17-02-21
dc.identifier.issn 1391-8796
dc.identifier.uri http://drr.vau.ac.lk/handle/123456789/107
dc.description.abstract Visual Speech Recognition (VSR) is an essential tool that helps visually impaired people understand speech from video. VSR also plays an important role in analyzing CCTV footage for crime investigations where audio is not available. However, VSR for the Sinhala language is still under research and has not been explored extensively. This preliminary study therefore examines the suitability of a convolutional neural network (CNN) for recognizing Sinhala characters from images containing the mouth region. The proposed methodology trains the CNN on lip pose features and their corresponding character labels. The CNN architecture employs three convolution layers, two fully connected layers and one max pool layer. No data set is publicly available for Sinhala visual speech recognition, so a data set was created for evaluating the system, covering five Sinhala characters with the phonetic sounds a, e, i, l, m. The data set was augmented to increase the feature domain, and outliers were removed to reduce ambiguity. The system was trained with fifteen images and tested with ten images showing the lip pose while the five sounds are pronounced. For evaluation, the confusion matrix was analyzed and performance was measured by a score calculated from precision and recall. The score was 0.83, which indicates that the proposed methodology performs well. en_US
dc.language.iso en en_US
dc.publisher Faculty of Science, University of Ruhuna en_US
dc.subject CNN en_US
dc.subject Sinhala en_US
dc.subject Character en_US
dc.subject Visual en_US
dc.title Visual Speech recognition for Sinhala language using CNN en_US
dc.type Conference paper en_US
dc.identifier.proceedings 8th Ruhuna International Science & Technology Conference (RISTCON 2021) en_US
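
The abstract reports a score calculated from precision and recall on a confusion matrix, without giving the formula. A minimal sketch of one common way to do this is shown below, assuming the score is a macro-averaged F1 (the harmonic mean of precision and recall, averaged over the five classes). The function names and the example matrix are hypothetical and for illustration only; the matrix is not the paper's data and the sketch is not necessarily the authors' method.

```python
from typing import Dict, List

# The five phonetic sounds named in the abstract.
LABELS = ["a", "e", "i", "l", "m"]


def per_class_scores(matrix: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall and F1 per class.

    matrix[r][c] counts samples whose true class is labels[r]
    and whose predicted class is labels[c].
    """
    n = len(labels)
    scores = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(matrix[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_f1(matrix: List[List[int]], labels: List[str]) -> float:
    """Unweighted mean of the per-class F1 values (an assumed averaging choice)."""
    scores = per_class_scores(matrix, labels)
    return sum(s["f1"] for s in scores.values()) / len(labels)


# Hypothetical confusion matrix for ten test images (two per sound).
# These counts are made up for illustration and are NOT the paper's results.
EXAMPLE_MATRIX = [
    [2, 0, 0, 0, 0],  # true "a"
    [0, 2, 0, 0, 0],  # true "e"
    [0, 1, 1, 0, 0],  # true "i": one image misread as "e"
    [0, 0, 0, 2, 0],  # true "l"
    [0, 0, 0, 0, 2],  # true "m"
]

if __name__ == "__main__":
    for label, s in per_class_scores(EXAMPLE_MATRIX, LABELS).items():
        print(label, s)
    print("macro F1:", macro_f1(EXAMPLE_MATRIX, LABELS))
```

Macro averaging weights each sound equally, which suits a small balanced test set; a micro or weighted average would be an equally plausible reading of the abstract.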

