Visual Speech recognition for Sinhala language using CNN

Jayarathne, W.M.U.; Perera, W.A.S.C.; Ketheesan, T.

dc.contributor.author	Jayarathne, W.M.U.
dc.contributor.author	Perera, W.A.S.C.
dc.contributor.author	Ketheesan, T.
dc.date.accessioned	2022-05-17T06:57:14Z
dc.date.available	2022-05-17T06:57:14Z
dc.date.issued	2021-02-17
dc.identifier.issn	1391-8796
dc.identifier.uri	http://drr.vau.ac.lk/handle/123456789/107
dc.description.abstract	Visual Speech Recognition (VSR) is an essential tool that helps visually impaired people understand speech from video. Moreover, VSR plays an important role in analyzing CCTV footage in crime investigations where audio is not available. However, VSR for the Sinhala language is still under research and has not been widely explored. Hence, this preliminary study investigates the suitability of a convolutional neural network (CNN) for recognizing Sinhala characters from images that contain the mouth region. The proposed methodology trains the CNN on lip pose features and the corresponding character labels. The CNN architecture employs three convolution layers, two fully connected layers and one max pooling layer. No data set is publicly available for Sinhala visual speech recognition, so a data set was created for the evaluation, covering five Sinhala characters with the phonetic sounds a, e, i, l and m. The data set was augmented to enlarge the feature domain, and outliers were removed to reduce ambiguity. The system was trained with fifteen images and tested with ten images, each containing the lip pose produced when pronouncing one of the five sounds. For evaluation, the confusion matrix was analyzed and accuracy was measured by a score calculated from precision and recall. The score was 0.83, indicating that the proposed methodology performs well.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Science, University of Ruhuna	en_US
dc.subject	CNN	en_US
dc.subject	Sinhala	en_US
dc.subject	Character	en_US
dc.subject	Visual	en_US
dc.title	Visual Speech recognition for Sinhala language using CNN	en_US
dc.type	Conference abstract	en_US
dc.identifier.proceedings	8th Ruhuna International Science & Technology Conference (RISTCON 2021)	en_US
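
The abstract states that the score was derived from the confusion matrix using precision and recall, but does not name the exact formula. The sketch below assumes a macro-averaged F1 score, which is one common way to combine the two; the function names and the example matrix are hypothetical, and the counts are made up for illustration and are not the paper's results.

```python
def per_class_scores(confusion, labels):
    """Return {label: (precision, recall, f1)} for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    n = len(labels)
    scores = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])                           # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(confusion, labels):
    """Unweighted mean of the per-class F1 scores (an assumed choice)."""
    scores = per_class_scores(confusion, labels)
    return sum(f1 for _, _, f1 in scores.values()) / len(labels)


# The five sounds named in the abstract.
LABELS = ["a", "e", "i", "l", "m"]

# Made-up counts for ten test images (assumed two per class).
# These are NOT the paper's data.
EXAMPLE_CONFUSION = [
    [2, 0, 0, 0, 0],
    [0, 2, 0, 0, 0],
    [0, 1, 1, 0, 0],  # one "i" image misread as "e"
    [0, 0, 0, 2, 0],
    [0, 0, 0, 0, 2],
]

if __name__ == "__main__":
    for label, (p, r, f1) in per_class_scores(EXAMPLE_CONFUSION, LABELS).items():
        print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
    print(f"macro F1 = {macro_f1(EXAMPLE_CONFUSION, LABELS):.2f}")
```

Macro averaging weights each of the five characters equally, which suits a small balanced test set; a micro-averaged or weighted score would be an equally plausible reading of the abstract.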