Document Type: Original Research Paper


1 Computer Dept, Technical & Engineering Faculty, Islamic Azad University Sanandaj Branch, Sanandaj, Iran

2 Computer Dept, Technical & Engineering Faculty, Islamic Azad University Mahabad Branch, Mahabad, Iran

3 Computer Dept, Technical & Engineering Faculty, Islamic Azad University Sanandaj Branch, Sanandaj, Iran


Speech has long been the most natural and efficient means of exchanging information between people. As technology has advanced and computers have become widespread, researchers have turned their attention to the design and construction of speech recognition systems. Speech recognition, including lip-reading techniques, faces many challenges; one of them is noise, which in some situations is the main cause of recognition errors. One way to address this problem is image processing. The aim of this study was to design and implement a system for the automatic recognition of Persian letters using image-processing techniques. After building a database of spoken Persian phonemes, we first applied image-processing techniques to remove noise and detect the lip contour, using edge detection to identify the edges of the lip. We then located the upper and lower points of the lip in five frames of each video and used the mean gap between these points as the feature of each phoneme. From these features we built a feature database and classified the phonemes with two artificial neural networks (ANNs): a back-propagation network and a radial basis function network. The classification achieved the desired results, and the back-propagation network classified the phonemes more accurately than the radial basis function network.
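The feature described in the abstract, the mean gap between the upper and lower lip points over five frames, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each frame has already been reduced to a binary edge map (1 marks an edge pixel) and that the lip points are taken in the middle column of the mouth region. The names `lip_gap`, `mean_lip_gap`, and `make_frame` are hypothetical, and the frames are synthetic.

```python
from statistics import mean


def lip_gap(edge_map, column=None):
    """Vertical distance between the topmost and bottommost edge pixels
    in one column of a binary edge map (assumption: the middle column
    crosses both the upper and the lower lip edge)."""
    if column is None:
        column = len(edge_map[0]) // 2
    rows = [r for r, row in enumerate(edge_map) if row[column]]
    if not rows:
        raise ValueError("no edge pixels found in the chosen column")
    return rows[-1] - rows[0]


def mean_lip_gap(edge_maps):
    """Average the lip gap over the frames of one phoneme video."""
    return mean(lip_gap(m) for m in edge_maps)


def make_frame(top, bottom, height=8, width=5):
    """Build a synthetic edge map with two horizontal edges
    (illustration only, not real lip data)."""
    frame = [[0] * width for _ in range(height)]
    for c in range(width):
        frame[top][c] = 1
        frame[bottom][c] = 1
    return frame


# Five synthetic frames standing in for the five frames of one video.
frames = [
    make_frame(2, 5),
    make_frame(1, 6),
    make_frame(2, 6),
    make_frame(1, 5),
    make_frame(2, 5),
]
feature = mean_lip_gap(frames)
print(feature)
```

In a full pipeline, a single value like `feature` would be computed per video and stored with its phoneme label before training a classifier.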


