Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System
Conference Proceeding
Gogate, M., Dashtipour, K., & Hussain, A. (2020)
Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System. In Proc. Interspeech 2020 (4521-4525). https://doi.org/10.21437/interspeech.2020-2935
In this paper, we present VIsual Speech In real nOisy eNvironments (VISION), a first-of-its-kind audio-visual (AV) corpus comprising 2500 utterances from 209 speakers, recorde...
Deep Neural Network Driven Binaural Audio Visual Speech Separation
Conference Proceeding
Gogate, M., Dashtipour, K., Bell, P., & Hussain, A. (2020)
Deep Neural Network Driven Binaural Audio Visual Speech Separation. In 2020 International Joint Conference on Neural Networks (IJCNN). https://doi.org/10.1109/ijcnn48605.2020.9207517
The central auditory pathway exploits the auditory signals and visual information sent by both ears and eyes to segregate speech from multiple competing noise sources and help...
Robust Visual Saliency Optimization Based on Bidirectional Markov Chains
Journal Article
Jiang, F., Kong, B., Li, J., Dashtipour, K., & Gogate, M. (2021)
Robust Visual Saliency Optimization Based on Bidirectional Markov Chains. Cognitive Computation, 13, 69-80. https://doi.org/10.1007/s12559-020-09724-6
Saliency detection aims to automatically highlight the most important area in an image. Traditional saliency detection methods based on absorbing Markov chain only take into a...
CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement
Journal Article
Gogate, M., Dashtipour, K., Adeel, A., & Hussain, A. (2020)
CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement. Information Fusion, 63, 273-285. https://doi.org/10.1016/j.inffus.2020.04.001
Noisy situations cause huge problems for the hearing-impaired, as hearing aids often make speech more audible but do not always restore intelligibility. In noisy settings, hum...
Offline Arabic Handwriting Recognition Using Deep Machine Learning: A Review of Recent Advances
Conference Proceeding
Ahmed, R., Dashtipour, K., Gogate, M., Raza, A., Zhang, R., Huang, K., … Hussain, A. (2020)
Offline Arabic Handwriting Recognition Using Deep Machine Learning: A Review of Recent Advances. In Advances in Brain Inspired Cognitive Systems: 10th International Conference, BICS 2019, Guangzhou, China, July 13–14, 2019, Proceedings (457-468). https://doi.org/10.1007/978-3-030-39431-8_44
In pattern recognition, automatic handwriting recognition (AHWR) is an area of research that has developed rapidly in the last few years. It can play a significant role in bro...
Random Features and Random Neurons for Brain-Inspired Big Data Analytics
Conference Proceeding
Gogate, M., Hussain, A., & Huang, K. (2020)
Random Features and Random Neurons for Brain-Inspired Big Data Analytics. In 2019 International Conference on Data Mining Workshops (ICDMW). https://doi.org/10.1109/icdmw.2019.00080
With the explosion of Big Data, fast and frugal reasoning algorithms are increasingly needed to keep up with the size and the pace of user-generated contents on the Web. In ma...
A hybrid Persian sentiment analysis framework: Integrating dependency grammar based rules and deep neural networks
Journal Article
Dashtipour, K., Gogate, M., Li, J., Jiang, F., Kong, B., & Hussain, A. (2020)
A hybrid Persian sentiment analysis framework: Integrating dependency grammar based rules and deep neural networks. Neurocomputing, 380, 1-10. https://doi.org/10.1016/j.neucom.2019.10.009
Social media hold valuable, vast and unstructured information on public opinion that can be utilized to improve products and services. The automatic analysis of such data, how...
Lip-Reading Driven Deep Learning Approach for Speech Enhancement
Journal Article
Adeel, A., Gogate, M., Hussain, A., & Whitmer, W. M. (2021)
Lip-Reading Driven Deep Learning Approach for Speech Enhancement. IEEE Transactions on Emerging Topics in Computational Intelligence, 5(3), 481-490. https://doi.org/10.1109/tetci.2019.2917039
This paper proposes a novel lip-reading driven deep learning framework for speech enhancement. The approach leverages the complementary strengths of both deep learning and ana...
Contextual deep learning-based audio-visual switching for speech enhancement in real-world environments
Journal Article
Adeel, A., Gogate, M., & Hussain, A. (2020)
Contextual deep learning-based audio-visual switching for speech enhancement in real-world environments. Information Fusion, 59, 163-170. https://doi.org/10.1016/j.inffus.2019.08.008
Human speech processing is inherently multi-modal, where visual cues (e.g. lip movements) can help better understand speech in noise. Our recent work [1] has shown that lip-re...
Deep Cognitive Neural Network (DCNN)
Patent
Howard, N., Adeel, A., Gogate, M., & Hussain, A. (2019)
Deep Cognitive Neural Network (DCNN). US2019/0156189
Embodiments of the present systems and methods may provide a more efficient and low-powered cognitive computational platform utilizing a deep cognitive neural network (DCNN), ...