Srikanth Madikeri

Srikanth Madikeri, Dr.

Academic associate

Room: AND 2.34

About me

I received my Ph.D. in Computer Science and Engineering from the Indian Institute of Technology Madras in 2013. During my Ph.D., I worked on automatic speaker recognition and spoken keyword spotting. I worked as a Postdoctoral Researcher and Research Associate at Idiap in the Speech Processing group from 2013 until May 2024. There I worked on low-resource automatic speech recognition, automatic speaker recognition (text-independent and text-dependent), and language identification, among other speech technologies.

My current research interests include automatic speech recognition for low-resource languages with a focus on information extraction, automatic speaker recognition, language recognition, speaker diarization, and, more recently, spoken dialog systems.

Education

Ph.D. in Computer Science and Engineering at IIT-Madras (2008-2013)
Bachelor of Engineering in Computer Science and Engineering, Anna University, Chennai (2004-2008)

Experience

Lecturer/Senior Researcher at Dept. of Computational Linguistics, University of Zurich (present)
Research Associate at Idiap Research Institute (2018-2024)
Postdoctoral researcher at Idiap Research Institute (2013-2018)
3 years as Research Associate at IIT Madras (2010-2013)
2 years as Project Associate at IIT Madras (2008-2010)

Publications

Full list of publications on Google scholar

Journals and Book Chapters

Driss Khalil, Amrutha Prasad, Petr Motlicek, Juan Pablo Zuluaga, Iuliia Nigmatulina, Srikanth Madikeri, Christof Schuepbach, An Automatic Speaker Clustering Pipeline for Air Traffic Communication Domain, to appear in Special Issue on Automatic Speech Recognition and Understanding in Air Traffic Management.
Juan Zuluaga-Gomez, Iuliia Nigmatulina, Amrutha Prasad, Petr Motlicek, Driss Khalil, Srikanth Madikeri, Allan Tart, Igor Szoke, Vincent Lenders, Mickael Rigault, Khalid Choukri, Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding. Special Issue on Automatic Speech Recognition and Understanding in Air Traffic Management (2023), Aerospace 10.10 pp. 898.
N. Dawalatabad, S. Madikeri, C. C. Sekhar, and H. A. Murthy, "Novel architectures for unsupervised information bottleneck based speaker diarization of meetings", in IEEE Trans. on Audio, Speech, and Language Processing 2021.
I. Himawan, S. Madikeri, P. Motlicek, M. Cernak, S. Sridharan, and C. Fookes, "Voice Presentation Attack Detection Using Convolutional Neural Networks", Handbook of Biometric Anti-Spoofing, pp. 391-415. (Code)
S. Dey, P. Motlicek, S. Madikeri, M. Ferras, "Template-matching for text-dependent speaker verification", Speech Communication, Vol 88, pp. 96-105.
M. Ferras, S. Madikeri, H. Bourlard, "Speaker Diarization and Linking of Meeting Data", IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 24(11), pp. 1935-1945.
M. Ferras, S. Madikeri, P. Motlicek, S. Dey and H. Bourlard, "A large-scale open-source acoustic simulator for speaker recognition", IEEE Signal Processing Letters, Vol. 23(4), pp. 527-531. (Code)
S. Madikeri, "A fast and scalable hybrid FA/PPCA-based framework for speaker recognition", in Digital Signal Processing, Vol. 32, pp. 137-145, September 2014. (Code hosted at IIT-M)
S. Madikeri, A. Talambedu, and H. A. Murthy, "Modified group delay feature based total variability space modelling for speaker recognition", International Journal of Speech Technology, Vol. 18(1), pp. 17-23.

Conferences and Workshops

Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia, Aravind Ganapathiraju, "Token Verse: Unifying Speech and NLP Tasks via Transducer-based ASR", to appear in Proc. of EMNLP.
Amrutha Prasad, Srikanth Madikeri, Driss Khalil, Petr Motlicek, Christof Schuepbach, "Speech and Language Recognition with Low-rank Adaptation of Pretrained Models", in Proc. of Interspeech 2024, pp. 2825-2829, doi: 10.21437/Interspeech.2024-2187. pdf
Geoffroy Vanderreydt, Amrutha Prasad, Driss Khalil, Srikanth Madikeri, Kris Demuynck and Petr Motlicek, Parameter-Efficient Training with Adaptive Bottlenecks for Automatic Speech Recognition, to appear in the proceedings of Automatic Speech Recognition and Understanding, 2023.
Iuliia Nigmatulina, Srikanth Madikeri, Esaú Villatoro-Tello, Petr Motlicek, Juan Zuluaga-Gomez, Karthik Pandia, Aravind Ganapathiraju, Implementing contextual biasing in GPU decoder for online ASR, in Proc. of Interspeech 2023, pp. 4494-4498.
E. Villatoro, S. Madikeri, P. Motlicek, A. Ganapathiraju, A. Ivanov, "Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings", in Proc. of SIGIR 2022, pp. 2669-2674.
S. Madikeri, P. Motlicek, H. Bourlard, "Multitask adaptation with Lattice-Free MMI for multi-genre speech recognition of low resource languages", in Proc. of Interspeech 2021, pp. 4329-4333.
A. Vyas, S. Madikeri, H. Bourlard, "Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model", in Proc. of Interspeech 2021, pp. 2861-2865.
S. Sarfjoo, S. Madikeri, P. Motlicek, "Speech Activity Detection Based on Multilingual Speech Recognition System", in Proc. of Interspeech 2021.
A. Vyas, S. Madikeri, H. Bourlard, "Lattice-free MMI adaptation of self-supervised pretrained acoustic models", in Proc. of ICASSP 2021.
S. Madikeri, B. Khonglah, S. Tong, P. Motlicek, H. Bourlard and D. Povey, "Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition System", in Proc. of Interspeech 2020. (Kaldi recipe)
B. Khonglah, et al., "Incremental Semi-supervised Learning for Multi-Genre Speech Recognition", in Proc. of IEEE ICASSP 2020.
E. Boschee, et al., "SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage", in Proc. of the 57th Conference of the Association for Computational Linguistics: System Demonstrations, pp. 19-24. (link)
S. Madikeri, S. Dey, P. Motlicek, "A Bayesian Approach to Inter-task fusion for speaker recognition", in Proc. of ICASSP 2019, pp. 5786-5790.
S. Dey, S. Madikeri, and P. Motlicek, "End-to-end text-dependent speaker verification using novel distance measures", in Proc. of Interspeech 2018, pp. 3598-3602.
S. Madikeri, S. Dey, and P. Motlicek, "Analysis of Language Dependent Front-End for Speaker Recognition", in Proc. of Interspeech 2018, pp. 1101-1105.
S. Dey, P. Motlicek, S. Madikeri, and M. Ferras, "Exploiting sequence information for text-dependent speaker verification", in Proc. of ICASSP 2017, pp. 5370-5374.
S. Dey, S. Madikeri, and P. Motlicek, "Information theoretic clustering for unsupervised domain-adaptation", in Proc. of ICASSP 2016, pp. 5580-5584.
M. Ferras, S. Madikeri, P. Motlicek, and H. Bourlard, "System fusion and speaker linking for longitudinal diarization of tv shows", in Proc. of ICASSP 2016, pp. 5495-5499.
S. Dey, S. Madikeri, M. Ferras, and P. Motlicek, "Deep neural network based posteriors for text-dependent speaker verification", in Proc. of ICASSP 2016, pp. 5050-5054.
N. Dawalatabad, S. Madikeri, C. C. Sekhar, and H. A. Murthy, "Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features", in Proc. of Interspeech 2016, pp. 2199-2203.
M. Ferras, S. Madikeri, S. Dey, P. Motlicek, and H. Bourlard, "Inter-Task System Fusion for Speaker Recognition", in Proc. of Interspeech 2016, pp. 1810-1814.
S. Madikeri, and H. Bourlard, "KL-HMM based speaker diarization system for meetings", in Proc. of ICASSP 2015, Brisbane, Australia, pp. 4435-4439.
P. Motlicek, S. Dey, S. Madikeri, and L. Burget, "Employment of Subspace Gaussian Mixture Models in speaker recognition", in Proc. of ICASSP 2015, Brisbane, Australia, pp. 4445-4449.
S. Madikeri and H. Bourlard, "Filterbank slope based features for speaker diarization", in Proc. of ICASSP 2014, Florence, Italy, pp. 111-115.
S. Madikeri, "A Hybrid Factor Analysis and Probabilistic PCA-based system for Dictionary Learning and Encoding for Robust Speaker Recognition", In Odyssey 2012-The Speaker and Language Recognition Workshop [pdf].
S. Madikeri and H. A. Murthy, "Mel Filter Bank energy-based Slope feature and its application to speaker recognition", in Proc. of National Conference on Communications (NCC) 2011, pp. 1-4, 28-30 Jan. 2011, doi: 10.1109/NCC.2011.5734713.
S. Madikeri and H. A. Murthy, "Discriminative training of Gaussian mixture speaker models: A new approach", in Proc. of National Conference on Communications (NCC) 2010, pp. 1-5, 29-31 Jan. 2010, doi: 10.1109/NCC.2010.5430204. (Best Paper Award in Signal Processing Track)

Code/Toolkits

Pkwrap: a PyTorch wrapper for LF-MMI training in Kaldi (arXiv)
Multilingual LF-MMI training: sample recipe is available here
Standard i-vector implementation for Kaldi
IB diarization toolkit (in C++)

Professional Activities & Awards

Area Chair for Interspeech (Speaker recognition track) 2021, 2022
Winner of the International Create Challenge 2017
Best paper award at NCC 2010 for the paper titled "Discriminative training of Gaussian mixture speaker models: A new approach" in the Signal Processing Track
Reviewer for IEEE TASLP, Speech Communication, Interspeech, ICASSP, ASRU

Teaching

Speech Technology (HS2024)
Machine Learning for Computational Linguistics (HS2024)

Institut für Computerlinguistik
