Improving PLDA Speaker Verification Performance using Domain Mismatch Compensation Techniques

Rahman, M.H.; Ahilan, K.; Himawan, I.; Dean, D.; Sridharan, S.

Please use this identifier to cite or link to this item: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/1901

Title:	Improving PLDA Speaker Verification Performance using Domain Mismatch Compensation Techniques
Authors:	Rahman, M.H.; Ahilan, K.; Himawan, I.; Dean, D.; Sridharan, S.
Keywords:	Speaker verification;I-vector
Issue Date:	2018
Citation:	Rahman, M. H., Kanagasundaram, A., Himawan, I., Dean, D., & Sridharan, S. (2018). Improving PLDA speaker verification performance using domain mismatch compensation techniques. Computer Speech & Language, 47, 240-258.
Abstract:	The performance of state-of-the-art i-vector speaker verification systems relies on a large amount of training data for probabilistic linear discriminant analysis (PLDA) modeling. During evaluation, it is also crucial that the target condition data is well matched with the development data used for PLDA training. However, in many practical scenarios, these systems have to be developed and trained using data that is often outside the domain of the intended application, since collecting a significant amount of in-domain data is often difficult. Experimental studies have found that PLDA speaker verification performance degrades significantly due to this development/evaluation mismatch. This paper introduces a domain-invariant linear discriminant analysis (DI-LDA) technique for out-domain PLDA speaker verification that compensates for domain mismatch in the LDA subspace. We also propose a domain-invariant probabilistic linear discriminant analysis (DI-PLDA) technique for domain mismatch modeling in the PLDA subspace, using only a small amount of in-domain data. In addition, we propose the sequential and score-level combination of DI-LDA and DI-PLDA to further improve out-domain speaker verification performance. Experimental results show that the proposed domain mismatch compensation techniques yield at least 27% and 14.5% improvement in equal error rate (EER) over a pooled PLDA system for telephone-telephone and interview-interview conditions, respectively. Finally, we show that the improvement over the baseline pooled system can be attained even when the number of in-domain speakers is significantly reduced, down to 30 in most of the evaluation conditions.
URI:	http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/1901
Appears in Collections:	Electrical & Electronic Engineering
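
Note on the EER figures in the abstract: the following is a minimal sketch of how an equal error rate and a percentage improvement over a baseline might be computed from verification scores. It is not taken from the paper. The function names, the threshold sweep, and the reading of "improvement" as a relative reduction in EER are all assumptions made for illustration; the abstract does not state whether the reported percentages are relative or absolute.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper: a trial is accepted when its score >= threshold.
    The EER is taken at the threshold where the false-accept rate and the
    false-reject rate are closest, and is reported as their average.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # False-reject rate: fraction of target trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False-accept rate: fraction of non-target trials at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent (assumed interpretation)."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


if __name__ == "__main__":
    # Synthetic toy scores, not data from the paper.
    targets = [0.9, 0.8, 0.3]
    nontargets = [0.1, 0.2, 0.7]
    print(equal_error_rate(targets, nontargets))
```

Under this reading, an improvement figure compares a system's EER against that of the pooled PLDA baseline on the same trial list.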

Files in This Item:

File	Description	Size	Format
Improving PLDA Speaker Verification Performance using.pdf		123 kB	Adobe PDF
