Abstract:
The performance of state-of-the-art i-vector speaker verification systems relies on a large amount of training data for
probabilistic linear discriminant analysis (PLDA) modeling. During evaluation, it is also crucial that the target-condition
data is well matched to the development data used for PLDA training. However, in many practical scenarios,
these systems have to be developed and trained using data from outside the domain of the intended application,
since collecting a significant amount of in-domain data is often difficult. Experimental studies have found
that PLDA speaker verification performance degrades significantly due to this development/evaluation mismatch. This
paper introduces a domain-invariant linear discriminant analysis (DI-LDA) technique for out-domain PLDA speaker
verification that compensates for domain mismatch in the LDA subspace. We also propose a domain-invariant probabilistic
linear discriminant analysis (DI-PLDA) technique for domain mismatch modeling in the PLDA subspace, using only
a small amount of in-domain data. In addition, we propose the sequential and score-level combination of DI-LDA
and DI-PLDA to further improve out-domain speaker verification performance. Experimental results show that the proposed
domain mismatch compensation techniques yield at least 27% and 14.5% improvement in equal error rate (EER)
over a pooled PLDA system for telephone-telephone and interview-interview conditions, respectively. Finally, we show
that the improvement over the baseline pooled system can be attained even when the number of in-domain speakers is
significantly reduced, down to 30 in most of the evaluation conditions.
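
For readers unfamiliar with the metric, the following is a minimal sketch of how an equal error rate and an improvement over a baseline might be computed from verification scores. It is illustrative only and not the evaluation code used in this work: the function names `equal_error_rate` and `relative_improvement` are hypothetical, the threshold sweep is a simple approximation, and the improvement is assumed to be a relative reduction in EER, which the abstract does not state explicitly. All numbers in the usage lines are made-up toy values.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials scored at or above it.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest and average them.
    i = np.argmin(np.abs(miss - false_alarm))
    return (miss[i] + false_alarm[i]) / 2.0

def relative_improvement(baseline_eer, system_eer):
    """Relative EER reduction in percent (assumed definition of 'improvement')."""
    return 100.0 * (baseline_eer - system_eer) / baseline_eer

# Toy scores, not data from the paper.
eer = equal_error_rate([1.0, 3.0], [0.0, 2.0])
gain = relative_improvement(10.0, 8.0)
```

A production evaluation would typically interpolate between operating points rather than pick the nearest threshold, so this sketch can differ slightly from standard scoring tools on small trial lists.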