Abstract:
In this paper we introduce a novel domain-invariant covariance
normalization (DICN) technique to relocate both in-domain and
out-domain i-vectors into a third dataset-invariant space, which
improves out-domain PLDA speaker verification
using only a very small number of unlabelled in-domain adaptation
i-vectors. By capturing the dataset variance from a global
mean using both development out-domain i-vectors and limited
unlabelled in-domain i-vectors, we obtain domain-invariant
representations of PLDA training data. The DICN-compensated
out-domain PLDA system is shown to perform as
well as in-domain PLDA training with as few as 500 unlabelled
in-domain i-vectors for NIST-2010 SRE and 2000 unlabelled
in-domain i-vectors for NIST-2008 SRE, and to yield a considerable
relative improvement over both out-domain and in-domain PLDA
development when more in-domain i-vectors are available.