Abstract:
Automatic forensic voice comparison (FVC) systems employed in forensic casework have often relied on Gaussian
Mixture Model - Universal Background Models (GMM-UBMs) for modelling, with relatively little research into
supervector-based approaches. This paper reports on a comparative study which investigates the effectiveness of
multiple approaches operating on GMM mean supervectors, including support vector machines and various forms
of regression. First, we demonstrate a method by which supervector regression can be used to produce a forensic
likelihood ratio. Then, three approaches to solving the regression problem are considered, namely least squares,
ℓ1-norm minimization and ℓ2-norm minimization. Comparative analysis of these techniques, combined with four different scoring
methods, reveals that supervector regression can provide a substantial relative improvement in both validity (up to
75.3%) and reliability (up to 41.5%) over both GMM-UBM
and Gaussian Mixture Model - Support Vector Machine (GMM-SVM) results when evaluated on two studio-clean
forensic speech databases. Under mismatched/noisy conditions, more modest relative improvements in both validity
(up to 41.5%) and reliability (up to 12.1%) were obtained over GMM-SVM results. From a practical standpoint,
the analysis also demonstrates that supervector regression can be more effective than GMM-UBM or GMM-SVM in
obtaining a higher positive-valued log-likelihood ratio for same-speaker comparisons, thus improving the strength of
evidence if the particular suspect on trial is indeed the offender. Based on these results, we recommend least
squares as the better-performing regression technique, with gradient projection as another promising technique,
specifically for applications typical of forensic casework.
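
To make the idea of supervector regression concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a test supervector y is regressed onto a dictionary A whose columns are training supervectors from the suspect and from background speakers, and it covers only the least-squares and ℓ2-norm solutions (the ℓ1-norm solution is omitted). The function names, the synthetic data and the `suspect_share` score are all hypothetical illustrations; the four scoring methods studied here are not reproduced, and the raw score would still need to be converted to a likelihood ratio.

```python
import numpy as np


def solve_least_squares(A, y):
    """Least-squares coefficients x minimising ||A x - y||_2."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x


def solve_l2(A, y, lam=1.0):
    """l2-regularised coefficients: argmin ||A x - y||^2 + lam * ||x||^2."""
    n_cols = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_cols), A.T @ y)


def suspect_share(x, suspect_idx):
    """Hypothetical score: share of coefficient magnitude on suspect columns."""
    mag = np.abs(x)
    return mag[suspect_idx].sum() / mag.sum()


# Synthetic stand-ins for GMM mean supervectors (assumed sizes, not real data).
rng = np.random.default_rng(0)
dim, n_background, n_suspect = 64, 20, 3

background = rng.normal(size=(dim, n_background))
suspect_mean = rng.normal(size=dim)
suspect = suspect_mean[:, None] + 0.1 * rng.normal(size=(dim, n_suspect))

# Dictionary: suspect columns first, then background columns.
A = np.hstack([suspect, background])
suspect_idx = np.arange(n_suspect)

# One test supervector near the suspect, one from an unrelated speaker.
y_same = suspect_mean + 0.1 * rng.normal(size=dim)
y_diff = rng.normal(size=dim)

for name, solver in [("least squares", solve_least_squares), ("l2", solve_l2)]:
    s_same = suspect_share(solver(A, y_same), suspect_idx)
    s_diff = suspect_share(solver(A, y_diff), suspect_idx)
    print(f"{name}: same-speaker score {s_same:.3f}, "
          f"different-speaker score {s_diff:.3f}")
```

In this sketch the regularisation weight `lam` is an arbitrary choice; in practice it would be tuned on development data, and the score would be calibrated against same-speaker and different-speaker score distributions before being reported as a likelihood ratio.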