Abstract:
It is known that speaker-specific information is distributed non-uniformly in the frequency domain. Current speaker recognition systems use auditory-motivated scales to extract acoustic features. These scales, however, are not optimised to exploit the spectral distribution of speaker-specific information and hence may not be the best choice for speaker recognition. In this paper, the authors study the distribution of speaker-specific information for the Spectral Centroid Frequency feature and propose a non-uniform filterbank that captures this information more effectively for that feature. The F-ratio and the Kullback-Leibler (KL) distance were used to measure the distribution of speaker-specific information, and it was shown empirically that the KL distance is a better measure of discriminative ability than the F-ratio. The proposed filterbank emphasises regions of high KL distance by allocating more filters to them. Experimental results showed a relative equal error rate (EER) reduction of 8.8% over the Mel-scale filterbank on the NIST 2006 SRE database.
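
As a rough illustration of the two measures named above, the sketch below computes an F-ratio and a KL-based distance for one scalar feature in one frequency band, then splits a filter budget across bands in proportion to a per-band score. It is a minimal sketch under assumptions that the abstract does not state: each speaker's feature values are modelled as a single univariate Gaussian, the KL distance is taken as the symmetric KL divergence averaged over all speaker pairs, and filters are allocated by the largest-remainder rule. All function names are hypothetical, and none of this should be read as the authors' exact procedure.

```python
import math
from itertools import combinations
from statistics import fmean, pvariance


def f_ratio(samples_by_speaker):
    """Variance of the speaker means divided by the mean within-speaker variance."""
    means = [fmean(x) for x in samples_by_speaker]
    within = fmean([pvariance(x) for x in samples_by_speaker])
    return pvariance(means) / within


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two univariate Gaussians given their means and variances."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def mean_symmetric_kl(samples_by_speaker):
    """Symmetric KL divergence averaged over all speaker pairs.

    Assumption: each speaker is modelled by one Gaussian fitted to its samples.
    """
    stats = [(fmean(x), pvariance(x)) for x in samples_by_speaker]
    dists = [gaussian_kl(*p, *q) + gaussian_kl(*q, *p)
             for p, q in combinations(stats, 2)]
    return fmean(dists)


def allocate_filters(scores, n_filters):
    """Split n_filters across bands in proportion to scores.

    Assumption: largest-remainder rounding, so the counts sum to n_filters.
    """
    total = sum(scores)
    quotas = [n_filters * s / total for s in scores]
    counts = [int(q) for q in quotas]
    by_remainder = sorted(range(len(scores)),
                          key=lambda i: quotas[i] - counts[i],
                          reverse=True)
    for i in by_remainder[: n_filters - sum(counts)]:
        counts[i] += 1
    return counts


# Toy data: two speakers, three feature values each, for a single band.
band = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
band_f_ratio = f_ratio(band)
band_kl = mean_symmetric_kl(band)
```

Running both measures on every band gives two score profiles over frequency; passing either profile to `allocate_filters` yields a filter count per band, with more filters placed where the score is higher.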