Abstract:
Group delay is proposed as an effective means of
representing spectral phase information as a feature in speaker
recognition. Robust group delay features are difficult to
obtain, because large spikes in the group delay function mask
its fine structure. In this paper, two features based on
group delay are proposed, each reducing the effect of the spikes
in a different way. The first applies log compression to reduce
the masking effect of the spikes; the second uses a subband-based
approach, in which masking is confined to the bands that
contain the spikes. The purpose of this paper is to
introduce different types of group delay feature extraction
methods. The two features are evaluated on the cellular NIST
2001 database.
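
As an illustration of the first approach, the following is a minimal sketch of a group delay computation followed by log compression. It assumes the standard FFT-based identity for the group delay of a frame x[n], using the spectra of x[n] and n*x[n], and a signed log(1 + |tau|) compression. The function names, the FFT size, the `eps` guard, and the exact form of the compression are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-10):
    """Group delay (in samples) of one frame, via the FFT identity
    tau = (X_R*Y_R + X_I*Y_I) / |X|^2, with Y the spectrum of n*x[n].
    The eps term is an assumed guard against division by zero."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)

def log_compress(tau):
    """Signed log compression (assumed form): keeps the sign of the
    group delay and shrinks large spikes relative to small values."""
    return np.sign(tau) * np.log1p(np.abs(tau))
```

For example, a unit impulse delayed by three samples has a constant group delay of three samples at every frequency bin, which gives a quick sanity check of `group_delay`. Spikes arise where |X|^2 is close to zero, and the compression reduces their dynamic range so that smaller variations remain visible.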