
Investigating deep neural networks for speaker diarization in the DIHARD challenge


dc.contributor.author Himawan, I.
dc.contributor.author Rahman, M.H.
dc.contributor.author Sridharan, S.
dc.contributor.author Fookes, C.
dc.contributor.author Kanagasundaram, A.
dc.date.accessioned 2021-03-15T08:04:21Z
dc.date.accessioned 2022-06-27T10:02:26Z
dc.date.available 2021-03-15T08:04:21Z
dc.date.available 2022-06-27T10:02:26Z
dc.date.issued 2018
dc.identifier.citation Himawan, I., Rahman, M. H., Sridharan, S., Fookes, C., & Kanagasundaram, A. (2018, December). Investigating deep neural networks for speaker diarization in the DIHARD challenge. In 2018 IEEE Spoken Language Technology Workshop (SLT) (pp. 1029-1035). IEEE. en_US
dc.identifier.uri http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/1888
dc.description.abstract We investigate the use of deep neural networks (DNNs) for the speaker diarization task to improve performance under domain-mismatched conditions. Three unsupervised domain adaptation techniques, namely inter-dataset variability compensation (IDVC), domain-invariant covariance normalization (DICN), and domain mismatch modeling (DMM), are applied to DNN-based speaker embeddings to compensate for the mismatch in the embedding subspace. We present results on the DIHARD data, which was released for the 2018 diarization challenge. Collected from a diverse set of domains, this data provides very challenging domain-mismatched conditions for the diarization task. Our results provide insights into how the performance of our proposed system could be further improved. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.subject DIHARD challenge en_US
dc.subject speaker diarization en_US
dc.title Investigating deep neural networks for speaker diarization in the DIHARD challenge en_US
dc.type Article en_US
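
For readers unfamiliar with inter-dataset variability compensation (IDVC), one of the adaptation techniques named in the abstract, the following is a minimal illustrative sketch rather than the authors' implementation. It assumes speaker embeddings are available as NumPy arrays with per-recording domain labels, and it removes the principal directions spanned by per-domain mean embeddings before scoring or clustering. The function and variable names are hypothetical.

import numpy as np

def idvc_projection(embeddings, domain_labels, num_directions=2):
    """Estimate an IDVC-style projection that removes the subspace spanned
    by per-domain mean embeddings, compensating for dataset shift.

    embeddings:     (N, D) array of speaker embeddings
    domain_labels:  length-N sequence of domain identifiers
    num_directions: number of inter-dataset directions to remove
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(domain_labels)
    domains = np.unique(labels)
    # Per-domain mean embeddings, centred around their global mean.
    means = np.stack([embeddings[labels == d].mean(axis=0) for d in domains])
    means -= means.mean(axis=0)
    # Principal directions of inter-dataset variability.
    _, _, vt = np.linalg.svd(means, full_matrices=False)
    k = min(num_directions, vt.shape[0])
    basis = vt[:k]                                   # (k, D)
    # Projection that removes those directions: P = I - V^T V.
    return np.eye(embeddings.shape[1]) - basis.T @ basis

# Usage sketch: fit the projection on pooled development embeddings from
# several domains, then apply it to test embeddings before clustering.
# proj = idvc_projection(dev_embeddings, dev_domains, num_directions=2)
# compensated = test_embeddings @ proj
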

