Overlapped Speech Detection for Improved Speaker Diarization on Tamil Dataset

dc.contributor.author Jarashanth, S.T.
dc.contributor.author Ahilan, K.
dc.contributor.author Valluvan, R.
dc.contributor.author Thiruvaran, T.
dc.contributor.author Kaneswaran, A.
dc.date.accessioned 2023-02-17T07:08:11Z
dc.date.available 2023-02-17T07:08:11Z
dc.date.issued 2022
dc.identifier.uri http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177
dc.description.abstract Speaker diarization is the task of partitioning a speech signal into homogeneous segments corresponding to speaker identities. We introduce a Tamil test dataset, since the existing literature on speaker diarization has experimented extensively with English but, to our knowledge, not with Tamil. An overlapped speech segment is a part of an audio clip in which two or more speakers speak simultaneously. Overlapped speech regions degrade the performance of a speaker diarization system in proportion to their prevalence, owing to the difficulty of identifying the individual speakers within them. This study proposes an overlapped speech detection (OSD) model that discards non-speech segments and feeds the remaining speech segments into a Convolutional Recurrent Neural Network acting as a binary classifier, labelling each segment as either single-speaker speech or overlapped speech. The OSD model is integrated into a speaker diarizer, and the performance gains on the standard VoxConverse dataset and our Tamil dataset in terms of Diarization Error Rate are 5.6% and 13.4%, respectively. en_US
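The abstract reports gains in terms of Diarization Error Rate (DER), the standard diarization metric: the sum of missed speech, false-alarm speech, and speaker-confusion durations, divided by the total reference speech duration. A minimal sketch of that computation follows; the duration values used are hypothetical illustrations, not figures from the paper.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """Return DER as a fraction of total reference speech time.

    false_alarm  -- duration labelled as speech where none exists (seconds)
    missed       -- reference speech duration left unlabelled (seconds)
    confusion    -- duration attributed to the wrong speaker (seconds)
    total_speech -- total reference speech duration (seconds)
    """
    if total_speech <= 0:
        raise ValueError("total reference speech time must be positive")
    return (false_alarm + missed + confusion) / total_speech

# Hypothetical example: 12 s of combined errors over 100 s of reference speech.
der = diarization_error_rate(false_alarm=3.0, missed=5.0, confusion=4.0,
                             total_speech=100.0)
print(f"DER = {der:.1%}")  # prints "DER = 12.0%"
```

Overlap hurts all three error terms at once: an undetected overlapped region forces the diarizer to credit at most one of the simultaneous speakers, inflating missed speech and speaker confusion, which is why a dedicated OSD stage can lower DER.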
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.subject Overlapped speech detection en_US
dc.subject Speaker diarization en_US
dc.subject Convolutional recurrent neural network en_US
dc.subject Binary classifier en_US
dc.subject Tamil dataset en_US
dc.title Overlapped Speech Detection for Improved Speaker Diarization on Tamil Dataset en_US
dc.type Article en_US

