dc.contributor.author |
Jarashanth, S.T. |
|
dc.contributor.author |
Ahilan, K. |
|
dc.contributor.author |
Valluvan, R. |
|
dc.contributor.author |
Thiruvaran, T. |
|
dc.contributor.author |
Kaneswaran, A. |
|
dc.date.accessioned |
2023-02-17T07:08:11Z |
|
dc.date.available |
2023-02-17T07:08:11Z |
|
dc.date.issued |
2022 |
|
dc.identifier.uri |
http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177 |
|
dc.description.abstract |
Speaker diarization is the task of partitioning a
speech signal into homogeneous segments corresponding to
speaker identities. The existing literature on speaker
diarization has experimented extensively with English but not
with Tamil; we therefore introduce a Tamil test dataset. An
overlapped speech segment is a part of an audio clip in which
two or more speakers speak simultaneously. Overlapped speech
regions degrade the performance of a speaker diarization
system because identifying the individual speakers in such
regions is difficult. This study proposes an overlapped speech
detection (OSD) model that discards non-speech segments and
feeds the speech segments into a Convolutional Recurrent
Neural Network acting as a binary classifier: single-speaker
speech versus overlapped speech. The OSD model is integrated
into a speaker diarizer, and the performance gains on the
standard VoxConverse dataset and our Tamil dataset, measured
by Diarization Error Rate, are 5.6% and 13.4%,
respectively. |
en_US |
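The abstract describes a Convolutional Recurrent Neural Network used as a binary classifier over speech segments (single-speaker vs. overlapped). The paper's actual architecture is not given in this record, so the following is only a minimal illustrative sketch of that general design, assuming PyTorch and log-mel spectrogram input; all layer sizes and names are hypothetical.

```python
# Hedged sketch (NOT the authors' model) of a CRNN binary classifier for
# overlapped speech detection: a small convolutional front end over
# spectrogram frames, a GRU for temporal context, and a 2-class output
# (single-speaker speech vs. overlapped speech).
import torch
import torch.nn as nn

class CRNNOverlapDetector(nn.Module):
    def __init__(self, n_mels=40, hidden=64):
        super().__init__()
        # Convolutional layers over (batch, 1, time, mel) spectrograms;
        # pooling is applied along the mel axis only, preserving frames.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((1, 2)),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((1, 2)),
        )
        # Recurrent layer models context across time frames.
        self.gru = nn.GRU(32 * (n_mels // 4), hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 2)        # two classes

    def forward(self, x):                     # x: (batch, time, n_mels)
        x = x.unsqueeze(1)                    # (batch, 1, time, n_mels)
        x = self.conv(x)                      # (batch, 32, time, n_mels//4)
        x = x.permute(0, 2, 1, 3).flatten(2)  # (batch, time, 32 * n_mels//4)
        out, _ = self.gru(x)
        return self.fc(out)                   # per-frame logits: (batch, time, 2)

model = CRNNOverlapDetector()
logits = model(torch.randn(2, 100, 40))      # 2 clips, 100 frames, 40 mel bins
print(logits.shape)                          # torch.Size([2, 100, 2])
```

In a diarization pipeline like the one summarized above, frames classified as overlapped would be handled separately (or excluded) before speaker clustering, which is how OSD can reduce Diarization Error Rate.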
dc.language.iso |
en |
en_US |
dc.publisher |
IEEE |
en_US |
dc.subject |
Overlapped speech detection |
en_US |
dc.subject |
Speaker diarization |
en_US |
dc.subject |
Convolutional recurrent neural network |
en_US |
dc.subject |
Binary classifier |
en_US |
dc.subject |
Tamil dataset |
en_US |
dc.title |
Overlapped Speech Detection for Improved Speaker Diarization on Tamil Dataset |
en_US |
dc.type |
Article |
en_US |