Please use this identifier to cite or link to this item: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177
Title: Overlapped Speech Detection for Improved Speaker Diarization on Tamil Dataset
Authors: Jarashanth, S.T.; Ahilan, K.; Valluvan, R.; Thiruvaran, T.; Kaneswaran, A.
Keywords: Overlapped speech detection; Speaker diarization; Convolutional recurrent neural network; Binary classifier; Tamil dataset
Issue Date: 2022
Publisher: IEEE
Abstract: Speaker diarization is the task of partitioning a speech signal into homogeneous segments corresponding to speaker identities. We introduce a Tamil test dataset because the existing literature on speaker diarization has experimented extensively with English but not with any Tamil dataset. An overlapped speech segment is a part of an audio clip where two or more speakers speak simultaneously. Overlapped speech regions proportionally degrade the performance of a speaker diarization system because identifying the individual speakers in them is complex. This study proposes an overlapped speech detection (OSD) model that discards non-speech segments and feeds the speech segments into a Convolutional Recurrent Neural Network used as a binary classifier with two classes: single-speaker speech and overlapped speech. The OSD model is integrated into a speaker diarizer, and the performance gains in Diarization Error Rate on the standard VoxConverse dataset and our Tamil dataset are 5.6% and 13.4%, respectively.
URI: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177
Appears in Collections: Electrical & Electronic Engineering

Files in This Item:
File: Overlapped Speech Detection for Improved Speaker.pdf
Size: 248.51 kB
Format: Adobe PDF

