Please use this identifier to cite or link to this item:
http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177
Title: | Overlapped Speech Detection for Improved Speaker Diarization on Tamil Dataset |
Authors: | Jarashanth, S.T.; Ahilan, K.; Valluvan, R.; Thiruvaran, T.; Kaneswaran, A. |
Keywords: | Overlapped speech detection;Speaker diarization;Convolutional recurrent neural network;Binary classifier;Tamil dataset |
Issue Date: | 2022 |
Publisher: | IEEE |
Abstract: | Speaker diarization is the task of partitioning a speech signal into homogeneous segments corresponding to speaker identities. The existing literature on speaker diarization has experimented extensively with English, but none of it has used a Tamil dataset; we therefore introduce a Tamil test dataset. An overlapped speech segment is a part of an audio clip where two or more speakers speak simultaneously. Overlapped speech regions proportionally degrade the performance of a speaker diarization system because of the difficulty of identifying individual speakers. This study proposes an overlapped speech detection (OSD) model that discards non-speech segments and feeds the speech segments into a Convolutional Recurrent Neural Network acting as a binary classifier with two classes: single-speaker speech and overlapped speech. The OSD model is integrated into a speaker diarizer, and the performance gains in terms of Diarization Error Rate on the standard VoxConverse dataset and our Tamil dataset are 5.6% and 13.4%, respectively. |
URI: | http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/9177 |
Appears in Collections: | Electrical & Electronic Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Overlapped Speech Detection for Improved Speaker.pdf | | 248.51 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.