A Two-step Criterion Algorithm of Speaker Segmentation

Yang Ji-Chen; He Qian-Hua; Li Yan-Xiong; Wang Wei-Ning

doi:10.3724/SP.J.1146.2009.01072

Volume 32 Issue 8

Sep. 2010

Yang Ji-Chen, He Qian-Hua, Li Yan-Xiong, Wang Wei-Ning. A Two-step Criterion Algorithm of Speaker Segmentation[J]. Journal of Electronics & Information Technology, 2010, 32(8): 2006-2009. doi: 10.3724/SP.J.1146.2009.01072

doi: 10.3724/SP.J.1146.2009.01072 cstr: 32379.14.SP.J.1146.2009.01072

Received Date: 2009-08-10
Rev Recd Date: 2009-12-01
Publish Date: 2010-08-19

Abstract

To improve the precision of Speaker Segmentation (SS), this paper proposes a two-step SS algorithm that makes use of silence and gender information. A two-step criterion decides the Speaker Change Point (SCP) within the detected speech segments. In the first step, the pitch difference between speakers and a gender model are used to locate SCPs within neighboring speech segments. In the second step, a gender-based modified T² criterion is used to locate SCPs among speakers of the same gender, with potential speaker change points detected chunk by chunk. Experimental results show that the proposed algorithm improves SS precision, with F1 reaching 85.14%. For speaker segments shorter than 2 s, the algorithm reduces the missed detection rate by about 16% compared with the Bayesian Information Criterion (BIC).
Keywords:
- Speech signal processing
- Two-step criterion
- Speaker Segmentation (SS)
- Pitch information
- Gender information
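
The two-step decision described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives neither the gender model nor the modified T² formula, so the code below substitutes a fixed pitch threshold (`GENDER_PITCH_THRESHOLD_HZ`, a hypothetical value) for the gender model and a standard two-sample Hotelling T² statistic with a diagonal pooled covariance for the modified criterion. The segment dictionaries, the per-gender thresholds, and all function names are assumptions made for illustration, and the chunk-based search for candidate change points is omitted.

```python
from statistics import mean, variance

# Hypothetical threshold (Hz); a stand-in for the paper's gender model.
GENDER_PITCH_THRESHOLD_HZ = 165.0


def guess_gender(pitch_hz):
    """Label a segment by its mean pitch (crude stand-in for a gender model)."""
    return "female" if mean(pitch_hz) >= GENDER_PITCH_THRESHOLD_HZ else "male"


def t2_diagonal(feats_a, feats_b):
    """Two-sample Hotelling T^2 assuming a diagonal pooled covariance.

    This is a simplification, not the paper's modified T^2 formula.
    Each segment needs at least two feature vectors.
    """
    n_a, n_b = len(feats_a), len(feats_b)
    total = 0.0
    for d in range(len(feats_a[0])):
        col_a = [f[d] for f in feats_a]
        col_b = [f[d] for f in feats_b]
        pooled = ((n_a - 1) * variance(col_a)
                  + (n_b - 1) * variance(col_b)) / (n_a + n_b - 2)
        pooled = max(pooled, 1e-12)  # guard against zero variance
        total += (mean(col_a) - mean(col_b)) ** 2 / pooled
    return n_a * n_b / (n_a + n_b) * total


def is_speaker_change(seg_a, seg_b, t2_thresholds):
    """Decide whether the boundary between two neighboring segments is an SCP."""
    gender_a = guess_gender(seg_a["pitch"])
    gender_b = guess_gender(seg_b["pitch"])
    # Step 1: a gender change between neighbors is taken as a speaker change.
    if gender_a != gender_b:
        return True
    # Step 2: same gender, so compare T^2 against a gender-specific threshold.
    return t2_diagonal(seg_a["feats"], seg_b["feats"]) > t2_thresholds[gender_a]
```

Each segment here is a dictionary holding per-frame pitch values under `"pitch"` and per-frame feature vectors (for example cepstral coefficients) under `"feats"`. `t2_thresholds` maps `"male"` and `"female"` to separate decision thresholds, reflecting the gender-based criterion the abstract mentions.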
