Volume 32 Issue 8
Sep.  2010
Yang Ji-Chen, He Qian-Hua, Li Yan-Xiong, Wang Wei-Ning. A Two-step Criterion Algorithm of Speaker Segmentation[J]. Journal of Electronics & Information Technology, 2010, 32(8): 2006-2009. doi: 10.3724/SP.J.1146.2009.01072

A Two-step Criterion Algorithm of Speaker Segmentation

doi: 10.3724/SP.J.1146.2009.01072 cstr: 32379.14.SP.J.1146.2009.01072
  • Received Date: 2009-08-10
  • Revised Date: 2009-12-01
  • Publish Date: 2010-08-19
  • To improve the precision of Speaker Segmentation (SS), this paper proposes a two-step SS algorithm that makes use of silence and gender information. A two-step criterion is used to decide the Speaker Change Point (SCP) within the detected speech segments. In the first step, the pitch difference between speakers and a gender model are used to locate the SCP within neighboring speech segments. In the second step, a gender-based modified T² criterion is used to locate the SCP among speakers of the same gender, and potential speaker change points are detected on a per-chunk basis. Experimental results show that the proposed algorithm improves SS precision, with F1 reaching 85.14%. For speaker segments shorter than 2 s, the algorithm reduces the missed detection rate by about 16% compared with the Bayesian Information Criterion (BIC).
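The two-step decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the modified T² formula or the gender model, so the sketch substitutes the standard two-sample Hotelling T² statistic with a diagonal pooled covariance, and a simple pitch threshold in place of a trained gender model. The function names, the segment layout (`f0`, `frames`), the 165 Hz pitch threshold, and the T² threshold of 50 are all hypothetical choices made for illustration.

```python
from statistics import fmean


def t2_statistic(left, right):
    """Two-sample Hotelling T^2 between two sets of feature vectors.

    Simplification (assumption): the pooled covariance is treated as
    diagonal, so no matrix inversion is needed.
    """
    n1, n2 = len(left), len(right)
    dim = len(left[0])
    total = 0.0
    for k in range(dim):
        x = [frame[k] for frame in left]
        y = [frame[k] for frame in right]
        mx, my = fmean(x), fmean(y)
        # Pooled variance of dimension k across both windows.
        ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
        var = max(ss / (n1 + n2 - 2), 1e-12)  # guard against zero variance
        total += (mx - my) ** 2 / var
    return total * n1 * n2 / (n1 + n2)


def gender_from_pitch(mean_f0_hz, threshold_hz=165.0):
    """Stand-in for a gender model: threshold on mean pitch (hypothetical)."""
    return "female" if mean_f0_hz >= threshold_hz else "male"


def is_speaker_change(seg_a, seg_b, t2_threshold=50.0):
    """Decide whether a speaker change lies between two neighboring segments.

    Each segment is a dict with "f0" (mean pitch in Hz) and "frames"
    (a list of equal-length feature vectors, e.g. cepstral coefficients).
    """
    # Step 1: a gender difference is taken as a speaker change.
    if gender_from_pitch(seg_a["f0"]) != gender_from_pitch(seg_b["f0"]):
        return True
    # Step 2: same gender, so fall back to the T^2 distance test.
    return t2_statistic(seg_a["frames"], seg_b["frames"]) > t2_threshold
```

In this sketch the inexpensive pitch comparison handles cross-gender changes, so the T² test only runs on same-gender pairs. In practice both thresholds would need tuning on held-out data.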
