Volume 30 Issue 9
Jan.  2011
Wang Xiao-Dong, Guo Lei, Fang Jun, Dong Shu-Fu. An EMD-Based Metric for Document Semantic Similarity[J]. Journal of Electronics & Information Technology, 2008, 30(9): 2156-2161. doi: 10.3724/SP.J.1146.2007.00177

An EMD-Based Metric for Document Semantic Similarity

doi: 10.3724/SP.J.1146.2007.00177 cstr: 32379.14.SP.J.1146.2007.00177
  • Received Date: 2007-01-29
  • Revision Received Date: 2007-06-11
  • Publish Date: 2008-09-19
  • Measures of document semantic similarity based on the Earth Mover's Distance (EMD) conflict with the metric axioms, which has prevented EMD from being widely applied in information retrieval and data mining. To resolve this conflict, a novel EMD-based metric for document semantic similarity, named Mdss_EMD, is presented. First, based on an analysis of the drawbacks of EMD and its existing modifications, the concepts of document width and virtual term are proposed. Next, by adding a virtual term to the initial document vector, the approach aligns the total weights of the document vectors so that all of the metric axioms are satisfied. Finally, to improve the applicability and processing speed of the metric, the similarity distance of the virtual term is designed to be elastic, and the EMD algorithm is simplified. The proposed approach extends EMD to a metric space and substantially improves EMD in terms of indexing and accuracy. The experimental results demonstrate that Mdss_EMD generally outperforms the original EMD and other similar measures.
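
The weight-alignment step described in the abstract can be illustrated with a minimal Python sketch. It shows only the idea of padding the lighter document vector with a virtual term so that both vectors carry the same total weight; it is not the authors' implementation. The names `VIRTUAL_TERM` and `align_total_weights`, the dictionary representation of a document vector, and the choice to give the virtual term exactly the weight difference are all assumptions made for illustration. The abstract does not specify how document width or the elastic virtual-term distance is defined, so neither is modeled here.

```python
# Hypothetical placeholder key for the virtual term (name is an assumption,
# not taken from the paper).
VIRTUAL_TERM = "<virtual>"


def align_total_weights(doc_a, doc_b):
    """Pad the lighter of two term-weight vectors with a virtual term.

    Each document is a dict mapping term -> non-negative weight.
    Returns padded copies whose total weights are equal; the inputs
    are left unmodified.
    """
    total_a = sum(doc_a.values())
    total_b = sum(doc_b.values())
    padded_a, padded_b = dict(doc_a), dict(doc_b)

    gap = total_a - total_b
    if gap > 0:
        # doc_b is lighter: the virtual term absorbs the missing weight.
        padded_b[VIRTUAL_TERM] = gap
    elif gap < 0:
        # doc_a is lighter: pad it instead.
        padded_a[VIRTUAL_TERM] = -gap
    return padded_a, padded_b


if __name__ == "__main__":
    a = {"cat": 2.0, "dog": 1.0}
    b = {"pet": 1.5}
    pa, pb = align_total_weights(a, b)
    print(pa)
    print(pb)
```

Once both vectors have equal total weight, the EMD between them becomes a balanced transportation problem, which is the setting in which EMD is known to behave as a metric when the ground distance is itself a metric. How the ground distance to the virtual term should be chosen is the part the abstract calls "elastic" and is left open in this sketch.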
