Using Wikipedia as an External Knowledge Source for Supporting Contextual Disambiguation

Abstract

Every term has a meaning, but some terms have multiple meanings. Identifying the correct meaning of a term in a specific context is the goal of Word Sense Disambiguation (WSD) applications, and the task becomes harder when the context is limited. This research addresses the problem of identifying the correct sense of a term given only one other term as its context. The main focus is on using Wikipedia as the external knowledge source for determining the intended meaning of each term from that single context term. We experimented with the semantically rich Wikipedia senses and hyperlinks for context disambiguation. We also analyzed the effect of sense filtering on context extraction and found it effective for contextual disambiguation. Results show that disambiguation with filtering achieves an accuracy of 86% on a manually disambiguated dataset.
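The general idea of disambiguating a term against a single context term can be illustrated with a small sketch. It is not the authors' method: the abstract does not specify the relatedness measure or the filtering rule, so this example assumes Jaccard overlap between the link sets of candidate senses and a hypothetical prior-based filter (`min_prior`). The `SENSES` table, its article titles, link sets, and prior values are all invented toy data standing in for what would be extracted from Wikipedia.

```python
from itertools import product

# Toy stand-in for Wikipedia data (all values invented for illustration).
# Each term maps to candidate senses; each sense has the set of articles it
# links to and a prior (assumed share of anchors that point to that sense).
SENSES = {
    "bank": {
        "Bank (finance)": {"links": {"Money", "Loan", "Interest", "Deposit"}, "prior": 0.80},
        "Bank (river)": {"links": {"River", "Erosion", "Stream"}, "prior": 0.15},
        "Bank (album)": {"links": {"Music", "Album"}, "prior": 0.01},
    },
    "river": {
        "River": {"links": {"Stream", "Erosion", "Water", "Bank (river)"}, "prior": 0.90},
        "River (band)": {"links": {"Music", "Album", "Band"}, "prior": 0.02},
    },
}


def jaccard(a, b):
    """Jaccard overlap of two link sets (assumed relatedness measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_senses(term, min_prior=0.05):
    """Drop rare senses before comparison (hypothetical filtering rule)."""
    return {s: d for s, d in SENSES[term].items() if d["prior"] >= min_prior}


def disambiguate(term, context, min_prior=0.05):
    """Return the (term sense, context sense) pair with the highest link overlap."""
    term_senses = filter_senses(term, min_prior)
    ctx_senses = filter_senses(context, min_prior)
    return max(
        product(term_senses, ctx_senses),
        key=lambda pair: jaccard(
            term_senses[pair[0]]["links"], ctx_senses[pair[1]]["links"]
        ),
    )


print(disambiguate("bank", "river"))
```

With this toy data, filtering removes the rare album and band senses before any comparison, and the remaining sense pairs are scored only by how many linked articles they share. A real system would replace `SENSES` with senses and links mined from a Wikipedia dump or API, and could use a different relatedness measure.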

Share and Cite:

S. Jabeen, X. Gao and P. Andreae, "Using Wikipedia as an External Knowledge Source for Supporting Contextual Disambiguation," Journal of Software Engineering and Applications, Vol. 5 No. 12B, 2012, pp. 175-180. doi: 10.4236/jsea.2012.512B034.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.