Semantic Sentence Similarity Using Finite State Machine

Chiranjibi Sitaula; Yadav Raj Ojha

doi:10.4236/iim.2013.56018

Intelligent Information Management > Vol.5 No.6, November 2013

Central Department of Computer Science and Information Technology, Tribhuvan University, Kathmandu, Nepal.
Nepal KC Consultancy, Software Company, Kathmandu, Nepal.

Abstract

This paper uses a finite state machine approach to measure the semantic similarity of two sentences. The approach combines bi-directional logic with semantic ordering; its core is the bi-directional logic of artificial intelligence, implemented using a slightly modified Finite State Machine algorithm. Keywords play a critical role in finding semantic similarity: using them, the algorithm determines similarity at the sentence level. The algorithm is proposed especially for Nepali texts. A finite state machine is built from the polarity of the individual keywords, and its final state determines the polarity of the sentence. Two sentences are said to be coherent if both are negatively polarized or both are positively polarized; otherwise they are not. Coherence (similarity) is measured with context taken into consideration, so the semantic approach in this research is entirely context based: two sentences are said to be semantically similar if they bear the same context. The overall accuracy obtained with this algorithm is 90.16%.
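The polarity idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The keyword lexicon (English placeholders instead of Nepali words), the two-state transition table, the positive start state, and the names `KEYWORD_POLARITY`, `TRANSITIONS`, `sentence_polarity`, and `are_coherent` are all assumptions made here for illustration. The bi-directional logic and semantic ordering mentioned above are not attempted.

```python
from enum import Enum


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


# Hypothetical keyword lexicon (assumption): the paper targets Nepali text
# and its actual keyword list is not given in the abstract.
KEYWORD_POLARITY = {
    "good": Polarity.POSITIVE,
    "happy": Polarity.POSITIVE,
    "not": Polarity.NEGATIVE,
    "never": Polarity.NEGATIVE,
    "bad": Polarity.NEGATIVE,
}

# Assumed transition table: (current state, keyword polarity) -> next state.
# A negative keyword flips the state; a positive keyword leaves it unchanged.
TRANSITIONS = {
    (Polarity.POSITIVE, Polarity.POSITIVE): Polarity.POSITIVE,
    (Polarity.POSITIVE, Polarity.NEGATIVE): Polarity.NEGATIVE,
    (Polarity.NEGATIVE, Polarity.POSITIVE): Polarity.NEGATIVE,
    (Polarity.NEGATIVE, Polarity.NEGATIVE): Polarity.POSITIVE,
}


def sentence_polarity(sentence: str) -> Polarity:
    """Run the FSM over the keywords of a sentence and return its final state."""
    state = Polarity.POSITIVE  # assumed start state
    for token in sentence.lower().split():
        keyword_polarity = KEYWORD_POLARITY.get(token)
        if keyword_polarity is None:
            continue  # tokens that are not keywords do not change the state
        state = TRANSITIONS[(state, keyword_polarity)]
    return state


def are_coherent(first: str, second: str) -> bool:
    """Two sentences are treated as coherent when their final polarities match."""
    return sentence_polarity(first) is sentence_polarity(second)
```

Under these assumptions, `are_coherent("the food is not good", "the service is bad")` holds because both sentences end in the negative state, while pairing either with "the food is good" does not.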

Keywords

Artificial Intelligence; Natural Language Processing; Text Mining; Semantic Similarity; Finite State Machine

Share and Cite:

C. Sitaula and Y. Ojha, "Semantic Sentence Similarity Using Finite State Machine," Intelligent Information Management, Vol. 5 No. 6, 2013, pp. 171-174. doi: 10.4236/iim.2013.56018.

Conflicts of Interest

The authors declare no conflicts of interest.
