Journal of Software Engineering and Applications

Volume 10, Issue 13 (December 2017)

ISSN Print: 1945-3116   ISSN Online: 1945-3124

Google-based Impact Factor: 1.22  Citations  h5-index & Ranking

Code Clone Detection Method Based on the Combination of Tree-Based and Token-Based Methods

HTML  XML Download Download as PDF (Size: 5570KB)  PP. 891-906  
DOI: 10.4236/jsea.2017.1013051    1,290 Downloads   2,961 Views  Citations

ABSTRACT

This article proposes the high-speed and high-accuracy code clone detection method based on the combination of tree-based and token-based methods. Existence of duplicated program codes, called code clone, is one of the main factors that reduces the quality and maintainability of software. If one code fragment contains faults (bugs) and they are copied and modified to other locations, it is necessary to correct all of them. But it is not easy to find all code clones in large and complex software. Much research efforts have been done for code clone detection. There are mainly two methods for code clone detection. One is token-based and the other is tree-based method. Token-based method is fast and requires less resources. However it cannot detect all kinds of code clones. Tree-based method can detect all kinds of code clones, but it is slow and requires much computing resources. In this paper combination of these two methods was proposed to improve the efficiency and accuracy of detecting code clones. Firstly some candidates of code clones will be extracted by token-based method that is fast and lightweight. Then selected candidates will be checked more precisely by using tree-based method that can find all kinds of code clones. The prototype system was developed. This system accepts source code and tokenizes it in the first step. Then token-based method is applied to this token sequence to find candidates of code clones. After extracting several candidates, selected source codes will be converted into abstract syntax tree (AST) for applying tree-based method. Some sample source codes were used to evaluate the proposed method. This evaluation proved the improvement of efficiency and precision of code clones detecting.

Share and Cite:

Ami, R. and Haga, H. (2017) Code Clone Detection Method Based on the Combination of Tree-Based and Token-Based Methods. Journal of Software Engineering and Applications, 10, 891-906. doi: 10.4236/jsea.2017.1013051.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.