Reverse Engineering Tool Based on Unified Mapping Method (RETUM): Class Diagram Visualizations

In this research paper, we evaluate an assortment of reverse engineering tools and investigate the characteristics of the Imagix-4D Reverse Engineering Tool in particular. On the basis of this investigation, we identify its inadequacies: it illustrates only an abstract Class Diagram, and it has no support for ER-Diagrams or Sequence Diagrams. We propose a Reverse Engineering Tool based on Unified Mapping Method (RETUM) for Class Diagram visualization, which overcomes the limitation of Imagix-4D that its class diagrams are intricate to read.


Introduction
Understanding the intricate relationships between the source code components of a software system can be an arduous task. In recent years, several tools [1] have emerged to support program understanding, software maintenance, and reverse engineering activities. Most of these tools extract their information mainly from the source code via static analysis. This includes a set of operations ranging from code parsing and fact extraction, through fact aggregation and querying, up to interactive visualization. Earlier Reverse Engineering Tools that were accepted by the software industry for design purposes met many of these requirements.
A Reverse Engineering Tool (Figure 1) builds on the phases of a compiler. Lexical analysis: the lexical analyzer, or scanner, reads the source program as a character stream and groups logically related characters into units known as lexemes. Syntax analysis: the parser uses the token names taken from the token stream to generate a tree-like structure known as a syntax tree or parse tree. Semantic analysis: the semantic analyzer uses the parse tree and the symbol table to check the source program for semantic consistency with the language definition. Its main function is type checking, in which it checks whether each operator has operands of matching type. Intermediate code generation: the parse tree representation of the source code is converted into a low-level, machine-like intermediate representation.
The symbol table is a data structure used by the compiler to record and collect information about source program constructs, such as variable names and their attributes (name, type, and scope), which also provide information about the storage space a variable occupies. A symbol table should be designed efficiently so that the compiler can locate the record for each token name quickly and transfer data to and from the records rapidly. The error handler is invoked whenever a fault occurs during compilation of the source program. Both symbol table management and error handling are associated with all phases of the compiler.
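The lexical analysis and symbol table steps described above can be illustrated with a minimal sketch. The token categories, the tiny C-like input, and the helper names `tokenize` and `build_symbol_table` are assumptions made for illustration; they are not taken from any of the tools discussed in this paper.

```python
import re

# Hypothetical token categories for a tiny C-like language (illustrative only).
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:int|float|return)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("OP", r"[=+\-*/;(){}]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group the character stream into (token_name, lexeme) pairs.

    Characters that match no category are silently ignored in this sketch.
    """
    tokens = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    return tokens


def build_symbol_table(tokens, scope="global"):
    """Record the type and scope of each variable declared as '<type> <name>'."""
    table = {}
    for (kind, lexeme), (next_kind, next_lexeme) in zip(tokens, tokens[1:]):
        if kind == "KEYWORD" and lexeme in ("int", "float") and next_kind == "IDENT":
            table[next_lexeme] = {"type": lexeme, "scope": scope}
    return table


tokens = tokenize("int count = 42; float ratio = count;")
symbols = build_symbol_table(tokens)
```

A dictionary keyed by name is used for the symbol table because it gives the quick record lookup that the text asks for; a real tool would also track nested scopes.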
When assessing the quality and maintainability of large C, C++ and Java source code bases, tools are needed for extracting facts [2] from the source code. Such tools can be compared by their language support, pre-processing, lexical analysis, parsing, repository, and extraction capabilities.
In this paper, we present our experience with the architecture of Imagix-4D, a source code analysis tool from Imagix Corporation [3] that is used primarily for understanding, documenting, and evolving existing C, C++ and Java software. The technologies applied in Imagix-4D include full semantic source analysis. Software visualization supports program comprehension. Verifications based on static data flow analysis detect problems in variable usage, task interactions, and concurrency. Software metrics measure design quality and identify potential testing and maintenance issues. The Imagix-4D Reverse Engineering Tool has some inadequacies. It illustrates only an abstract Class Diagram, which is not easily understood by other developers and users, and it does not illustrate ER-Diagrams or Sequence Diagrams. In this paper we address only the first of these inadequacies.

Tool Selection Criteria
In this section we describe the applied tool selection criteria, the reasons why we selected particular tools for the study, and their basic characteristics [4]. Table 1 lists the features of each tool.
Tool: Rigi
Advantages:
2) The tool provides supporting capabilities (e.g. filters, metrics, groups, etc.) and is extensible in some way.
3) It is the only tool that allows saving the generated views and representations.
Disadvantages:
1) The major drawback of Rigi is the provided parser, which can only parse functions and structured data types.
2) This limits the views that can be generated mainly to functional views (call graph).
Tool: Solidsx
Advantages:
2) The most important feature for user acceptance of Solidsx is ease of integration.
3) Solidsx was used in several industrial reverse engineering and program comprehension projects.
Disadvantages:
1) The tool is too generic; it needs customized wizards that address specific questions [13].
Tool: Dalli [14]
Criteria: compliance, full coverage, completeness, scalability, portability, language independence.
Advantages:
1) Dalli supports recovery because its parsing and lexical techniques are highly versatile.
2) Dalli is versatile and more lightweight than other base techniques.
Disadvantages:
1) It provides low accuracy.
2) Dalli itself cannot extract the complete source code, as there is no one tool that can successfully extract the complete source code or architecture model.
3) Dalli requires preprocessing, as it allows an analyst to interact with the recovered information by accessing the results of the reconstruction effort.
Tool: GUPRO
Advantages:
2) The conceptual model implies the structure of the graph-based GUPRO repository. Source code is extracted into the repository, and the repository graphs can be viewed with an integrated querying and browsing facility.
Disadvantages:
1) For large software systems, not all facts from the source can be loaded at once because of the limited repository size; fact extractors for multi-language systems follow a four-step parsing approach [17].
Tool: DEFCTO [18]
Criteria: Fault
Advantages:
3) It is compliant because it is highly adoptable by users, as it is a professional tool covering reverse engineering in a single package [20].
Disadvantages:
1) It is costly and not easily available.
Tool: Imagix-4D
Advantages:
1) It provides views to rapidly check and systematically study software.
2) It presents key information on software in a 3D graphical format, which enables the user to quickly focus on particular areas of interest.
3) It helps software developers comprehend complex or legacy C, C++ and Java source code.
4) Using Imagix-4D to reverse engineer and analyze code speeds up development, enhancement, reuse, and testing.
5) It eliminates bugs due to faulty understanding.
6) It enables rapid checks or systematic study of software on any level, from its high-level architecture to the details of its build, class, and function dependencies.
7) It supports visual exploration of a wide range of aspects of the software (control structures, data usage, and inheritance), all based on its precise static analysis of the source.
8) Its querying capabilities help users find and focus on the relevant portions of the source code [21].
Disadvantages:
1) The disadvantage of smaller graphs is that highly connected graphs become complicated and unreadable.
Because there are numerous tools for reverse engineering purposes, it is not possible to analyze all of them in a single study. We decided to focus on tools with the following properties (Table 1 gives the details): well-known, freely available tools that support the C, C++ or Java languages. These languages were selected because they are among the most commonly used and most widely supported. The selected tools should also either be under active development or be related to scientific publications on software maintenance.
The C programming language is still very important in this context, since it is used in numerous important legacy systems that are under maintenance. It is also the only language for which multiple empirical studies on information needs exist [1] [4]. Object-orientation (OO) is important in the development of new systems, which will be the legacy systems of the future. The most commonly used OO languages include C++ and Java. Most reverse engineering tools support the C language, and some of them also support at least some OO languages, most notably C++ or Java. Based on these properties, we selected the Imagix-4D Reverse Engineering Tool environment.

Proposed Reverse Engineering Tool Based on Unified Mapping Method (RETUM)
The architecture in Figure 1 is proposed for applying reverse engineering to legacy code: C and C++ class libraries, whether object-oriented or procedure-oriented. Initially, the code samples are passed into the code analysis module. This module takes code in various languages, separates it according to the type of keyword used, and stores it in temporary storage. There has been considerable progress in the code analysis phase for C, C++, Java, and COBOL. The code analysis phase parses source or intermediate code (e.g., byte code) and produces a database of code entities (e.g., functions and variables) and relationships (e.g., method invocations, number of calls, inheritance, interfaces, class associations, aggregations, and object instantiation). From here, a symbol tree is constructed so that tokens are analyzed correctly according to their use in the code. The tool then generates various tokens for mapping. These tokens act as data extraction components from the source code. In the proposed system, a total of ten components need to be extracted for accurate mapping of the different entity relationships, classes, and object instances.
After these components are correctly extracted by the UML mining module, a local parse tree is generated and the information is stored in a repository for further use. Direct mapping is possible after this phase, but to accommodate custom requirements the proposed work adds further features, such as a code annotation module in which the identified results are refined using two specific methods: Filtering and Multi-View. The result is then forwarded to the exporter, which plots the extracted patterns as a Class diagram, Sequence diagram, or Call graph.
Our analysis suggests that the proposed tool will prove efficient and usable in terms of its language support (C++/C# and Java), its supported diagram types (Class and Activity), and its detection and mapping mechanism (various parameters for accurate mapping). After applying the updated concepts at the initial level of the work, we expect the approach to generate unambiguous UML from source code and to be more accurate, easier to use, and more complete.
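To make the code analysis phase concrete, the following minimal sketch extracts class entities and their inheritance and interface relationships from Java source text. It is an illustration under stated assumptions, not the RETUM implementation: the regular expression `CLASS_RE`, the function `extract_facts`, and the dictionary layout of the repository are hypothetical, and a regular expression handles only simple declarations, where a real tool would use a full parser.

```python
import re

# Matches simple declarations such as
#   class Circle extends Shape implements Drawable, Comparable {
# Generics, annotations, and nested classes are outside this sketch.
CLASS_RE = re.compile(
    r"\bclass\s+(\w+)"
    r"(?:\s+extends\s+(\w+))?"
    r"(?:\s+implements\s+([\w\s,]+?))?"
    r"\s*\{"
)


def extract_facts(source):
    """Build a small repository of entities and (source, relation, target) facts."""
    repository = {"entities": [], "relationships": []}
    for name, parent, interfaces in CLASS_RE.findall(source):
        repository["entities"].append(name)
        if parent:
            repository["relationships"].append((name, "inherits", parent))
        for interface in interfaces.split(","):
            interface = interface.strip()
            if interface:
                repository["relationships"].append((name, "implements", interface))
    return repository


SAMPLE = """
class Shape { }
class Circle extends Shape implements Drawable, Comparable { }
"""
facts = extract_facts(SAMPLE)
```

Storing relationships as triples keeps the repository independent of any one diagram type, so the same facts can later be mapped to a class diagram or another view.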

Proposed Algorithm Reverse Engineering Tool
We propose an algorithm for the design of the RETUM reverse engineering tool.
Step 1: Take legacy code (object-oriented or procedure-oriented) as input.
Step 2: Pass the legacy code samples into the code analysis module. This module takes code in various languages, separates it according to the type of keyword used, and stores it in temporary storage. A symbol tree is then constructed so that tokens are analyzed correctly according to their use in the code.
Step 3: Take the output of the code analysis phase and generate tokens with the token generator, which produces the various tokens used for mapping.
Step 4: These tokens act as data extraction components from the source code. The UML mining module extracts the components needed for accurate mapping of the different entity relationships, classes, and object instances.
Step 5: After these components are correctly extracted by the UML mining module, a local parse tree is generated and the information is stored in the repository for further use.
Step 6: To accommodate custom requirements, the proposed work adds further features, such as a code annotation module in which the identified results are refined using two specific methods: Filtering and Multi-View.
Step 7: The result is then forwarded to the exporter, which plots the extracted patterns as an object-oriented diagram or a procedure-oriented diagram.
We realize the above algorithm in a simplified adaptation for class diagrams.
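The refinement in Step 6 can be sketched as follows. The paper does not specify how Filtering and Multi-View operate internally, so the behaviour shown here is an assumption: Filtering drops facts that mention excluded entities, and Multi-View groups the remaining facts by the diagram type they belong to. The fact triples, `filter_facts`, `split_into_views`, and `VIEW_RELATIONS` are all hypothetical names.

```python
from collections import defaultdict

# Hypothetical facts as (source, relation, target) triples.
FACTS = [
    ("Circle", "inherits", "Shape"),
    ("Square", "inherits", "Shape"),
    ("Canvas", "calls", "Circle.draw"),
    ("Canvas", "aggregates", "Shape"),
    ("Logger", "calls", "System.out"),
]

# Assumed grouping of relation kinds into diagram views.
VIEW_RELATIONS = {
    "class_diagram": {"inherits", "implements", "aggregates"},
    "call_graph": {"calls"},
}


def filter_facts(facts, excluded_names=()):
    """Filtering: drop facts whose source or target starts with an excluded name."""
    def mentions_excluded(fact):
        source, _, target = fact
        return any(source.startswith(n) or target.startswith(n) for n in excluded_names)

    return [fact for fact in facts if not mentions_excluded(fact)]


def split_into_views(facts):
    """Multi-View: derive one fact list per diagram type from the same fact base."""
    views = defaultdict(list)
    for fact in facts:
        for view, relations in VIEW_RELATIONS.items():
            if fact[1] in relations:
                views[view].append(fact)
    return dict(views)


refined = filter_facts(FACTS, excluded_names=("System", "Logger"))
views = split_into_views(refined)
```

Each view can then be handed to the exporter of Step 7 separately, which keeps the class diagram free of call edges and vice versa.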

Algorithm for Class Diagram Visualizations
Step 1: Start with legacy code or source code as input.
Step 2: Take the specific Java file as input.
Step 3: The UML Doclet API processes the Java file. (Any additional UMLGraph or javadoc arguments can be added at the end of the command line. This command reads the specification file (e.g. Test.java) and directly generates a diagram of the appropriate type.)
This option provides the maximum flexibility. In order to run, javadoc needs access to tools.jar. 1. Specify the location of tools.jar as part of Java's classpath and specify the full name of the UMLGraph doclet as an argument to Java. This is an invocation example under Windows: java -classpath "lib/UmlGraph.jar;c:\program files\java\jdk 1.
Step 5: After step 4, the Maven API is added by the UML Doclet.
Step 6: The class diagram is generated and displayed to the user. The above algorithm is used specifically for class diagram generation; it takes a Java file as input and produces graphical output, as detailed in the appendix.
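The final export step can be illustrated with a short sketch that turns a class model into Graphviz DOT text. This is not the UMLGraph doclet and makes no claim about its internals; the function `to_dot`, the input dictionaries, and the choice of arrow styles are assumptions made for illustration.

```python
def to_dot(classes, relationships):
    """Render classes and (child, relation, parent) triples as Graphviz DOT text.

    `classes` maps a class name to a list of member strings. Member strings
    must not contain the record metacharacters {, }, |, < or >.
    """
    lines = ["digraph ClassDiagram {", "  node [shape=record];"]
    for name, members in classes.items():
        # "\l" left-justifies each member line inside the record node.
        body = "".join(member + "\\l" for member in members)
        lines.append(f'  {name} [label="{{{name}|{body}}}"];')
    for child, relation, parent in relationships:
        # A hollow triangle marks inheritance; other relations get a plain arrow.
        arrow = "onormal" if relation == "inherits" else "vee"
        lines.append(f"  {child} -> {parent} [arrowhead={arrow}];")
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(
    {"Shape": ["area()"], "Circle": ["radius", "area()"]},
    [("Circle", "inherits", "Shape")],
)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpng diagram.dot -o diagram.png`, where the file names are placeholders.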

Conclusion
In this research paper, we investigated various features of Imagix-4D and concentrated on its class diagram visualization. The class diagram visualization in Imagix-4D is complex and not easy to understand. The proposed tool, RETUM, addresses this inadequacy and illustrates a simple, comprehensible Class Diagram. We will also propose an extension of the Imagix-4D Reverse Engineering Tool to draw Sequence Diagrams and ER-Diagrams as extended features.

Figure 3. Dialog box alerting to add file.

Table 1. Behavioral and analytical comparison of existing reverse engineering tools.