Reverse Engineering Tool Based on Unified Mapping Method (RETUM): Class Diagram Visualizations
1. Introduction
Understanding the intricate relationships between the source code components of a software system can be an arduous task. In recent years, several tools [1] have emerged to support program understanding, software maintenance, and reverse engineering activities. Most of these tools extract their information from the source code via static analysis, which covers a set of operations ranging from code parsing and fact extraction, through fact aggregation and querying, up to interactive visualization. Earlier reverse engineering tools met many of these requirements and were accepted by the software industry for design purposes.
The Reverse Engineering Tool (Figure 1) relies on the phases of a compiler front end. The lexical analyzer, or scanner, reads the source program as a character stream and groups logically related characters into units known as lexemes. In syntax analysis, the parser uses the token names taken from the token stream to produce a tree-like structure known as a syntax tree or parse tree. In semantic analysis, the semantic analyzer uses the parse tree and the symbol table to check the source program for semantic consistency with the language definition. Its main function is type checking, in which it verifies that each operator has operands of matching type. In the intermediate code generation phase, the parse tree representation of the source code is converted into a low-level, machine-like intermediate representation. The symbol table is a data structure used by the compiler to record and collect information about source program constructs such as variable names and their attributes (name, type, scope, and the storage space occupied by a variable).
A symbol table should be designed efficiently so that the compiler can locate the record for each token name quickly and transfer data to and from that record rapidly. The error handler is invoked whenever a fault occurs during compilation of the source program. Both symbol table management and error handling are associated with all phases of the compiler.
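As a rough illustration of the lexical analysis and symbol table phases described above, the following Java sketch groups a character stream into lexemes and records declared identifiers in a table. The class name MiniLexer, the token kinds and the small keyword set are assumptions made for this illustration only; they are not taken from RETUM.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a tiny scanner plus symbol table (not RETUM's code).
class MiniLexer {
    // Assumed keyword subset, just enough for the example.
    static final Set<String> KEYWORDS = Set.of("int", "class", "return");

    // Groups the character stream into lexemes tagged "KIND:text".
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c) || c == '_') {
                int start = i;
                while (i < src.length()
                        && (Character.isLetterOrDigit(src.charAt(i)) || src.charAt(i) == '_')) {
                    i++;
                }
                String word = src.substring(start, i);
                tokens.add((KEYWORDS.contains(word) ? "KEYWORD:" : "ID:") + word);
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) {
                    i++;
                }
                tokens.add("NUM:" + src.substring(start, i));
            } else {
                // Any other character is treated as a one-character symbol.
                tokens.add("SYM:" + c);
                i++;
            }
        }
        return tokens;
    }

    // Records each identifier that directly follows the type keyword "int".
    static Map<String, String> symbolTable(List<String> tokens) {
        Map<String, String> table = new LinkedHashMap<>();
        for (int i = 1; i < tokens.size(); i++) {
            String prev = tokens.get(i - 1);
            String cur = tokens.get(i);
            if (prev.equals("KEYWORD:int") && cur.startsWith("ID:")) {
                table.put(cur.substring(3), "int");
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("int count = 42;");
        System.out.println(tokens);
        System.out.println(symbolTable(tokens));
    }
}
```

A real front end would also track scope and storage size per entry; the map here keeps only name and type to stay short.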
When assessing the quality and maintainability of large C, C++ and Java code bases, tools are needed to extract several facts [2] from the source code. Such tools can be characterized by their language support, pre-processing, lexical analysis, parsing, repository, and extraction capabilities.
In this paper, we present our experience with the architecture of Imagix-4D, a source code analysis tool from Imagix Corporation [3] used primarily for understanding, documenting and evolving existing C, C++ and Java software. The technologies applied in Imagix-4D include full semantic source analysis; software visualization, which supports program comprehension; static data flow analysis-based verification, which detects problems in variable usage, task interactions and concurrency; and software metrics, which measure design quality and identify potential testing and maintenance issues. The Imagix-4D reverse engineering tool nevertheless has some inadequacies. First, it produces only an abstract class diagram, which is not easily understood by other developers and users. Second, it does not produce ER diagrams or sequence diagrams. In this paper we address only the first inadequacy.
2. Tool Selection Criteria
In this section we describe the applied tool selection criteria, the reasons why we selected particular tools for the study, their basic characteristics [4], and the feature comparison in Table 1.
Figure 1. Reverse Engineering Tool based on Unified Mapping Method (RETUM).
Table 1. Behavioral and analytical comparison of existing reverse engineering tools.
Tool Selection Criteria
Because there are numerous tools for reverse engineering purposes, it is not possible to analyze all of them in a single study. We therefore focus on tools with certain properties, detailed in Table 1: well-known, freely available tools that support the C, C++ or Java languages. These languages were selected because they are among the most commonly used and supported ones. The selected tools should also either be under active development or be related to scientific publications on software maintenance.
The C programming language is still very important in this context since it is used in numerous important legacy systems that are under maintenance. It is also the only language for which multiple empirical studies on information needs exist [1] [4]. Object orientation (OO) is important in the development of new systems, which will become the legacy systems of the future. The most commonly used OO languages include C++ and Java. Most reverse engineering tools support the C language, and some also support at least one OO language, most notably C++ or Java. Based on these basic properties, we select the Imagix-4D Reverse Engineering Tool environment.
3. Proposed Reverse Engineering Tool Based on Unified Mapping Method (RETUM)
The architecture in Figure 1 is proposed for applying reverse engineering to legacy C and C++ code, including class libraries, whether object-oriented or procedure-oriented. The code samples are first passed into the code analysis module, which takes code in various languages, separates it according to the type of keyword used, and stores the result in temporary storage. There has been considerable progress in code analysis for C, C++, Java, and COBOL. The code analysis phase parses source or intermediate code (e.g., byte code) and produces a database of code entities (e.g., functions and variables) and relationships (e.g., method invocations, number of calls, inheritance, interfaces, class associations, aggregations and object instantiations). From this, a symbol tree is constructed so that tokens can be analyzed correctly according to their use in the code. The tool then generates the tokens used for mapping; these tokens act as data extraction components for the source code. In the proposed system, a total of ten components need to be extracted for accurate mapping of the different entity relationships, classes and object instances.
After these components are correctly extracted by the UML mining module, a local parse tree is generated and the information is stored in a repository for further use. Direct mapping is possible after this phase, but to support customization the proposed work adds further features, such as a code annotation module in which the identified results are refined using two specific methods, Filtering and Multi-View. The result is then forwarded to the exporter, which plots the extracted patterns as a class diagram, sequence diagram or call graph.
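To make the repository and Filtering steps more concrete, the sketch below stores extracted relationships and returns one filtered view of them. It is only an assumed data model: the class FactRepository, the Relation record and the relationship kinds INHERITS and CALLS are hypothetical names, not the structures used inside RETUM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a fact repository with a filtering step.
class FactRepository {
    // One extracted relationship, e.g. "Employee INHERITS Person".
    static final class Relation {
        final String source;
        final String kind;
        final String target;

        Relation(String source, String kind, String target) {
            this.source = source;
            this.kind = kind;
            this.target = target;
        }

        @Override
        public String toString() {
            return source + " " + kind + " " + target;
        }
    }

    private final List<Relation> relations = new ArrayList<>();

    // Stores one relationship found during code analysis.
    void add(String source, String kind, String target) {
        relations.add(new Relation(source, kind, target));
    }

    // Filtering: keeps only the relationships of one kind, giving one "view".
    List<Relation> view(String kind) {
        List<Relation> result = new ArrayList<>();
        for (Relation r : relations) {
            if (r.kind.equals(kind)) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FactRepository repo = new FactRepository();
        repo.add("Employee", "INHERITS", "Person");
        repo.add("Client", "INHERITS", "Person");
        repo.add("Employee", "CALLS", "main");
        System.out.println(repo.view("INHERITS"));
    }
}
```

Under this model, a class diagram exporter would read the INHERITS view while a call graph exporter would read the CALLS view of the same repository.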
Our analysis suggests that the proposed tool will prove efficient and usable in terms of its language support (C++/C# and Java), its supported diagram types (Class and Activity), and its detection and mapping mechanism (various parameters for accurate mapping). After applying the updated concepts at the initial stage of the work, we expect the approach to generate unambiguous UML from source code and to be more accurate, easier to use and more complete.
4. Proposed Algorithm Reverse Engineering Tool
We propose the following algorithm for the design of the RETUM reverse engineering tool.
Step 1: Take legacy code (object-oriented or procedure-oriented) as input.
Step 2: Pass the legacy code samples into the code analysis module. This module takes code in various languages, separates it according to the type of keyword used, and stores the result in temporary storage. A symbol tree is constructed so that tokens can be analyzed correctly according to their use in the code.
Step 3: Take the output of the code analysis phase and generate tokens with the token generator (which generates the tokens used for mapping).
Step 4: Use these tokens as data extraction components for the source code. These components need to be extracted by UML mining for accurate mapping of the different entity relationships, classes and object instances.
Step 5: After these components are correctly extracted by the UML mining module, generate a local parse tree and store the information in the repository for further use.
Step 6: To support customization, refine the identified results in the code annotation module using two specific methods, Filtering and Multi-View.
Step 7: Forward the result to the exporter, which plots the extracted patterns as an object-oriented diagram or a procedure-oriented diagram.
We apply the above algorithm to the simplified generation of class diagrams.
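The extract-then-export flow of Steps 4 to 7 can be sketched in a few lines of Java. This is a minimal sketch under strong assumptions: it recognizes only simple "class A extends B" headers with a regular expression instead of a real parser, and it exports to Graphviz DOT text, which is a format chosen for this illustration. The class name InheritanceExporter and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the extract-then-export steps (not RETUM's code).
class InheritanceExporter {
    // Assumption: a regular expression is enough for simple class headers.
    private static final Pattern CLASS_HEADER =
            Pattern.compile("\\bclass\\s+(\\w+)(?:\\s+extends\\s+(\\w+))?");

    // Extraction: one DOT node line per class, one edge line per "extends".
    static List<String> extract(String source) {
        List<String> lines = new ArrayList<>();
        Matcher m = CLASS_HEADER.matcher(source);
        while (m.find()) {
            lines.add("  " + m.group(1) + ";");
            if (m.group(2) != null) {
                lines.add("  " + m.group(1) + " -> " + m.group(2) + " [arrowhead=empty];");
            }
        }
        return lines;
    }

    // Export: wraps the extracted lines in a DOT digraph.
    static String toDot(String source) {
        StringBuilder dot = new StringBuilder("digraph classes {\n");
        for (String line : extract(source)) {
            dot.append(line).append('\n');
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        String legacy = "class Person { }\n"
                + "class Employee extends Person { }\n"
                + "class Client extends Person { }\n";
        System.out.print(toDot(legacy));
    }
}
```

The printed text can be rendered with any Graphviz-compatible viewer. A regular expression will miss generics, nested classes and commented-out code, which is why the algorithm above relies on a parse tree instead.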
5. Algorithm for Class Diagram Visualizations
Step 1: Start with legacy code or source code as input.
Step 2: Take the specific Java file as input.
Step 3: The UML Doclet API processes the Java file. Any additional UMLGraph or javadoc arguments can be added at the end of the command line. The command reads the specification file and directly generates a diagram of the appropriate type.
This option provides the maximum flexibility. In order to run, javadoc needs access to tools.jar.
1. Specify the location of tools.jar as part of Java's classpath and specify the full name of the UMLGraph doclet as an argument to Java. This is an invocation example under Windows:
java -classpath "lib/UmlGraph.jar;c:\program files\java\jdk1.6.0_02\lib\Tools.jar" org.umlgraph.doclet.UmlGraph -package
and under Unix:
java -classpath '/usr/share/lib/UmlGraph.jar:/opt/java-1.6/lib/tools.jar' \
org.umlgraph.doclet.UmlGraph -package
2. Place the UmlGraph.jar file in a directory that also contains the Java SDK tools.jar, then run:
java -jar /path/to/UmlGraph.jar yourfile
Step 4: The UMLGraph and UML tool APIs extract the relevant data from the Java file.
javadoc -docletpath UmlGraph.jar -doclet org.umlgraph.doclet.UmlGraph -private
4.1 Add the command-line options umlgen (generates UML diagrams if the source documentation contains) and umltypegen (generates UML diagrams for all documented classes and interfaces).
4.2 Add the command-line option umlpackagegen (generates UML diagrams for all documented packages).
4.3 Add the command-line option umloverviewgen (generates project overview UML diagrams).
4.4 Add the command-line option umlautogen (generates all types of UML diagrams).
Step 5: After step 4, the Maven API is added by UML Doclet.
Step 6: The class diagram is generated and displayed to the user.
The above algorithm is used specifically for class diagram generation. It takes a Java file as input and produces graphical output; details are given in the appendix.
6. Conclusion
In this paper, we investigated various features of Imagix-4D and concentrated on its class diagram visualization. The class diagram produced by Imagix-4D is complex and not easy to understand. The proposed tool, RETUM, addresses this inadequacy and produces a simple, comprehensive class diagram. As an extension, we propose adding sequence diagram and ER diagram generation to the Imagix-4D Reverse Engineering Tool.
Appendix 1: Discussion and Enlightenment of Class Diagram Tool Phase
Class diagrams characterize the static data and class structure of Java source code. To achieve such a diagrammatic representation, translation rules are defined that transform Java syntax into a class diagram.
Figure 2 shows the output of our tool when the attach button is clicked. Note that the attachment accepts only files with the .java extension.
The dialog box in Figure 3 appears when the user clicks the convert button without selecting a Java file. It notifies the user to add a file; only after this is resolved does the procedure continue.
The dialog box in Figure 4 appears only when the attach button is clicked. Using the browse button, the user can add the appropriate file, i.e., it simply lets the user choose a file from documents.
Similarly, the dialog box in Figure 5 appears when a file is being attached; after the attach button is clicked, the Java file is uploaded.
When we click the convert button after attaching the required file (.java), a dialog box (Figure 6) reports that the diagram has been created and shows the destination address where the diagram was stored.
// Sample input: a superclass with two subclasses.
class Person {
    String Name;
    public static void main(String a[]){}
}
class Employee extends Person {
    public static void main(String a[]){}
}
class Client extends Person {
    public static void main(String a[]){}
}
Standard Class Diagram Generated by RETUM Tool
The class diagram generated by Imagix-4D (Figure 7) does not show the internal details of the classes, such as attributes, member functions, access modes and data types. It is therefore hard for novice developers and users to understand, which matters because their understanding plays a crucial role in software quality. Figure 8 shows the standard class diagram generated by the RETUM tool, obtained when the source code file is attached and then converted. This diagram shows the relationships among the classes, including the inheritance relationship between them. Note that the diagram shows a default constructor for each class even though none is written in the source code: when a class declares no constructor, the Java compiler supplies a default one, which runs when an object of the class is created. The diagram contains three classes named Person, Employee and Client. Person is the superclass, and Employee and Client are its subclasses. The Person class contains the Person() function, whereas the Employee class contains Employee() and main(); similarly, the Client class contains Client() and the main() function responsible for its execution.
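The internal details mentioned above (attributes, access modes, data types and the compiler-supplied default constructor) can be observed with standard Java reflection. The following sketch is only an illustration of that idea and does not claim to be how RETUM collects this information. The wrapper class ClassDetails and its describe method are hypothetical, and Person and Employee are nested copies of the appendix classes with main() omitted for brevity.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: listing class details with Java reflection.
class ClassDetails {
    // Nested copies of the appendix classes, without explicit constructors.
    static class Person {
        String Name;
    }

    static class Employee extends Person {
    }

    // Describes superclass, fields and constructors of a class as text lines.
    static List<String> describe(Class<?> type) {
        List<String> out = new ArrayList<>();
        out.add(type.getSimpleName() + " extends " + type.getSuperclass().getSimpleName());
        for (Field f : type.getDeclaredFields()) {
            String access = Modifier.toString(f.getModifiers());
            out.add("field " + (access.isEmpty() ? "package-private" : access)
                    + " " + f.getType().getSimpleName() + " " + f.getName());
        }
        // The default constructor appears here although the source declares none.
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            out.add("constructor " + type.getSimpleName()
                    + " (" + c.getParameterCount() + " parameters)");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe(Person.class)) {
            System.out.println(line);
        }
        for (String line : describe(Employee.class)) {
            System.out.println(line);
        }
    }
}
```

Reflection works on compiled classes, so a tool that starts from a .java file, as described in this appendix, would need to compile the file first or obtain the same facts from a parser.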
Figure 3. Dialog box alerting to add file.
Figure 7. Class diagram generated by Imagix-4D.
Figure 8. Class diagram generated by RETUM Tool.