An Effective Network Traffic Data Control Using Improved Apriori Rule Mining

The increasing use of the internet requires an efficient system for communication. To serve internet users effectively, the shortest routing path is usually preferred for forwarding data, depending on the nature of their queries. However, when too much data follows the same path, a bottleneck forms in the traffic, which leads to data loss or delivers irrelevant data to users. In this paper, a Rule Based System using Improved Apriori (RBS-IA) rule mining framework is proposed to monitor the occurrence of traffic over the network and to control it. The RBS-IA framework integrates traffic control with a decision making system to improve internet usage. First, the network traffic data are analyzed, and the incoming and outgoing data are processed using the apriori rule mining algorithm. After the set of rules is generated, the network traffic condition is analyzed. Based on the traffic conditions, a decision rule framework derives suitable rules and assigns them to the appropriate states of the network. The decision rule framework improves the effectiveness of network traffic control by updating the traffic condition states to identify the relevant route for packet data transmission. The experimental evaluation uses the Dodgers loop sensor data set from the UCI repository to assess the proposed RBS-IA rule mining framework. The evaluation shows that the proposed framework provides a significant improvement in managing network traffic control. The RBS-IA framework is evaluated on the accuracy of the decisions obtained, the interestingness measure and the execution time.


Introduction
In the internet, the routing path is one of the significant characteristics to analyze for efficient communication between users. The router is the piece of internet equipment that carries and forwards data during communication. Most existing approaches employ a direct path algorithm for communication between the sender and receiver of packet data. These algorithms send the packet directly to its destination, but they do not take traffic into account. Consequently, the direct path may not be the best solution when the path preferred by the router carries excess traffic.
Measuring network traffic has become an important research area and a problem that has to be solved. In particular, predicting network traffic during the daytime can help direct the route path, measure network effectiveness and recognize paths that carry no load. As a result, several researchers have employed rule mining frameworks to forecast daytime network traffic and to generate a representation for supervising the network path and increasing efficiency.
Traffic control is one of the most important areas in which decisive data concerning the well-being of society has to be verified and resolved. Different features of a traffic scheme, such as vehicle accidents, traffic level and attention, are checked at different stages. In this connection, the increasing number of casualties resulting from traffic accidents is an area of concern. Applying data mining techniques to traffic data records can help measure the characteristics of user behavior, since traffic conditions can be highly associated with different injury severities. This can help decision makers design better traffic safety control guidelines.
The objective of rule mining is to recognize associations among items in databases such as market basket databases. In network traffic data, a record consists of values for specific network attributes. These attributes include the source IP address, the destination IP address, the source and destination ports, the protocol type, the start time and the finish time. Mining rules from network traffic data can help discover the standard patterns of network communication.
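The record structure listed above can be illustrated with a minimal Python sketch. The class name `FlowRecord`, the field names, the time unit and the choice of which attributes are turned into items are all assumptions made for illustration; the paper does not prescribe an encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One network traffic record (hypothetical field names)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    start_time: float  # seconds, assumed unit
    end_time: float

    def to_items(self):
        """Encode selected attributes as attribute=value items for rule mining."""
        return frozenset({
            f"src_ip={self.src_ip}",
            f"dst_ip={self.dst_ip}",
            f"dst_port={self.dst_port}",
            f"protocol={self.protocol}",
        })


record = FlowRecord("10.0.0.5", "10.0.0.9", 51514, 443, "tcp", 0.0, 2.5)
items = record.to_items()
```

Each record then becomes one transaction, so that associations between attribute values can be counted in the same way as items in a market basket.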
Traffic measurements such as speed and data occupancy have been recognized as significant characteristics in traffic control and information management schemes. With these traffic features, it is possible to build models for forecasting and extrapolating upcoming traffic states. In general, such models have a strong influence on decision making. In the real world, however, traffic data are highly complex, and the high dimensionality of the data makes standard statistical methods inadequate for traffic forecasting and management.
To control traffic data effectively over the network environment, this work implements a rule mining framework using the improved apriori algorithm. In the first part of the work, rules are derived based on the traffic condition state, which comprises i) preprocessing network traffic data and ii) generating logical rules using the improved apriori rule mining framework. To provide a decision making framework that improves internet usage, the second part of the work comprises iii) applying rules based on the traffic condition using the decision rule framework and iv) improving traffic control systems.
The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 describes the network traffic data monitoring system, which integrates traffic control with a decision making system to improve internet usage using the improved apriori rule mining algorithm. Section 4 presents the experimental setting with a description of the parameters. Section 5 presents the results of evaluating the proposed RBS-IA framework. Finally, Section 6 gives concluding remarks.

Literature Review
Association rules are used to identify patterns or relations among the attributes of a database and are important for the analysis of network traffic. Researchers have proposed several association rule algorithms for network traffic monitoring and traffic data control. The Scalable Network Traffic Monitoring and Analysis System (SNTMAS) in [1] monitors and analyzes network traffic for both intranet and internet traffic. However, its network performance is not efficient. The Wireless Monitoring and Shielding Technique (WMST) [2] was developed to monitor network data transmission, allowing users to understand the state of a device. However, it does not control the network traffic effectively.
Traffic measurement over an entire network faces many challenges because of the rapid growth of network size. Traffic Congestion Evolution Prediction Using Deep Learning in [3] developed a data-driven method to predict the onset and evolution patterns of network traffic congestion. However, its training and prediction accuracy are not at the required level. The study in [4] examined the effect of network traffic on the detection of flooding DoS attacks, where a reduced analysis rate decreased Snort's ability to detect and prevent such attacks. Traffic Flow Analysis Using Mobile Phone Data in [5] developed a method based on call detail records of mobile phones to assess the composition of actual traffic flow. However, the dominant share of trips by all road users is not related to workplace-home movement. Random Matrix Theory (RMT) and Principal Components Analysis (PCA) were applied in [6] to monitor and analyze large-scale traffic patterns in the internet. However, real networks with hundreds of routers are typically large in scale. The Intelligent Decision Support System (IDSS) in [7] was designed for traffic congestion management and reduces the dependence on expertise. In addition, the new architectural model designed in [8] contains multiple loop delays that increase throughput and maintain efficient use of the buffer under Poisson traffic loading. However, the increase in throughput is sustained only up to a certain value of incoming traffic. Information theory and data mining techniques were used in [9] to extract knowledge of network traffic behavior at the packet level and the flow level, which can be applied to traffic profiling in intrusion detection systems. However, the effect of variation in the slot duration is not considered. A possible approach to dynamic traffic control allowing for variable route choices was summarized in [10]. The control adjustments used in this method provide a stable dynamical system and maximize throughput. However, the capacity of dynamic networks is not maximized. In [11], the author extends the splitting rate re-routing framework to include the dynamic re-assignment of green time in addition to traffic flows. However, attacks related to web services were not addressed.
An efficient mining algorithm in [12] discovered the set of patterns within a reasonable time frame. Given the set of patterns produced by data mining approaches, how to apply and update these patterns is still an open research issue. Fuzzy association rule mining has been used to address two important problems. In [13], the author provided a design for mining indirect weighted association rules and a detection algorithm for mining both direct and indirect weighted fuzzy association rules by combining three extensions. The author addressed this problem by replacing the minimum support with a weighted minimum support so as to preserve the downward closure property of frequent itemsets under the weighted minimum support. A fuzzy data mining algorithm was presented to detect fuzzy association rules for weighted quantitative data [14]. This was expected to be more practical and realistic than crisp association rules. The technique satisfies the downward closure property, which also reduces execution time.
Authentication validates a user's identity before access to web services is granted. To improve the security of web services, several researchers have used different types of authentication protocols to prevent malicious access. The paper [15] concentrates on the use of packet capturing technologies such as WinPcap and JPCAP for enterprise network traffic monitoring and reporting. In [16], an improved algorithm, IAA, based on the Apriori algorithm was presented, motivated by the fact that, with the rapid growth in global information, the efficiency of association rule mining (ARM) has been a problem for many years. IAA adopts a new count-based method to prune candidate itemsets and uses generation records to reduce the total number of data scans. Experiments show that the algorithm outperformed the original Apriori and some other existing ARM techniques.
Agent Based Network Traffic Monitoring was presented in [17] to monitor the network for problems caused by overloaded or crashed servers, network connections or other devices. In [18], issues related to network traffic management were addressed. The World Wide Web is a significant source of data, obtained either from web content or from web usage and collected on a regular basis by servers around the world. In [19], the author designed an efficient system to predict traffic flow during peak hours using machine learning techniques on aerial images obtained from geospatial data. The predicted traffic flow was also shown to be better than that of state-of-the-art methods.
With a number of soft real-time applications in existence, including e-commerce and target tracking, user requests must be processed in a timely manner regardless of the data traffic being observed. In [20], an approach was designed to predict the incoming and outgoing data rate in a network system using association rule discovery.

Proposed Methodology
The rule based system for improving network traffic data control is built on the improved apriori rule mining framework. The entire process of the rule based system for network traffic control is shown in Figure 1.
The rule based system includes both the logical rule mining technique and the decision rule framework to provide a continuous flow of data in the network environment. It operates in two phases. Since network traffic occurs under different conditions, the first phase derives the logical set of rules from the extracted network traffic data. In the second phase, the decision rule framework applies the suitable rule to the appropriate traffic state.
The processes followed in the Rule Based System using Improved Apriori (RBS-IA) rule mining framework for network traffic control are i) preprocessing network traffic data, ii) generating logical rules using the improved apriori rule mining framework, iii) applying rules based on the traffic condition using the decision rule framework and iv) improving traffic control systems. The processes involved in designing the framework are detailed below.

Preprocessing Network Traffic Data
The first process in the Rule Based System using Improved Apriori (RBS-IA) rule mining framework is preprocessing the network traffic data. Real-world data are usually incomplete, noisy and inconsistent, which are general properties of real databases and data warehouses. These properties arise from different factors, and the required attributes of a given dataset may not be directly available. In addition, the recording or updating of the data may have been overlooked. Data may also be missing from the dataset for certain attributes of some tuples. As a result, the network traffic data are preprocessed in several steps, namely data cleaning, data transformation, data integration and data reduction.
Preprocessing the network traffic data in RBS-IA restricts the data to the most required attributes using cleaning algorithms. Data transformation in the preprocessing stage provides a utility with a set of procedures that update the original form of the data. In RBS-IA, the transformation divides the continuous numerical domain into a set of time intervals, yielding the discrete attributes needed by the algorithms. Data integration allows the user to obtain a clear set of data for faster processing, and data reduction supports the identification of new and interesting associations among the obtained features.
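The cleaning and transformation steps can be sketched in a few lines of Python. This is a minimal illustration, not the preprocessing used in RBS-IA: it assumes that records arrive as (timestamp string, count) pairs, that a count of -1 marks a missing reading, and that timestamps follow the format `%m/%d/%Y %H:%M`. The function name `preprocess` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

MISSING = -1  # assumed marker for a missing reading


def preprocess(records, fmt="%m/%d/%Y %H:%M"):
    """Drop missing readings and sum counts into 1-hour intervals.

    records: iterable of (timestamp_string, count) pairs.
    Returns a dict mapping (date, hour) -> total count in that hour.
    """
    hourly = defaultdict(int)
    for stamp, count in records:
        if count == MISSING:  # data cleaning: skip missing values
            continue
        ts = datetime.strptime(stamp, fmt)
        # data transformation: discretize time into hourly intervals
        hourly[(ts.date().isoformat(), ts.hour)] += count
    return dict(hourly)


sample = [
    ("4/10/2005 13:00", 12),
    ("4/10/2005 13:05", -1),
    ("4/10/2005 13:10", 9),
    ("4/10/2005 14:00", 20),
]
hourly_counts = preprocess(sample)
```

Summing into hourly buckets is one possible reduction; averaging or taking the maximum per hour would fit the same structure.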

Generating the Logical Rules Using Improved Apriori Rule Mining Framework
Once the network traffic data are preprocessed, logical rules have to be generated. When rules are derived in the rule mining framework, each item in the set is either considered or not considered. As a result, an inference over the set of items is assigned a value of either true or false in the RBS-IA rule mining framework. The presence of the itemset in a given inference refers to the records stored in the network traffic transactional database. Using the improved apriori rule mining [16], the itemset is then mapped to the appropriate network traffic data in the dataset. A derived rule in RBS-IA is mapped to an implication only if it has a true set of values.
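For orientation, the level-wise search that underlies apriori rule mining can be sketched as follows. This is plain Apriori with the usual join and prune steps, not the improved variant IAA of [16]; the item encoding (`hour=13`, `state=T5`) and the function name `apriori` are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # fraction of transactions that contain every item of the itemset
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # level 1 candidates
    frequent = {}
    k = 1
    while current:
        level = {}
        for c in current:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # join step: combine frequent (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # prune step: every (k-1)-subset must itself be frequent
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


transactions = [
    {"hour=13", "state=T5"},
    {"hour=13", "state=T5"},
    {"hour=13", "state=T3"},
    {"hour=02", "state=T1"},
]
frequent = apriori(transactions, min_support=0.5)
```

With a minimum support of 0.5, only `hour=13`, `state=T5` and their combination survive, which is the kind of itemset from which a rule such as hour=13 → state=T5 would be formed.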
Using the improved apriori rule mining framework, the occurrence of network traffic data is monitored. To start with, RBS-IA classifies network traffic into five categories: i) low level network traffic (T1), ii) above low level network traffic (T2), iii) medium level network traffic (T3), iv) medium high level network traffic (T4) and v) high level network traffic (T5). Network traffic in RBS-IA is said to occur in one of these traffic condition states. The traffic condition states are assumed to be uniformly distributed. With the help of improved apriori rule mining, the associations between itemsets are then defined and processed. The rules extracted from the network traffic data are specified on the basis of 1-hour periods. The network traffic prediction using the improved apriori rule mining framework is shown in Figure 2.
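A minimal sketch of how hourly traffic counts might be mapped onto the five states is given below. The paper does not state the thresholds that separate T1 to T5, so the equal-width binning between the minimum and maximum observed count is purely an assumption, as is the function name `label_states`.

```python
STATES = ["T1", "T2", "T3", "T4", "T5"]  # low ... high, as in the text


def label_states(counts):
    """Map each hourly count to one of five equal-width traffic states."""
    lo, hi = min(counts), max(counts)
    width = (hi - lo) / len(STATES) or 1  # avoid zero width if all equal
    labels = []
    for c in counts:
        # the maximum value would fall just past the last bin, so clamp it
        idx = min(int((c - lo) / width), len(STATES) - 1)
        labels.append(STATES[idx])
    return labels


labels = label_states([0, 10, 25, 30, 50])
```

Equal-frequency (quantile) bins would be an alternative if each state is meant to cover the same number of observations.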
The set of logical rules obtained from improved apriori rule mining is used in RBS-IA to monitor the network traffic data. The logical rules are then processed with the pseudo implications. Let us consider a time interval t' in which the network traffic data are observed over the network environment. Consider the sample network traffic dataset in Table 1.
For the above sample of network traffic data in RBS-IA, the rules are derived in the form LHS → RHS. For each rule, the improved apriori rule mining obtains the support (sup) and confidence (con) over the N itemsets. The confidence value relates the time (T) to the occurrences of traffic patterns:
$$con(LHS \rightarrow RHS) = \frac{sup(LHS \cup RHS)}{sup(LHS)}$$
where LHS and RHS are the obtained results for the N itemsets. Similarly, the support value relates the time (T) to the traffic conditional state (TC) over the n states:
$$sup(T \rightarrow TC) = \frac{\left|\{\text{records containing both } T \text{ and } TC\}\right|}{N}$$
The equations above describe the set of derived rules for monitoring the network traffic at 1 PM. To derive the set of rules from the network traffic data, the improved apriori rule mining uses the confidence and support values. Following this, several rule selection criteria have been derived for RBS-IA: i) choose the rule with maximum confidence, ii) choose the rule with maximum support and iii) choose the rule whose confidence and support values are equal.
Under the conditions specified above, the rules are defined as in [20]: 1) Support (time → traffic conditional state) is the probability that the network traffic has both the time and the traffic conditional values; 2) Confidence (time → traffic conditional state) is the probability that the network traffic is in the traffic conditional state given that it occurs at that time.
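The two measures can be computed directly from hourly observations, as the following Python sketch shows. It assumes that each observation is an (hour, state) pair and uses the standard definitions of support and confidence; the function names `time_state_rules` and `pick_rule` are hypothetical.

```python
from collections import Counter


def time_state_rules(records):
    """Compute support and confidence for every rule hour -> state.

    records: list of (hour, state) pairs, one per observation.
    Returns {(hour, state): (support, confidence)}.
    """
    n = len(records)
    pair_counts = Counter(records)
    hour_counts = Counter(hour for hour, _ in records)
    return {
        (hour, state): (count / n, count / hour_counts[hour])
        for (hour, state), count in pair_counts.items()
    }


def pick_rule(rules, hour, by="confidence"):
    """Choose the rule for `hour` with maximum confidence or support."""
    idx = 1 if by == "confidence" else 0
    candidates = {k: v for k, v in rules.items() if k[0] == hour}
    return max(candidates, key=lambda k: candidates[k][idx])


records = [(13, "T5"), (13, "T5"), (13, "T3"), (2, "T1")]
rules = time_state_rules(records)
best = pick_rule(rules, hour=13)
```

In this toy sample the rule 13 → T5 has support 2/4 and confidence 2/3, so it is selected for 1 PM under either criterion.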
The confidence and support values obtained by the network traffic monitoring system for RBS-IA are shown in Table 2.
With these confidence and support values, the logical rules are processed in RBS-IA and used as the basis for network traffic data control.

Decision Rule Mining Framework
The rules obtained using improved apriori rule mining are applied through the decision rule mining framework. Because the system so far only monitors the network traffic data, the decision rule mining framework is applied in RBS-IA to enhance the filtering process. It decides which rules to apply based on the traffic condition. A decision rule for network traffic data X consists of a condition part $dr(X)$, which describes the traffic condition of the network, and a decision part $\alpha$. The condition part of the decision rule in RBS-IA is expressed as a combination of elementary conditions on distinct attributes. An elementary condition has the basic form $X_i \in S_i$, where $X_i$ is the value of an object X (including time, date, incoming packet data, usage of data and traffic condition state) on attribute i and $S_i$ is a subset of the domain of this attribute. The condition part of the decision rule is written as $X_i \geq S_i$ or $X_i \leq S_i$ for the quantitative attributes in the itemset (time, date, incoming packet data, usage of data) and as $X_i = S_i$ for the qualitative attribute (traffic condition state), where $S_i$ is determined from the domain of attribute i. The decision (or response) $\alpha$ is a real value assigned to the value expressed as $X_i$. Depending on how the rules are combined, the interpretation of $\alpha$ differs. With this set, the traffic conditions are analyzed and the prediction of the rule formation is updated accordingly. The pseudo code below describes the entire process of predicting the occurrence of network traffic data based on the improved apriori rule mining framework.
The above pseudo code describes how the rule based system is applied to network traffic data prediction using the improved apriori rule mining framework. The RBS-IA framework first extracts the set of items in the dataset. It then selects those items for generating logical rules using improved apriori rule mining. After that, it computes the support and confidence values of each logical rule to form the set of logical rules. With the derived logical rules, the RBS-IA framework decides which rule to apply based on the traffic condition state in order to control traffic over the internet, which improves the accuracy of the decisions obtained.
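The decision step described in this section can be illustrated with a short Python sketch. It is not the authors' pseudo code: it only shows one way to represent a rule as a conjunction of elementary conditions (≥, ≤ and =) with a decision α, and to apply the first rule that covers a record. The class `DecisionRule`, the function `control` and the decisions `alternate_path` and `shortest_path` are hypothetical.

```python
import operator
from dataclasses import dataclass

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


@dataclass
class DecisionRule:
    conditions: list  # (attribute, operator, value) triples
    alpha: str        # decision returned when every condition holds

    def covers(self, x):
        """True when record x satisfies every elementary condition."""
        return all(OPS[op](x[attr], s) for attr, op, s in self.conditions)


def control(rules, x, default="shortest_path"):
    """Return the decision of the first rule covering x, else the default."""
    for rule in rules:
        if rule.covers(x):
            return rule.alpha
    return default


rules = [
    DecisionRule([("hour", ">=", 13), ("hour", "<=", 15), ("state", "==", "T5")],
                 "alternate_path"),
    DecisionRule([("state", "==", "T1")], "shortest_path"),
]
busy = control(rules, {"hour": 14, "packets": 420, "state": "T5"})
quiet = control(rules, {"hour": 3, "packets": 12, "state": "T1"})
```

Here a record observed at 2 PM in state T5 is sent to an alternate path, while low traffic keeps the shortest path. Applying the first matching rule is only one policy; ordering rules by confidence or support before matching would follow the selection criteria given earlier.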

Experiment Evaluation
In this section, experiments are conducted to estimate the performance of the proposed RBS-IA framework. For the experiments, the Dodgers loop sensor dataset is extracted from the UCI repository and prepared for processing. The Dodgers loop sensor data were collected for the Glendale on-ramp for the 101 North freeway in Los Angeles. The ramp is close enough to the stadium to show unusual traffic after a Dodgers game, but not so close and so heavily used by game traffic that the signal for the extra traffic is overly obvious. The description of the Dodgers loop sensor dataset is given in Table 3.
The attribute information of the Dodgers loop sensor dataset is shown in Table 4.
With this setup, the experimental evaluation estimates the performance of the proposed RBS-IA rule mining framework. The performance of the rule based system is measured in terms of i) accuracy, ii) measure of interestingness and iii) execution time.

Result Analysis
The performance of the RBS-IA framework is compared against two existing methods, namely the Scalable Network Traffic Monitoring and Analysis System (SNTMAS) [1] and the Wireless Monitoring and Shielding Technique (WMST) [2]. The performance of the RBS-IA framework is evaluated with the following metrics.

Impact of Accuracy
Accuracy refers to the effectiveness of deriving the set of logical rules for the given network traffic data and of the decisions made over the data based on the traffic condition state. It is measured as a percentage (%). In this work, we evaluate the set of logical rules derived using the proposed RBS-IA framework. Figure 3 shows the accuracy of logical rule formation, measured from the rules derived from the itemsets in the dataset and from the decision making process that decides which rule to apply based on the traffic condition state. The performance of the proposed RBS-IA framework is compared with the existing SNTMAS [1] and WMST [2].
Figure 3 describes the accuracy of logical rule formation measured from the rules derived from the itemsets in the dataset. The figure shows that accuracy increases as the number of rules increases. There is also a steep increase in accuracy with the proposed RBS-IA compared to the existing SNTMAS [1] and WMST [2]. This is because the RBS-IA framework uses the improved apriori algorithm to obtain results on the basis of network traffic conditions. As a result, the RBS-IA framework improved accuracy by 6% compared with SNTMAS [1] and by 13% compared with WMST [2]. The existing SNTMAS and WMST do not consider the traffic, whereas the RBS-IA framework first derives the logical rules for the given itemset. Then, to enhance the logical rule processing scheme, the decision making procedure decides which rule to apply based on the traffic condition state.

Impact of Interestingness
The interestingness of rules for RBS-IA is measured from the support and confidence values of the logical rules. It is measured as a percentage (%). The interestingness and strength of the derived logical rules are measured against the number of items present in the dataset. The proposed RBS-IA framework is compared with the existing Scalable Network Traffic Monitoring and Analysis System (SNTMAS) [1] and Wireless Monitoring and Shielding Technique (WMST) [2]. Figure 4 describes the interestingness and strength of the derived rules measured against the number of items present in the dataset. Compared to the existing SNTMAS [1] and WMST [2], the proposed RBS-IA framework provides a higher interestingness rate and derives logical rules with higher strength. This is because the RBS-IA framework forms logical rules and sorts them based on the traffic condition state. As a result, the proposed RBS-IA framework improved the interestingness rate by 13% compared with SNTMAS [1] and by 20% compared with WMST [2].

Impact of Execution Time
Execution time is the time taken to generate the set of logical rules for network traffic prediction and to carry out the decision making process that identifies the reason for the occurrence of network traffic data. It is measured in seconds and expressed as $ExeT = n \times t$, where n is the total number of logical rules and t is the time interval taken by the decision making process. Figure 5 describes the execution time measured against the number of items present in the dataset. Compared to the existing Scalable Network Traffic Monitoring and Analysis System (SNTMAS) [1] and Wireless Monitoring and Shielding Technique (WMST) [2], the proposed RBS-IA framework takes less time to monitor traffic and form logical rules. This is because of the application of the improved apriori algorithm in the proposed RBS-IA framework. The improved rule mining algorithm integrates traffic control with the decision making system to improve internet usage, and then decides which rule to apply based on the traffic condition states, which in turn reduces the execution time.
The RBS-IA framework reduced the execution time by 14% compared with SNTMAS [1] and by 36% compared with WMST [2]. Finally, it is observed that the occurrence of network traffic data is monitored and analyzed using the improved apriori rule mining, and the set of logical rules is derived based on the traffic condition states.

Conclusion
Network traffic data monitoring is performed effectively by adopting a rule based system. In this work, the rule based system uses two approaches. The first is a rule mining framework that monitors and analyzes the presence of network traffic depending on the arrival and usage of data in the environment. The second is a decision making procedure that decides which rule to apply to the set of items based on the traffic condition states, using the improved apriori rule mining framework. This work examined the association with the amount of traffic on the network and established that logical rule techniques can predict network traffic. The experimental evaluation was conducted with the Dodgers loop sensor dataset extracted from the UCI repository, and its performance was measured in terms of execution time, accuracy and interestingness measure. The results show that the proposed RBS-IA rule mining framework provides a higher level of accuracy and stronger logical rules while taking less time to predict the network traffic. The proposed RBS-IA framework provides a 10% improvement in the accuracy of logical rule formation, measured from the derived set of rules for controlling network traffic data, when compared to the state-of-the-art works.

Figure 1. Architecture diagram of the proposed RBS-IA.
Let $T = \{T_1, T_2, \ldots, T_5\}$ be the traffic condition states of the network in which the user accesses the packet data, with the reason for traffic occurrence based on the specified term. Consider a network traffic dataset NTD consisting of the set of items used to generate the logical rules in RBS-IA, including values such as time, date, incoming packet data, usage of data and traffic condition state. The itemsets of the sample network traffic dataset are shown in Table 1.

Figure 2. Network traffic monitoring process using improved apriori rule mining framework.


Figure 5. Measurement of execution time.

Table 1. Sample network traffic dataset.

Table 2. Derived set of logical rules with confidence and support.