Network congestion, one of the challenging problems in communication networks, leads to queuing delays, packet loss, or the blocking of new connections. In this study, a data portal is considered as an application-based network, and a cognitive method is proposed to deal with congestion in this kind of network. Unlike previous congestion control methods, the proposed method remains effective when the link capacity and information inquiries are unknown or variable. Using sufficient training samples and the current value of the network parameters, the available bandwidth is adjusted and distributed among the active flows. The proposed cognitive method was tested under such situations as unexpected variations in link capacity and oscillatory behavior of the bandwidth. Based on simulation results, the proposed method is capable of adjusting the available bandwidth by tuning the queue length, and provides a stable queue in the network.

A data portal provides information from diverse sources in a unified way. It enables instant, reliable and secure exchange of information over the web; in particular, a data portal focuses on providing centralized, robust access to specific data and supported manipulations. A portal offers a single web page that aggregates content from various servers.

There are different types of data portals, for instance, academic portals, including those for scientific data; commercial portals; and enterprise portals. A data portal can be considered as an application-based network that consists of databases, different servers, web-based application software, communication links, and computing clusters.

With regard to a data portal, congestion can happen when a link or node carries so much data that a loss of quality of service for the portal results. As an early effort to control the network congestion, the Jacobson’s algorithm [

Conventional congestion control methods often cannot achieve both fairness and appropriate bandwidth utilization due to packet loss. To deal with the problem, various TCP parameters have been utilized for the estimation of the available link capacity and the Round-Trip Time (RTT) in order to predict congestion [3-7].

When the delay-bandwidth product grows, TCP-based networks exhibit oscillatory behavior under some congestion-control algorithms. Reference [

The main obstacle in TCP is related to its reliance on scarce events that provide poor resolution information.

To improve adaptation to network conditions, achieve high utilization, attain stable throughput, and decrease standing queues in the network, some approaches have been proposed in the literature [13-20]. Explicit Congestion Control (XCC), one of the well-known congestion control approaches, is able to inform sources about the network status and to control the bit rate in the network. XCC uses a header to carry the throughput information and Round-Trip Time (RTT) of the flow to which the packet belongs. While the throughput is used to adjust the bandwidth distribution, the RTT enables sources to control the speed of adaptation to network conditions. In XCC, routers play an important role in informing sources about the network status and in helping sources to control their bit rate through accurate feedback. In fact, to determine the feedback for sources, a router should calculate the current spare bandwidth of its outgoing links and compute the link capacity.

Some congestion control methods need explicit and precise feedback. As congestion is not a binary variable, congestion signaling should provide the degree of congestion. By means of precise congestion signaling, it is possible to determine when the network tells the sender the congestion state and how the sender should react to it. In fact, the senders can decrease their sending windows quickly when the bottleneck is extremely congested. However, these methods, which are based on a control loop with feedback delay, become unstable when the feedback delay is long. To deal with this effect, the system should slow down as the feedback delay increases. In other words, when delay increases, the sources should change their transmission rates more slowly [8,21-23].

A crucial issue in network congestion control is that the robustness of a method should be independent of unknown and quickly changing parameters (e.g., the number of flows). Also, for such methods as XCC, convenient bandwidth sharing is difficult when the information inquiries and capacity of links are variable. In other words, the unpredictability of the network creates a problem for XCC. This study focuses on a cognitive method to control congestion that can also perform well when the link capacity and information inquiries are unknown or variable.

A cognitive system is a complex system that has the ability for emergent behavior [

To optimally adapt the network parameters and to provide efficient communication services using a cognitive approach, learning the relationships between the network parameters is crucial. In the learning phase, it is possible to utilize the Bayesian Network (BN) model. A BN is a probabilistic graphical model that represents conditional independence relations between random variables by means of a Directed Acyclic Graph (DAG) [

The BN model can be used to provide a representation of the dependence relationships among network parameters and to adjust cognitive parameters to improve the network's efficiency. Here it is utilized to deal with congestion, one of the challenging problems in TCP, which has no efficient mechanism to determine when congestion occurs in the network.

As mentioned earlier, to efficiently control network congestion and to preserve stable throughput and low queuing delay, the critical parameters in the network can be defined and adjusted based on pre-defined criteria and statistical variations of the network.

The available bandwidth, F_{available}, which is distributed among different flows during a certain time period T, is defined as follows:

where coefficients and are constant, x(t) is the bandwidth utilized for the last period T, C is the estimated capacity of the data transmission link, and Q(t) is the minimum queue length that happened during the last T seconds.

The parameter T can be written as, in which is the system base delay, or the delay excluding queuing delay; and is the actual capacity of the data transmission link.

The capacity C is a function of various factors, such as the data rate of every link, the number of active links, failed transmissions, the number of collisions, and handshake procedures. The estimation error of link capacity is defined as. The given error should be compensated up to a certain limit. To define the limit, the parameter C in (1) is replaced by. When the capacity of the data transmission link is fully utilized, i.e., , it is expected that the available bandwidth is zero or close to zero, due to error. Therefore, the limit of the estimation error is defined as the following:

The value of depends on the available bandwidth and the standing queue in the router. In fact, if the link capacity changes, the F_{available} can be adjusted to distribute the bandwidth among the active flows.

The proposed method computes the F_{available} with no knowledge of exact channel capacity. It also can adjust the F_{available} according to bandwidth variations.

Typically, a router controls each of its output queues; therefore, the available bandwidth is computed for each of them. With the proposed method, the router does not need to be configured with a fixed medium capacity in order to compute the available bandwidth. In addition, the proposed method can adapt to changing bandwidth conditions over time.

First, the effect of queue speed on available bandwidth F_{available} is considered. The queue speed can be defined as the difference between the capacity of the transmission channel and utilized bandwidth during the time period T. Equation (3) is written as follows:

where is the queue speed. Due to queue variations, the queue length should be adjusted for F_{available}, so parameter α is defined. It is possible to conveniently tune F_{available} using the parameter α during extreme queue variations. The parameter α is adjusted by the cognitive algorithm. In fact, the parameter α controls the target queue length at which the network stabilizes.

The schematic of the cognitive congestion control is illustrated in

During the observation step, the required information is collected from the network. Then, the cognitive algorithm learns the relations between the parameters and their conditional independences, as well as the effect of controllable parameters on observable parameters.

During the decision step, the values to be assigned to controllable parameters are calculated to meet pre-defined requirements. In other words, the values of the network parameters of interest are predicted based on the observations. This prediction is done by inference, using the Bayesian network.

In the action step, the controllable parameters are tuned, and the appropriate actions are taken in the network.

During the observation step, seven network parameters are examined. These parameters are:

1) The Round Trip Time (RTT), that is, the time taken for a signal to be sent plus the time taken for an acknowledgment of that signal to be received;

2) The queue length;

3) The queue speed;

4) The throughput, that is, the total amount of successfully delivered data over a link;

5) The contention window size;

6) The congestion window size, that is, the total amount of unacknowledged data;

7) The congestion window status.

The congestion window status is set to 0 if the congestion window size at time t is at least 25% smaller than the congestion window size at time t − 1; otherwise the status is 1. The case in which the status equals zero is of interest, as it indicates that the congestion window is being reduced in response to congestion.
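The status rule can be sketched in a few lines. This is only an illustration: the function names and the sample values are hypothetical, and the rule is read here as "a drop of 25% or more", which is an assumption about the exact threshold.

```python
def congestion_window_status(cwnd_prev, cwnd_now, drop=0.25):
    """Return 0 if the window shrank by at least `drop` since the previous sample, else 1.

    Assumption: "25% less" is interpreted as "25% or more".
    """
    if cwnd_prev <= 0:
        raise ValueError("previous congestion window must be positive")
    return 0 if cwnd_now <= (1.0 - drop) * cwnd_prev else 1


def status_series(cwnd_samples):
    """Status for every consecutive pair (t - 1, t) of congestion window samples."""
    return [
        congestion_window_status(prev, now)
        for prev, now in zip(cwnd_samples, cwnd_samples[1:])
    ]


# Hypothetical congestion window trace (in segments).
trace = [40, 44, 48, 24, 25, 26, 18]
print(status_series(trace))  # [1, 1, 0, 1, 1, 0]
```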

Here, the observed network parameters are considered as random variables (x_{1}, ···, x_{7}). It is assumed that the given variables have unknown dependence relations. The independent samples from every variable are gathered into the input matrix (of size n × 7). The input matrix is constructed during the observation step.

The learning step is a key step in the cognitive algorithm. During this step, the BN is built to provide a structure representing conditional independence relations between parameters of interest in a DAG. To form the BN and demonstrate the relations in a DAG, learning from the qualitative relations between the variables and their conditional independences is considered.

A node in the DAG represents a random variable, while an arrow that joins two nodes represents a direct probabilistic relation between the two corresponding variables. For, if there is a direct arrow from j to i, node j will be a parent of node i. (describes the set of parents of node i). A complete DAG with all nodes connected with each other directly can represent all possible probabilistic relations among the nodes.
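As a small illustration of the parent-set view of a DAG, the sketch below stores, for every node, the set of its parents and checks that the graph contains no directed cycle. The three node names are hypothetical picks from the observed parameters, and the edges are invented for the example; they are not the structure learned in this study.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical parent sets: node -> set of parents (an arrow runs from parent to node).
parents = {
    "rtt": set(),
    "queue_length": {"rtt"},
    "throughput": {"rtt", "queue_length"},
}


def is_acyclic(parent_sets):
    """True if the directed graph described by the parent sets has no cycle."""
    try:
        # static_order() raises CycleError when a directed cycle exists.
        list(TopologicalSorter(parent_sets).static_order())
        return True
    except CycleError:
        return False


def edge_count(parent_sets):
    """Number of arrows in the graph."""
    return sum(len(p) for p in parent_sets.values())


m = len(parents)
print(is_acyclic(parents))                      # True
print(edge_count(parents) == m * (m - 1) // 2)  # True: this example is a complete DAG
```

A complete DAG on m nodes has m(m − 1)/2 arrows, which is why the second check holds for this three-node example.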

During the learning phase, based on the input matrix (Im), the dependency is exploited among the variables represented as nodes in a DAG. To build the DAG representing the probabilistic relation between the variables, the selection of DAGs and the selection of parameters are utilized.

2.3.2.1. Selection of DAGs

For the selection of DAGs, the scoring approach and the constraint approach can be utilized [29,30].

In the constraint approach, a set of conditional independence statements is defined by a priori knowledge. Then, the given set of statements is utilized to build the DAG, following the rules of d-separation [

The scoring approach generally is utilized when a set of given conditional independence statements is not available [31,32]. The scoring approach is capable of inferring a sub-optimal DAG from a sufficiently large data set (i.e., Im). The scoring approach consists of two phases: 1) Searching to select the DAGs to be scored within the set of all possible DAGs and 2) scoring each DAG according to how accurately it defines the probabilistic relations between the variables based on the Im.

The searching process to select the DAGs (i.e., the first phase of the scoring approach) is required because it is not computationally efficient to score all the possible DAGs, since the scoring procedure generally takes a great deal of time. For instance, to find the DAG with the highest score for a set of m variables, the following formula is expressed [

where is the total number of possible DAGs. When m increases, the increases significantly, and the scoring procedure takes more time. Therefore, a searching process is required to choose a small, and possibly representative, subset of the space of all DAGs.
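To give a feeling for how quickly the search space grows, the sketch below counts labeled DAGs with Robinson's recurrence, a(m) = Σ_{k=1..m} (−1)^{k+1} C(m, k) 2^{k(m−k)} a(m − k) with a(0) = 1. That this is the formula intended in the text is an assumption; the function name is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(m):
    """Number of labeled DAGs on m nodes (Robinson's recurrence, assumed here)."""
    if m == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * count_dags(m - k)
        for k in range(1, m + 1)
    )


for m in range(1, 8):
    print(m, count_dags(m))
# 1 1
# 2 3
# 3 25
# 4 543
# 5 29281
# 6 3781503
# 7 1138779265
```

For the seven observed parameters, this already gives more than a billion candidate structures, which illustrates why exhaustive scoring is impractical.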

Most searching processes in scoring approaches are based on heuristics that find local maxima reasonably well. However, the heuristics do not generally guarantee that the global maximum is obtained [

There are two classical searching procedures in literature [

Hill Climbing is an iterative algorithm in which an arbitrary solution to a problem is defined initially. Then, the hill climbing algorithm searches for a better solution by incrementally changing a single element of the solution. If the change generates a better solution, an incremental change is made to the new solution; this is repeated until no further improvements can be reached [34,35].
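A minimal, generic sketch of hill climbing is given below. It is not the search used in this study: the solution is a tuple of bits, a "single element change" is one bit flip, and the score is a toy function. In structure learning, one element would typically correspond to one arrow of the DAG and the score to the DAG score; that mapping is an assumption, and all names are hypothetical.

```python
def hill_climb(start, score, neighbors):
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    current, current_score = start, score(start)
    while True:
        best, best_score = None, current_score
        for candidate in neighbors(current):
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
        if best is None:  # local maximum reached
            return current, current_score
        current, current_score = best, best_score


def flip_one(bits):
    """All solutions that differ from `bits` in exactly one element."""
    for i in range(len(bits)):
        yield bits[:i] + (1 - bits[i],) + bits[i + 1:]


TARGET = (1, 0, 1, 1, 0)  # toy optimum, for illustration only


def toy_score(bits):
    """Number of positions that agree with the toy optimum."""
    return sum(b == t for b, t in zip(bits, TARGET))


solution, value = hill_climb((0, 0, 0, 0, 0), toy_score, flip_one)
print(solution, value)  # (1, 0, 1, 1, 0) 5
```

The toy score has a single maximum, so the search reaches it; on a real scoring surface the same loop can stop at a local maximum.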

Markov Chain Monte Carlo (MCMC) is a category of algorithms for sampling from probability distributions, based on constructing a Markov chain that has the distribution of interest as its equilibrium distribution. After a sufficient number of steps, the state of the chain is used as a sample from the distribution of interest [36,37].
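The idea can be sketched with a Metropolis sampler on a tiny discrete state space. This is a generic illustration, not the procedure used in the study: the three states, their unnormalized weights, and the function name are all hypothetical.

```python
import random


def metropolis(weights, steps, seed=0):
    """Sample state indices with probability proportional to `weights`.

    The proposal moves one step left or right on a ring of states, which is
    symmetric, so the acceptance ratio reduces to the ratio of target weights.
    """
    rng = random.Random(seed)
    n = len(weights)
    state = 0
    samples = []
    for _ in range(steps):
        proposal = (state + rng.choice((-1, 1))) % n
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        samples.append(state)
    return samples


weights = [1.0, 2.0, 3.0]  # target distribution: 1/6, 1/3, 1/2
samples = metropolis(weights, steps=60000, seed=1)
frequencies = [samples.count(s) / len(samples) for s in range(len(weights))]
print(frequencies)  # should be close to the target distribution
```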

The searching process results in some DAGs.

The Bayesian information criterion is selected for scoring, and is based on the maximum likelihood criterion. The Bayesian information criterion is expressed as follows [

where Im is the dataset (i.e. input matrix), A is the DAG to be scored, is the maximum likelihood estimation of the parameters of A, and n is the number of observations for every variable in the dataset.

When all random variables are multinomial, the Bayesian information criterion is formulated as follows [30-33,38]:

where is a finite set of outcomes for every variable; is the number of different combinations of outcomes for the parents of; is the number of cases in the input matrix in which the variable took its kth value (k = 1, 2, ···, O_{i}), and its parent was instantiated as its jth value (j = 1, 2, ···, C_{i}); and is the total observations related to variable in the input matrix (Im) with parent configuration j (i.e.,).

Therefore, based on Equation (6), the scoring approach is computationally tractable. More details about Bayesian information criterion are presented in [

Now, the DAG with highest score can be selected.
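The scoring step can be sketched as follows. The code assumes the standard multinomial form of the Bayesian information criterion, that is, the sum over variables i, parent configurations j and outcomes k of N_ijk · log(N_ijk / N_ij), minus (log n / 2) times the number of free parameters Σ_i (O_i − 1) · C_i. The symbols N_ijk and N_ij, the function name and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import log


def bic_score(data, parents, arity):
    """BIC of a DAG for discrete data (standard multinomial form, assumed).

    data:    list of rows, each a dict {variable: outcome}
    parents: dict {variable: tuple of parent variables}
    arity:   dict {variable: number of possible outcomes O_i}
    """
    n = len(data)
    log_likelihood = 0.0
    free_parameters = 0
    for var, pa in parents.items():
        n_ijk = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        n_ij = Counter(tuple(row[p] for p in pa) for row in data)
        for (j, _k), count in n_ijk.items():
            log_likelihood += count * log(count / n_ij[j])
        c_i = 1  # number of parent configurations C_i
        for p in pa:
            c_i *= arity[p]
        free_parameters += (arity[var] - 1) * c_i
    return log_likelihood - 0.5 * log(n) * free_parameters


# Toy data in which y always copies x.
data = [{"x": 0, "y": 0}] * 4 + [{"x": 1, "y": 1}] * 4
arity = {"x": 2, "y": 2}
with_edge = {"x": (), "y": ("x",)}
no_edge = {"x": (), "y": ()}

print(bic_score(data, with_edge, arity) > bic_score(data, no_edge, arity))  # True
```

In this toy case the DAG with the arrow from x to y scores higher, because the gain in likelihood outweighs the penalty for the extra parameter.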

2.3.2.2. Selection of Parameters

During the selection of parameters, the best set of controllable parameters is chosen and estimated, based on the observed parameters and their independence relations.

Based on the Bayesian network definition, every variable depends directly on its parents; thus, the estimation of the parameters for every variable x_{i} is performed according to the set of its parents in the DAG selected during structure learning. The Maximum Likelihood Estimation (MLE) technique is used to build a predictive model and to estimate the appropriate set of parameters describing the conditional dependencies among the variables. The MLE technique is expressed as follows:

For, the parents of node i are in the configuration of type j, and the variable takes its kth value (i.e.).

The estimated value provides an approximation of the posterior distribution of given the evidence j (i.e., parents of node i in the configuration of type j). Therefore, Equation (7) can be re-written as follows [

The Bayesian network is completed after selection of DAGs and parameters in the learning step. The completed BN provides the probabilistic relations among selected parameters from the selected DAG.
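For discrete variables, maximum likelihood estimation of the conditional probabilities reduces to relative frequencies: the estimate for outcome k under parent configuration j is the count of that combination divided by the count of the parent configuration. The sketch below assumes this standard estimator; the function name, the variable names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def mle_cpt(data, var, pa):
    """Conditional probability table of `var` given its parents `pa`.

    Returns {parent configuration j: {outcome k: count(j, k) / count(j)}}.
    """
    n_ijk = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
    n_ij = Counter(tuple(row[p] for p in pa) for row in data)
    cpt = defaultdict(dict)
    for (j, k), count in n_ijk.items():
        cpt[j][k] = count / n_ij[j]
    return dict(cpt)


# Hypothetical discretized observations: queue level (0 = short, 1 = long)
# and congestion window status.
data = (
    [{"queue": 0, "status": 1}] * 3
    + [{"queue": 0, "status": 0}] * 1
    + [{"queue": 1, "status": 0}] * 3
    + [{"queue": 1, "status": 1}] * 1
)

cpt = mle_cpt(data, "status", ("queue",))
print(cpt[(0,)][1])  # 0.75
print(cpt[(1,)][0])  # 0.75
```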

In this step, the future values of the queue length and queue speed—that is, the unobserved parameters—are predicted based on selected observed parameters. The estimated value of unobserved parameter is defined as the expectation of the given parameter, using probability function represented in Equation (8). Therefore, the expected value of at time t, , is calculated as follows:

where is the actual value of the unobserved parameter at time t, and evidence is the set of selected observed parameters.
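For a discrete parameter, the expectation described above is the sum of each possible value weighted by its probability given the evidence. The sketch below shows only this last step; the queue length levels and the posterior probabilities are hypothetical numbers standing in for the output of the Bayesian network inference.

```python
def expected_value(outcomes, probabilities):
    """Expectation of a discrete variable: sum of value times P(value | evidence)."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in zip(outcomes, probabilities))


# Hypothetical queue length levels (packets) and their posterior probabilities.
queue_levels = [10, 50, 100]
posterior = [0.2, 0.5, 0.3]
predicted_queue = expected_value(queue_levels, posterior)
print(predicted_queue)  # approximately 57 packets
```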

To calculate α in Equation (3), the predicted values of the unobserved parameters (i.e., queue length and queue speed) are considered. In fact, the fluctuations of the predicted values of the queue length and queue speed are used to set the parameter α; then, parameter α adjusts the available bandwidth, F_{available}.

As mentioned earlier, α represents the target queue length at which the network stabilizes. When no queue has built up (underutilization), α determines how much bandwidth is distributed in every control interval. During full utilization or overutilization, α controls how much queuing delay is introduced.

During the time of underutilization, the bandwidth is maximally distributed; if a link is saturated, the queuing delay is significantly decreased. Generally, α is high during underutilization, and is low during full utilization.

The base scenario used in the simulation is a dumbbell network topology, in which a number of nodes are connected to a single router. This router is connected to a second router over a serial link, and another group of nodes is connected to the second router, creating the dumbbell topology.

The network traffic consists of flows between the client and server nodes in both directions. It is assumed that the flows traversing the network from server nodes to client nodes are downloads, while flows in the opposite direction are considered uploads.

The simulations were performed using the ns-3 network simulator [

In this part of the procedure, the response of the proposed method to unexpected variations of link capacity was examined. During this simulation, the data rate changed. At first, the simulation was performed with a data rate of 56 Mbps. The variable capacity was simulated by changing the data rate, as shown in

Due to sudden bandwidth reduction, there are queue spikes in the

To demonstrate the responsiveness of the proposed method to arriving and departing flows, a 40-sec simulation was performed, and the RTT was set to 60 ms. The average queue length as well as the parameter α over time are illustrated in

A reduced queue is a sign of underutilization, so α is increased. While α increases, more bandwidth is distributed among servers to quickly reach full utilization.

To match the variation of the queue, the queue length was increased, while parameter α was decreased. Generally, there was a low latency caused by queue buildup.

To prevent high queue spike, the maximum value of α should be less than the maximum value for queue length (i.e., channel capacity). Parameter α can tune the variation of bandwidth as it affects the queue.

In this part of the procedure, the response of the method was assessed while different data rates were used in the network. It is assumed that a part of the network has a data rate of 10 Mbps and the rest of the network has a data rate of 56 Mbps. In other words, new flows enter the network with a data rate of 10 Mbps while other flows with a data rate of 56 Mbps leave the network, or vice versa. This causes oscillatory behavior of the bandwidth. The proposed method provides a stable queue under the given situation (

Now, the efficiency of the method is evaluated in terms of network utilization. It is demonstrated that an increase in the bandwidth-delay product of the network negatively affects the efficiency of TCP; however, it has a negligible influence on the efficiency of the proposed method.

To simulate a traffic pattern, two kinds of flows are considered: 1) flows with exponentially distributed duration, with certain minimum value (1 s) and mean value (10 s); and 2) other flows that are active during the simulation.
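One way to generate such flow durations is sketched below. It assumes a shifted exponential distribution, i.e., the minimum of 1 s plus an exponential variate with mean 9 s, so that the overall mean is 10 s. The text does not state how the minimum was enforced, so this is only one possible reading, and the function name is hypothetical.

```python
import random


def flow_durations(count, minimum=1.0, mean=10.0, seed=0):
    """Draw flow durations (seconds) from a shifted exponential distribution.

    Assumption: duration = minimum + Exp(rate), with rate = 1 / (mean - minimum).
    """
    rng = random.Random(seed)
    rate = 1.0 / (mean - minimum)
    return [minimum + rng.expovariate(rate) for _ in range(count)]


durations = flow_durations(100_000)
print(min(durations) >= 1.0)             # True
print(sum(durations) / len(durations))   # close to 10 s
```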

Each wired path between the end-system and router was configured with a specific latency; latencies of wired paths were between 20 ms and 120 ms. The growth of the bandwidth-delay product of the network was simulated by increasing the path delay.
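The bandwidth-delay product is the link rate multiplied by the round-trip time, so it grows linearly with the path delay. The sketch below works through two cases at 56 Mbps; the round-trip times of 40 ms and 240 ms are illustrative assumptions (twice the smallest and largest path latency, ignoring all other delays), and the function name is hypothetical.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes for a rate in bit/s and an RTT in ms."""
    return rate_bps * rtt_ms // 1000 // 8


RATE_BPS = 56_000_000  # 56 Mbps

for rtt_ms in (40, 240):
    print(rtt_ms, bdp_bytes(RATE_BPS, rtt_ms))
# 40 280000
# 240 1680000
```

Under these assumptions, the amount of data that must be in flight to keep the link full grows sixfold between the shortest and longest paths.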

The result of simulation is shown in

It can be demonstrated that the TCP was not able to scale with the bandwidth-delay product of network because of its fixed dynamics. Based on the traffic pattern and the number of flows, the TCP was not able to fully utilize network resources for a specific bandwidth threshold.

Overall, the proposed method was able to maintain convenient utilization at all times.

To predict, at time t, the status of congestion in the future (i.e., at time t + k), the current values of all parameters of interest were considered. It is then possible to predict when congestion will happen and to act before it affects the network.

To analyze the accuracy of the learning process for predicting congestion at time t, the value of Status (t + k)—that is, the presence or absence of congestion at time t + k, with k ≥ 1—was considered.

The performance of the learning process is assessed as a function of the size (i.e., number of samples) of the training set utilized to learn the relations between the desired parameters. The parameters are stored during the training, and the stored values become the input for the inference phase.

In Figures 6 and 7, the training set size is varied. The vertical axis of the figures represents the average error of the inference, i.e., the expected value of , for which is the actual value of congestion status at time t + k and is the predicted value of congestion status at time t + k. When congestion is present, the Status variable is zero; otherwise it is one. The average error can therefore be read as the frequency of errors in the prediction process. In Figures 6 and 7, two cases are assessed separately. In other words, the results are shown for and.
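Because the congestion status is binary, an average absolute difference between actual and predicted status equals the fraction of wrong predictions. The sketch below assumes that the average error is this mean absolute difference; the function name and the two status sequences are hypothetical.

```python
def average_prediction_error(actual, predicted):
    """Mean absolute difference between actual and predicted binary status values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical status values at time t + k (0 = congestion present, 1 = absent).
actual_status = [1, 1, 0, 1, 0, 1, 1, 0]
predicted_status = [1, 0, 0, 1, 1, 1, 1, 0]
print(average_prediction_error(actual_status, predicted_status))  # 0.25
```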

In this paper, a cognitive method is proposed to improve bandwidth sharing and deal with congestion in a data portal. For example, when the data portal serves climate change data, congestion control is especially important because scientific climate data is voluminous and there is high traffic to and from the data portal from the scientific community, research groups, and general readers. In fact, this study was performed to improve congestion control in such data portals as the climate change portal.

Here, the data portal is considered as an application-based network. The proposed method was able to adjust the available bandwidth in the network when the link capacity and information inquiries were unknown or variable. In fact, it was possible to conveniently adjust the available bandwidth, using the cognitive method, during extreme queue variations.

The variation of link capacity has an influence on the queue. In fact, α changes dynamically over time and helps the queue to behave more smoothly, while guaranteeing that the set is based on pre-defined operating conditions.

The learning phase is a key step in the cognitive method. During this step, the collected information in the observation phase is used by the Bayesian network model to build a probabilistic structure to predict variations of queue length.

The efficiency of the proposed method was tested with a network simulator. Based on the results, the available bandwidth during extreme queue variations can be conveniently adjusted by the proposed method. Unlike TCP, whose efficiency is negatively affected by the growth of the bandwidth-delay product of the network, the proposed method is able to maintain convenient utilization at all times.

This work was conducted as a part of an Innovation Working Group supported by the Nevada EPSCoR Programs, and funded by NSF Grant # NSFEPS-0814372.