
The explosive growth of traffic makes it challenging for Internet Service Providers (ISPs) to remain profitable given the high cost of infrastructure and increased competition. This calls for economic mechanisms that enable providers to allocate resources on demand by predicting traffic volumes and adjusting prices accordingly. In this paper, we analyze the traffic patterns of mobile data and accurately predict traffic volumes with ARIMA and LSTM models. Based on the analysis, we then suggest a scalable pricing strategy for ISPs to satisfy the various requirements of customers.

According to the Cisco Annual Internet Report, mobile users will increase to 5.7 billion, mobile connections will increase to 13.1 billion, and mobile traffic volume is estimated to reach almost one zettabyte by 2023. In particular, data traffic produced by smartphones accounts for 86 percent of all mobile data traffic, owing to the emergence of various mobile applications such as online chatting, mobile games, and online shopping [

Previous literature has explored pricing strategies and resource allocation problems in networking services. Low and Lapsley proposed a convergent flow control algorithm that maximizes the aggregate source utility over the transmission rates [

Moreover, ISPs increasingly prefer to provide on-demand services to customers for profit reasons. To use network resources more efficiently and deliver a better user experience, an ISP needs an accurate traffic prediction model so that it can allocate the “just-right” amount of resources. Such a model also has numerous practical applications, including maintaining a stable network, optimizing user experience, and ensuring network security.

To address the aforementioned challenges, we first analyze the usage patterns of customers with time series analysis and then predict the traffic volume with the Auto-regressive Integrated Moving Average (ARIMA) model and Long Short-Term Memory (LSTM) neural networks. Based on the analysis, we suggest a pricing strategy for ISPs.

Our contributions in this paper are summarized as follows:

● We analyze the traffic pattern of mobile data.

● We develop an ARIMA-based traffic prediction model.

● We develop an LSTM-based traffic prediction model.

● We design a smart pricing strategy that lets ISPs promote temporally dynamic service packages to attract more users and thereby increase profit.

The rest of the paper is organized as follows. In Section 2, we review previous work on network traffic prediction and ISP pricing policy. In Section 3, we discuss the problem design of our study. In Section 4, we introduce the traditional statistical algorithm and the machine learning-based algorithm for network traffic prediction. In Section 5, we evaluate and compare the performance of the prediction models. In Section 6, we discuss the ISP pricing strategy based on the proposed network traffic prediction model. Finally, Section 7 concludes the study.

Various past studies have examined the accuracy and effectiveness of numerous network traffic prediction models. Previous work can generally be divided into two categories: statistical methods and machine learning-based methods. Iqbal has classified the previous network traffic prediction models into three categories, which are “classic time series-based predictors, Artificial Neural Networks-based predictors, and wavelet transform-based predictors” [

The resource allocation optimization problem and the pricing policy of network services have also been discussed in many past studies. Several previous studies aimed to allocate the available resources optimally in order to provide better network quality of service (QoS). Ahmed presented an integrated algorithm with the optimal traffic-dependent allocation rate for both high-traffic and low-traffic situations [

In this section, we first formulate the traffic prediction problem mathematically. Then, we use the time series approach ARIMA and the deep learning approach LSTM to construct the prediction models.

Denote $y_t$ as the traffic volume at time period $t$. The historical traffic data from time $t-k$ to time $t$ can be represented as $Y = \{y_{t-k}, \cdots, y_t\}$. Our objective is to predict the future traffic volume from time $t+1$ to time $t+l$. Denote $\hat{y}_{t+1}$ as the predicted traffic volume at time $t+1$ and $y_{t+1}$ as the ground-truth traffic. The problem can be formulated as:

$$\min \sum_{i=t+1}^{t+l} \left\| \hat{y}_i - y_i \right\|^2 \tag{1}$$

$$\hat{y}_{t+1} = f(y_{t-k}, \cdots, y_t) \tag{2}$$
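As an illustration of this formulation, the following minimal sketch shows one way the series could be cut into (history, future) pairs and how the squared error of Equation (1) could be evaluated. The function names `make_windows` and `squared_error` are hypothetical and are not part of the original study.

```python
from typing import List, Tuple


def make_windows(series: List[float], k: int, l: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Split a series into (history, future) pairs.

    history holds the k + 1 values y_{t-k}, ..., y_t;
    future holds the l values y_{t+1}, ..., y_{t+l}.
    """
    pairs = []
    # t is the index of the last observed value in each window.
    for t in range(k, len(series) - l):
        history = series[t - k : t + 1]
        future = series[t + 1 : t + 1 + l]
        pairs.append((history, future))
    return pairs


def squared_error(predicted: List[float], actual: List[float]) -> float:
    """Sum of squared differences, the objective in Equation (1)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


# Toy series (illustrative values, not taken from the dataset).
windows = make_windows([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], k=2, l=1)
```

Any predictor $f$ can then be fitted on the history part of each pair and scored on the future part.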

In this study, we apply the Auto-regressive Integrated Moving Average (ARIMA) model, which is one of the most popular statistical models for time series forecasting. The ARIMA model is a combination of the Auto-regressive (AR) model and the Moving Average (MA) model. The AR model predicts future behavior from past behavior according to the relation between the present value ($y_t$) and previous values ($y_{t-i}$). The AR model is expressed as follows:

$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t \tag{3}$$

where $\mu$ is a constant, $p$ is the order, $\gamma_i$ is the coefficient of the lagged value $y_{t-i}$, and $\epsilon_t$ is the white noise at time $t$.

The Moving Average (MA) model describes the relation between the present value ($y_t$) and the residuals from previous periods. The MA model is expressed as follows:

$$y_t = \mu + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \tag{4}$$

where $\mu$ is a constant, $q$ is the order, $\theta_i$ is the coefficient of the lagged error $\epsilon_{t-i}$, and $\epsilon_t$ is the random error at time $t$.

The ARIMA model is defined as a combination of the AR and MA models:

$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \epsilon_t + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \tag{5}$$

The ARIMA (p, d, q) model combines p auto-regression terms and q moving average terms. Here p is the order of auto-regression, d is the degree of differencing, and q is the order of the moving average. When d > 0, Equation (5) applies to the series after it has been differenced d times.
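To make the roles of p, d, and q concrete, the sketch below differences a series d times and computes a one-step forecast from Equation (5), with the unknown future noise term replaced by its mean of zero. The helper names and the coefficient values are hypothetical placeholders chosen for illustration; they are not fitted to the dataset used in this paper.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply first-order differencing d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def arma_forecast(history: Sequence[float], residuals: Sequence[float],
                  mu: float, gammas: Sequence[float], thetas: Sequence[float]) -> float:
    """One-step forecast of Equation (5) with the future noise set to zero.

    history[-1] is y_t and residuals[-1] is epsilon_t, so history[-i]
    is the value multiplied by gamma_i and residuals[-i] by theta_i.
    """
    ar_part = sum(g * history[-i] for i, g in enumerate(gammas, start=1))
    ma_part = sum(th * residuals[-i] for i, th in enumerate(thetas, start=1))
    return mu + ar_part + ma_part


# Illustrative values only: an ARMA(1, 1) forecast with made-up coefficients.
once = difference([1.0, 4.0, 9.0, 16.0], d=1)
twice = difference([1.0, 4.0, 9.0, 16.0], d=2)
forecast = arma_forecast(history=[1.0, 2.0], residuals=[0.5],
                         mu=0.1, gammas=[0.5], thetas=[0.2])
```

In practice the coefficients would be estimated by a statistical package rather than set by hand.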

This paper applies Long Short-Term Memory (LSTM) neural networks to network traffic prediction in addition to the ARIMA model. The LSTM neural network suits our study because it can capture sequence information and learn long-term dependencies. LSTM is a special Recurrent Neural Network (RNN) whose memory cell can automatically determine the optimal time lags for prediction. LSTM combines short-term and long-term memory components to improve on traditional RNNs by mitigating the vanishing gradient problem [

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{6}$$

The cell state carries information through the sequence and, controlled by the gates, decides whether to keep or forget it.

$$C_t = f_t \odot C_{t-1} + i_t \odot \bar{C}_t \tag{7}$$

The forget gate decides how much of the previous cell state is retained, which determines the effective time lags used for prediction.

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{8}$$

The output gate controls which values in the cell are exposed and thus determines the output of the cell.

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{9}$$

With these gates, the LSTM model can handle arbitrary time lags with long dependencies in our time-series datasets. Each sigmoid layer yields a value between zero and one, which determines how much of each component of the data passes through.
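The gate equations (6) to (9) can be traced by hand with a scalar toy cell. The sketch below is only an illustration: it assumes the standard LSTM candidate state (a tanh layer) and the standard hidden output $h_t = o_t \tanh(C_t)$, neither of which is written out above, and it uses hypothetical scalar weights instead of the weight matrices of a real network.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t: float, h_prev: float, c_prev: float,
              params: Dict[str, Tuple[float, float, float]]) -> Tuple[float, float]:
    """One scalar LSTM step.

    params maps each of "i", "f", "o", "c" to (w_h, w_x, b), the scalar
    analogue of W [h_{t-1}, x_t] + b in Equations (6), (8) and (9).
    """
    def pre(name: str) -> float:
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    i_t = sigmoid(pre("i"))           # input gate, Equation (6)
    f_t = sigmoid(pre("f"))           # forget gate, Equation (8)
    o_t = sigmoid(pre("o"))           # output gate, Equation (9)
    c_bar = math.tanh(pre("c"))       # candidate state (assumed standard form)
    c_t = f_t * c_prev + i_t * c_bar  # cell state update, Equation (7)
    h_t = o_t * math.tanh(c_t)        # hidden output (assumed standard form)
    return h_t, c_t


# With all-zero weights every gate equals 0.5 and the candidate state is 0,
# so the cell simply keeps half of its previous state.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("i", "f", "o", "c")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

A forget gate close to one keeps older information in the cell state, which is how the model retains long time lags.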

The dataset in this study consists of data collected from the Telecom Italia cellular network in the city of Milan, Italy. The data were recorded at intervals of 600,000 milliseconds (10 min) over 62 days from 11/01/2013 to 01/01/2014. The dataset includes short message service (SMS), call, and Internet traffic activity. The spatial distribution of the cellular traffic is represented on a grid of square cells. The area of Milan is covered by a grid overlay of 1000 squares, each about 235 × 235 meters in size [

Before performing the forecast, we observed the characteristics of our time series dataset.

We can see that SMS, call, and Internet traffic volumes all exhibit seasonal patterns over each day. The daily pattern is useful in forecasting and obtaining a good fit (

We first use the traditional statistical model to predict our time series data. To determine the optimal parameters for the ARIMA model, this study uses several estimators as reference. Akaike’s Information Criterion (AIC) is helpful for estimating the orders (p, q) of the ARIMA model. It can be written as:

$$\mathrm{AIC} = -2\log(L) + 2(p + q + k + 1) \tag{10}$$

where $k$ is the number of estimated parameters in the model and $L$ is the maximized likelihood of the data. The Mean Squared Error (MSE), which measures the average squared difference between the estimated values and the actual values, can also be used to assess how well the forecasts fit the data. MSE is defined as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{11}$$

where $y_i$ is the actual value and $\hat{y}_i$ is the estimated value.

Lower AIC and MSE values indicate a better-fit model.
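Both criteria are simple to compute once a model has been fitted. The following sketch evaluates Equations (10) and (11) directly; the function names and the numeric inputs are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Sequence


def aic(log_likelihood: float, p: int, q: int, k: int) -> float:
    """Equation (10): AIC = -2 log(L) + 2 (p + q + k + 1)."""
    return -2.0 * log_likelihood + 2.0 * (p + q + k + 1)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Equation (11): mean of the squared differences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative inputs only (not values from the paper).
example_aic = aic(log_likelihood=-100.0, p=1, q=1, k=1)
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Because the penalty term grows with p and q, a more complex model is preferred only if it raises the likelihood enough to offset the extra parameters.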

Then we present the performance of the selected ARIMA (1, 0, 1) model on SMS traffic prediction in

SMS (p, d, q) | SMS AIC | SMS Test MSE | Call (p, d, q) | Call AIC | Call Test MSE | Internet (p, d, q) | Internet AIC | Internet Test MSE |
---|---|---|---|---|---|---|---|---|
(1, 0, 0) | 8439.888 | 0.185 | (1, 0, 0) | 632.066 | 0.064 | (1, 0, 0) | 32,556.51 | 1.236 |
(1, 1, 0) | 7749.470 | 0.167 | (1, 1, 0) | −699.093 | 0.055 | (1, 1, 0) | 31,738.12 | 1.083 |
(0, 1, 0) | 9041.606 | 0.194 | (0, 1, 0) | 859.627 | 0.065 | (0, 1, 0) | 33,092.29 | 1.224 |
(0, 0, 1) | 14646.12 | 0.444 | (0, 1, 1) | −862.518 | 0.054 | (0, 1, 1) | 31,186.21 | 1.055 |
(1, 0, 1) | 6959.456 | 0.155 | (0, 2, 1) | 871.574 | 0.065 | (1, 0, 1) | 31,080.03 | 1.050 |
(0, 1, 1) | 7069.868 | 0.156 | (1, 0, 1) | −921.965 | 0.054 | (1, 1, 2) | 31,177.21 | 1.056 |
(2, 0, 0) | 7463.678 | 0.163 | (1, 1, 2) | −1218.69 | 0.052 | (2, 0, 0) | 31,491.65 | 1.085 |
(2, 1, 0) | 7283.491 | 0.158 | (2, 0, 0) | −793.891 | 0.055 | (2, 1, 0) | 31,333.62 | 1.030 |
(2, 1, 1) | 7039.183 | 0.156 | (2, 1, 0) | −955.677 | 0.054 | (2, 1, 1) | 31,178.38 | 1.057 |
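The selection rule "lowest AIC, then lowest test MSE" can be applied mechanically to the candidates above. The sketch below does this for the SMS columns, with the values copied from the table; the helper name `best_order` is hypothetical, and using the test MSE as a tie-breaker is an assumption of this sketch.

```python
from typing import Dict, Tuple

Order = Tuple[int, int, int]

# (AIC, test MSE) for each candidate SMS model, copied from the table above.
sms_candidates: Dict[Order, Tuple[float, float]] = {
    (1, 0, 0): (8439.888, 0.185),
    (1, 1, 0): (7749.470, 0.167),
    (0, 1, 0): (9041.606, 0.194),
    (0, 0, 1): (14646.12, 0.444),
    (1, 0, 1): (6959.456, 0.155),
    (0, 1, 1): (7069.868, 0.156),
    (2, 0, 0): (7463.678, 0.163),
    (2, 1, 0): (7283.491, 0.158),
    (2, 1, 1): (7039.183, 0.156),
}


def best_order(candidates: Dict[Order, Tuple[float, float]]) -> Order:
    """Return the order with the lowest AIC; ties are broken by test MSE."""
    return min(candidates, key=lambda order: candidates[order])


selected = best_order(sms_candidates)
```

For the SMS data this rule picks the (1, 0, 1) model, which also has the lowest test MSE among the candidates.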

The LSTM model is implemented in Keras and contains one LSTM layer with 128 units, followed by three Dense layers with 128, 256, and 64 units, respectively. The prediction results are shown in Figures 8-10. The LSTM predicts the traffic with only a small difference between the actual and predicted values.

In this section, we evaluate the accuracy of our ARIMA and LSTM models in predicting the network traffic datasets and compare the models by their prediction error. We use the Root-Mean-Square Error (RMSE) as the assessment metric. RMSE is the square root of the average squared residual between the actual and predicted values, and can be computed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2} \tag{12}$$

In this equation, $N$ is the sample size, $x_i$ is the actual value, and $\hat{x}_i$ is the predicted value. A smaller RMSE value indicates less error between the predicted and observed values. As shown in

Traffic type | ARIMA RMSE | LSTM RMSE |
---|---|---|
SMS | 0.39 | 0.33 |
Call | 0.23 | 0.19 |
Internet | 1.02 | 1.05 |
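The comparison can be reproduced with a few lines of code. The sketch below implements Equation (12) and then picks the model with the lower RMSE for each traffic type, using the values reported in the table above; the function and variable names are hypothetical.

```python
import math
from typing import Dict, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Equation (12): square root of the mean squared residual."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# RMSE values copied from the table above.
reported: Dict[str, Dict[str, float]] = {
    "SMS": {"ARIMA": 0.39, "LSTM": 0.33},
    "Call": {"ARIMA": 0.23, "LSTM": 0.19},
    "Internet": {"ARIMA": 1.02, "LSTM": 1.05},
}

# For each traffic type, keep the model with the smaller RMSE.
winners = {traffic: min(scores, key=scores.get) for traffic, scores in reported.items()}
```

On these figures LSTM has the lower error for SMS and call traffic, while ARIMA is slightly better for Internet traffic.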

Based on the prediction of traffic volumes, ISPs can allocate resources on-demand. Furthermore, ISPs can promote a “smart price” strategy to balance the traffic volumes in the temporal dimension. We have analyzed the percentage of SMS cost, call cost, and Internet cost in total cost respectively through the data set from Liantong ISP. From

This paper analyzes the traffic volumes of SMS, call, and Internet activity through time series analysis. Based on the traffic patterns, we use ARIMA and LSTM models to predict traffic, and we compare and evaluate the performance of both models. The deep learning-based algorithm achieves a lower RMSE than the traditional statistical algorithm for SMS and call traffic (0.33 versus 0.39 and 0.19 versus 0.23), while ARIMA is marginally better for Internet traffic (1.02 versus 1.05). The developed models could be used by ISPs to improve network management and services. This study also proposes a smart pricing strategy that helps ISPs manage on-demand resources and dynamically adjust prices at different times of the day. Our smart pricing scheme could help ISPs increase their profit by balancing resource utilization and providing better services.

I would like to express special gratitude to my instructor, Ms. Xu, for her valuable and constructive suggestions during the planning and development of this research work. Her willingness to give her time so generously has been very much appreciated.

The author declares no conflicts of interest regarding the publication of this paper.

Su, T.Y. (2021) Smart Network Price Policy for ISP Based on Traffic Prediction. Journal of Mathematical Finance, 11, 1-14. https://doi.org/10.4236/jmf.2021.111001