Early Detection of QoE Deterioration in HTTP Adaptive Streaming

Apart from video rate (or requested bitrate), Mean Opinion Score (MOS) has increasingly become a primary indicator of Quality of Experience (QoE) in HTTP adaptive streaming (HAS). By monitoring this metric, QoE management can effectively maximize QoE for users. However, because commercial HAS players behave differently, the choice of an appropriate monitoring interval has not yet been fully investigated. In this paper, the optimal interval is proposed to be equal to the duration of a video chunk, in order to help service managers detect QoE deterioration early and limit the probability of video rate deterioration. The optimal monitoring interval is evaluated by comparing it with other interval values in terms of the ratio of video rate deterioration. Furthermore, the MOS-based QoE monitoring method that uses the proposed interval is compared with a video rate-based monitoring method. The results show that, with the optimal interval, MOS monitoring guarantees a low ratio of video rate deterioration (around 10% for buffering state and 40% for steady state) and a small average CPU load (about 11.45%).


Introduction
Video has become the most dominant application on the Internet. According to [1], video traffic is predicted to account for about 90 percent of global IP traffic by 2019. Recently, HTTP adaptive streaming has been introduced as a promising delivery technique for Over-The-Top (OTT) service providers such as YouTube and Netflix. The OTT service market has expanded more rapidly than ever; thus, it is necessary to maintain a particular video quality level for users. In [...] indicator. Thereby, the monitoring interval has to be large enough to avoid high computing cost and small enough to detect video rate deterioration early.
Unfortunately, such a monitoring interval has not yet been carefully considered. In an effort to establish the playback buffer and maintain it during the streaming session, a HAS player changes its behavior from buffering state to steady state. Thus, it is not easy to determine an appropriate interval for MOS monitoring by which a desirable video rate can be maintained. Relying on the behavior of the playback buffer, this paper deduces that the optimal interval is equal to the size of a video chunk (each commercial HAS player may define a different chunk size). As a result, the probability of video rate deterioration reaches its smallest value of 10% within steady state, whereas the average CPU load is about 11.45%.
The rest of the paper is organized as follows: Section 2 provides an overview of background knowledge; Section 3 reviews related works; Section 4 describes the proposal; Section 5 presents the evaluation results of the proposed method; and Section 6 concludes the paper and states future work.

Background Knowledge
[...]trate Selection (ABR) [5]. The general framework of ABR is composed of three subcomponents: resource estimation, chunk request scheduling and adaptation, as shown in Figure 1. Meanwhile, Figure 2 shows that there are two main states during a HAS session: buffering state and steady state. In the buffering state (or convergence time), the HAS player attempts to establish the playback buffer as quickly as possible by continuously requesting video chunks, starting from the lowest video rate.
Once a certain amount of content has been downloaded or the playback buffer reaches a predefined target (denoted $B_{max}$), the steady state (or periodic download) is activated. In this phase, the HAS player attempts to maximize video rate by keeping the playback buffer stable at $B_{max}$. To do so, the player downloads a chunk and then pauses for a short time before downloading the next chunk. The download period and the pause period are called the ON and OFF periods, respectively. Note that when a stimulus occurs (in this paper, a stimulus is understood as a reduction of available bandwidth), the buffering state will be [...]

where $V$ is the video chunk size (in seconds). Commercial HAS players use different values of $V$.
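The buffering and steady states described above can be illustrated with a minimal sketch. It assumes a constant available bandwidth and video rate, playback that drains the buffer in real time, and a hypothetical target of $B_{max}$ = 10 s; the function name `simulate_buffer` and its parameters are illustrative and do not correspond to any player's actual implementation.

```python
def simulate_buffer(bandwidth_kbps, rate_kbps, chunk_s, b_max_s, n_chunks):
    """Simulate the playback buffer chunk by chunk.

    Returns one (state, on_s, off_s, buffer_s) tuple per downloaded chunk.
    """
    buffer_s = 0.0
    state = "buffering"
    trace = []
    for _ in range(n_chunks):
        # ON period: time needed to download one chunk of chunk_s seconds.
        on_s = chunk_s * rate_kbps / bandwidth_kbps
        # Playback drains the buffer during the download, then the chunk is added.
        buffer_s = max(buffer_s - on_s, 0.0) + chunk_s
        if buffer_s >= b_max_s:
            state = "steady"
        # OFF period: in steady state the player idles until the buffer
        # has drained back to the target level b_max_s.
        off_s = max(buffer_s - b_max_s, 0.0) if state == "steady" else 0.0
        buffer_s -= off_s
        trace.append((state, on_s, off_s, buffer_s))
    return trace


# 5120 Kbps, 2962 Kbps and V = 2 s are values quoted in this paper;
# b_max_s = 10 is an assumed target used only for illustration.
trace = simulate_buffer(5120, 2962, chunk_s=2.0, b_max_s=10.0, n_chunks=20)
for state, on_s, off_s, buffer_s in trace:
    print(f"{state:9s} ON={on_s:.3f}s OFF={off_s:.3f}s buffer={buffer_s:.3f}s")
```

Under these assumptions, once the steady state is reached the ON and OFF periods add up to one chunk duration $V$, so a new chunk request is issued every $V$ seconds.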

Related Works
In order to maximize video rate, the perceived video quality defined by QoE has to be monitored frequently. According to [9], QoE influence factors are categorized into technical and perceptual groups. This paper considers QoE in the technical category, particularly with respect to the adaptation logic. Three parameters can be considered as QoE monitoring indicators: video rate, playback buffer and QoS (mapped to MOS via a QoS/QoE model).
In [10] [11], the video rate was monitored to evaluate the performance of the proposed video quality adaptation schemes. By monitoring video rate, the authors confirmed that their schemes successfully improved QoE. In [8], the authors stated that it always takes time for the video rate to adapt to the network condition.
Consequently, the long deterioration time of the video rate caused a late control action. A similar consequence can be found in playback buffer-based monitoring methods [12] [13], because the playback buffer is also captured on a chunk-by-chunk basis. QoS mapped to MOS was effectively applied as [...] [17], including our study [18]. The literature shows that with MOS, QoE management can be done in an automatic and accurate way. Furthermore, if diversity in users' ratings and psychological factors is ignored, MOS is well suited for quickly predicting video rate deterioration [19] [20]. However, when the monitoring interval of the estimated MOS is too small, the CPU load of the Controller (where QoE management is deployed) becomes higher. Moreover, in some cases QoE control will be triggered at the wrong time because of the time-varying nature of available resources (e.g., available bandwidth). This leads to the need for an optimal MOS monitoring interval.
In this paper, the optimal monitoring interval is proposed to be equal to the size of a video chunk. With the optimal monitoring interval, the MOS monitoring method is expected to detect video rate reduction early with low computation cost.
As a result, the control action is performed early. The effectiveness of the optimal interval is evaluated using the following criteria: ratio of video rate deterioration, average CPU load, detection time, and recovery time.

MOS Estimation Model
Mean Opinion Score (MOS) is defined as the "value on a predefined scale that a subject assigns to his opinion of the performance of a system", and it is understood as the average of evaluation scores across subjects [20]. According to [21], MOS can be used with either a 9-point or a 5-point scale. There are three methods to assess perceived video quality in HAS video streaming. The first is subjective assessment, in which subjects are asked to rate the video that they watched, yielding a subjective MOS. This method accurately represents the perceived video quality. However, it is costly in terms of time and human resources. Furthermore, it cannot be used for real-time QoE assessment. The second is objective assessment, in which MOS is calculated through related equations. Although this is a low-cost method, its accuracy is quite low.
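As a small illustration of the subjective definition above, MOS can be computed as the arithmetic mean of the subjects' ratings. The sketch below assumes a scale that starts at 1 and ends at `scale_max` (5 by default); the function name and the sample ratings are hypothetical and are not taken from the experiments in this paper.

```python
from statistics import mean


def mean_opinion_score(ratings, scale_max=5):
    """Average the ratings of all subjects on a 1..scale_max scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > scale_max for r in ratings):
        raise ValueError(f"ratings must lie between 1 and {scale_max}")
    return mean(ratings)


# Hypothetical ratings given by five subjects on the 5-point scale.
print(mean_opinion_score([5, 4, 4, 3, 5]))
```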

Proposed Optimal Monitoring Interval
The aim of proposing an optimal interval is to maximize QoE by avoiding video rate deterioration when a stimulus occurs. According to [5], video rate selection can be represented as a function $R(t)$ according to which the video rate is selected. A typical $R(t)$ takes various parameters as input, for example, available bandwidth, playback buffer and power level. In this research, the available bandwidth and the playback buffer were investigated in order to determine the optimal monitoring interval. Other parameters are out of scope. Some commercial HAS players apply available bandwidth-based estimation in their video rate selection. Traditionally, the estimation is done per chunk, which shows a large variation. To overcome this problem, a running average of the available bandwidth is taken into account. Let $C_t$ denote the system capacity (available bandwidth) at time $t$, and let $R_t$ be the video rate at time $t$. To ensure that the video rate will not decrease, the following condition is given as Equation (3): [...] Moreover, in order to guarantee that an expected encoding rate will be requested by the HAS player, $C_t$ should meet the condition in Equation (4) (this equation is actually transformed from Equation (1)): [...] Once the conditions in Equation (3) and Equation (4) are met during a streaming session, the video rate will be maximized. However, the threshold has not been clearly specified by the various proprietary HAS players. Instead of determining that threshold, another condition related to the playback buffer is considered. A simple experiment was performed to determine the behavior of the playback buffer and the video rate when a stimulus occurs.
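The difference between a per-chunk estimate and a running average of the available bandwidth can be sketched as follows. This is only an illustration: the text does not specify which averaging scheme a given player uses, so a simple moving average over the last `window` chunks is assumed here, and the class name and throughput samples are hypothetical.

```python
from collections import deque


class RunningAverage:
    """Simple moving average over the most recent per-chunk throughputs."""

    def __init__(self, window):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def update(self, throughput_kbps):
        """Add one per-chunk measurement and return the smoothed estimate."""
        self.samples.append(throughput_kbps)
        return sum(self.samples) / len(self.samples)


# Hypothetical per-chunk throughputs with one sharp drop.
estimator = RunningAverage(window=3)
for measured in (5000, 5200, 1000, 5100):
    smoothed = estimator.update(measured)
    print(f"per-chunk={measured} Kbps, running average={smoothed:.1f} Kbps")
```

The smoothed estimate varies less than the per-chunk value, but it also follows a sudden bandwidth drop more slowly, which is one reason the video rate can react late to a stimulus.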
The experiment's scenario is as follows: a HAS player is playing a movie at a high video rate under good network conditions, in which the available bandwidth is high (around 5120 Kbps) and packet loss, delay and jitter are negligible. The behaviors of both the buffer occupancy level and the video rate are observed when:
1) The available bandwidth is dramatically decreased to 1024 Kbps at t = 20 s (before the playback buffer reaches $B_{max}$).
2) The available bandwidth is dramatically decreased to 1024 Kbps at t = 60 s (after the playback buffer reaches $B_{max}$).
Two metrics were considered in this experiment, t_delay_buffer and t_delay_bitrate, that is, the time until the first adaptation (or first change) of the playback buffer and of the video rate, respectively. The experimental setup was as follows: there were three major entities, namely a client, a streaming server and a router. A Microsoft Smooth Streaming player and a packet sniffer (Wireshark) were deployed at the client; Wireshark allows us to capture the traffic to and from the HTTP server and analyze it offline. The router, a WAN emulator, is capable of controlling the available bandwidth of the client. During the experiment, the video rate was derived from the HTTP GET packet header, whereas the playback buffer was calculated through Equation [...] Based on Equation (2) in the background knowledge section, we have: [...] Thus, (4) and (6) now become the conditions that prevent the video rate from deteriorating within a streaming session. This means that the stimulus has to be captured within [...] or before the HAS player sends the next request. Therefore, the optimal monitoring interval is proposed to be $t_{mon} = V$.
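The two delay metrics can be extracted from a time series with a short helper. The sketch below is a hedged illustration rather than the authors' measurement script: the function name `first_decrease_delay` and the intermediate samples are hypothetical, and only the stimulus at t = 20 s and the drop to 2056 Kbps at t = 32.46 s are values reported in this paper.

```python
def first_decrease_delay(samples, t_stimulus):
    """Return the delay from the stimulus to the first observed decrease.

    samples: list of (time_s, value) pairs sorted by time.
    Returns None when no decrease is observed after the stimulus.
    """
    previous = None
    for time_s, value in samples:
        if previous is not None and time_s >= t_stimulus and value < previous:
            return time_s - t_stimulus
        previous = value
    return None


# Video rate samples in Kbps; the entries at 18 s and 22 s are made up.
video_rate = [(18.0, 2962), (22.0, 2962), (32.46, 2056)]
t_delay_bitrate = first_decrease_delay(video_rate, t_stimulus=20.0)
print(f"t_delay_bitrate = {t_delay_bitrate:.2f} s")
```

The same function can be applied to buffer occupancy samples to obtain t_delay_buffer.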

Evaluation
The purpose of this evaluation is to verify how well the proposed interval helps maintain the video rate level when the network condition is getting worse. More concretely, with the optimal interval of MOS monitoring applied, the following metrics were evaluated: [...]

1) [...] installed. The server and the client were located in different broadcast domains and were connected via the router. The network topology used for the experiments is shown in Figure 3. In addition, Wireshark, a network packet analyzer installed at the router, captured the HTTP requests from the client. Note that MSS (Microsoft Smooth Streaming) uses $V$ = 2 s during a streaming session [8]; thus, in our experiment, the optimal interval of 2 s was evaluated.
2) A stimulus is generated in buffering state and in steady state by decreasing the available bandwidth on purpose to make the network quality deteriorate (from 5120 Kbps to 1024 Kbps).
3) The packet loss, delay and jitter in the network and the average CPU load of the Controller (where QoE monitoring and QoE control are performed) are observed.
4) The deterioration is detected by observing the estimated MOS.
5) The available bandwidth to the user is immediately increased to recover the network quality when the deterioration of video rate is detected (from 1024 Kbps to 5120 Kbps).

Figure 3. Experimental setup for evaluating the optimal monitoring interval with three evaluation metrics.
The ratio of video rate deterioration is the ratio of the number of times the video rate decreases to the total number of times the experiment is repeated.
Meanwhile, the average CPU load is the mean of the CPU load of the Controller over the experiment's iterations. Specifically, for each value of $t_{mon}$, the above procedure was repeated 10 times in total. Suppose that, within these 10 runs, the video rate decreases $n$ times ($n \le 10$) even though the control action has already been generated. Then, the ratio of video rate deterioration, $n/10$, was calculated for each value of $t_{mon}$. In addition, the average CPU load of the Controller for each interval was recorded.
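The two evaluation metrics defined above reduce to simple averages over the repeated runs. The following sketch shows the arithmetic for one value of $t_{mon}$; the function names and the run outcomes and CPU load values are hypothetical placeholders, not measured data.

```python
def deterioration_ratio(run_deteriorated):
    """Ratio n / N, where n is the number of runs in which the video rate decreased."""
    return sum(run_deteriorated) / len(run_deteriorated)


def average_cpu_load(cpu_loads_percent):
    """Mean CPU load of the Controller over all runs."""
    return sum(cpu_loads_percent) / len(cpu_loads_percent)


# Hypothetical outcome of 10 runs for one monitoring interval:
# True means the video rate decreased despite the control action.
runs = [False, False, True, False, False, False, False, False, False, False]
loads = [11.0, 12.0, 11.5, 11.2, 11.8, 11.4, 11.6, 11.3, 11.1, 11.7]

print("ratio of video rate deterioration:", deterioration_ratio(runs))
print("average CPU load (%):", round(average_cpu_load(loads), 2))
```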
Figure 4 compares the ratio of video rate deterioration for monitoring intervals varying from 1 s to 3.5 s in both buffering state and steady state. It is clear that the ratios increased significantly when $t_{mon} > 2$ s.
Overall, a much higher percentage of video rate deterioration was seen in buffering state than in steady state, and the ratio grew faster in buffering state. As explained in the background knowledge section, during the buffering state the HAS player attempts to fill the playback buffer as quickly as possible, whereas during the steady state buffer occupancy is stable at $B_{max}$. Therefore, the video rate becomes more sensitive to a stimulus within [...] Particularly, during the buffering state, a clear increasing trend was seen in the ratio of video rate deterioration when the monitoring interval was larger than 2 s. A slight fluctuation was found in the range between 1.5 s and 2 s. However, this fluctuation did not always occur when the whole procedure was repeated several times. Interestingly, the ratio reached a peak of 100% when the monitoring interval was larger than 3.2 s.
When the monitoring interval was varied from 1 s to 2 s during steady state, the ratio of video rate deterioration was stable at its lowest value of 0.1.
However, when the monitoring interval was larger than 2 s, the ratio rose quickly to 0.6 before fluctuating widely in the range between 2.5 s and 3.5 s. This fluctuation is explained by a performance limitation of our QoE management algorithm: the algorithm frequently called the PSQA model (written in MATLAB), which could generate "spikes" in the Controller's processing time. In fact, this abnormal fluctuation was not seen when the experiment procedure was repeated several times.
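One simplified way to see why the ratio rises once $t_{mon}$ exceeds $V$ = 2 s is to count how many chunk periods contain no monitoring instant at all, since a stimulus in such a period cannot be captured before the next chunk request. The sketch below is an idealized illustration, not the authors' model: it assumes that chunk requests occur exactly every $V$ seconds and that monitoring instants are aligned at multiples of $t_{mon}$, and the function name is hypothetical.

```python
def uncovered_chunk_periods(t_mon_ms, chunk_ms, duration_ms):
    """Count chunk periods [k*V, (k+1)*V) that contain no monitoring instant.

    Monitoring instants are assumed to fall at 0, t_mon, 2*t_mon, ...
    Integer milliseconds avoid floating-point drift for values such as 1.5 s.
    """
    uncovered = 0
    for start in range(0, duration_ms, chunk_ms):
        end = start + chunk_ms
        # First monitoring instant at or after the start of this chunk period.
        first_sample = -(-start // t_mon_ms) * t_mon_ms
        if first_sample >= end:
            uncovered += 1
    return uncovered


# V = 2 s and a 120 s session, as in the experiments of this paper.
for t_mon_ms in (1000, 1500, 2000, 2500, 3000, 3500):
    missed = uncovered_chunk_periods(t_mon_ms, 2000, 120_000)
    print(f"t_mon = {t_mon_ms / 1000} s: {missed} chunk periods without a sample")
```

Under these assumptions every chunk period is observed at least once as long as $t_{mon} \le V$, whereas larger intervals leave some periods unobserved.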
A reasonable decreasing trend in the average CPU load can be seen in the graph.
The smaller the monitoring interval, the higher the average load over time. In this experiment, the computer's performance was not particularly high and the number of users was small, so the average CPU load was not a serious problem. However, it will become more serious when the number of users is extremely large. Interestingly, the average CPU load line crosses the line of the ratio of video rate deterioration for the steady state at the interval of 2 s, at which the average CPU load was 11.45% and the ratio of video rate deterioration was about 0.1.
For the detection time and recovery time criteria, MOS monitoring with the defined optimal interval was compared with the video rate-based method. The experimental procedure for the two scenarios of the evaluation was as follows:
1) A client starts watching streaming video content.
2) The available bandwidth is reduced on purpose to make the network quality deteriorate.
3) The packet loss, delay and jitter in the network are observed.
4) The deterioration is detected by observing the video rate and the estimated MOS.
5) The available bandwidth to the user is increased to recover the network quality when the deterioration of the video rate (in the first scenario) or of the estimated MOS (in the second scenario) is detected.
Initially, the capacity of the link from the router to the server was set to 5120 Kbps.
Because there was only one client in the network, the link capacity was equivalent to the available bandwidth of the client. The experiment time was 120 seconds for each scenario. At t = 20 s, t = 60 s and t = 90 s, the available bandwidth of the client was set to a low level of 1024 Kbps. During streaming sessions, the video rate was continuously captured, whereas the estimated MOS was monitored every $t_{mon} = 2$ s.
Figure 5 and Figure 6 show the results of the experiment in both scenarios. As seen from both graphs, the video rate reached its highest value of 2962 Kbps at around t = 10 s. In Figure 5 [...] s, and even stayed at that level, although the router had increased the available bandwidth to 5120 Kbps. To make matters worse, when the available bandwidth was reduced to 1024 Kbps at t = 90 s, the video rate decreased further.
In Figure 6, after the available bandwidth was reduced to 1024 Kbps at t = 20 s, t = 60 s and t = 90 s, MOS quickly decreased to around 2.75. These deteriorations were detected at t = 23.90 s, t = 62.90 s and t = 92.90 s, respectively. The router then increased the available bandwidth to 5120 Kbps, and thus the estimated MOS quickly returned to 5 at t = 26.90 s, t = 65.90 s and t = 95.90 s. Unlike in Figure 5, no deterioration in video rate was seen until t = 95.69 s. Even then, the video rate deteriorated only for a short time, from t = 95.69 s to t = 98.49 s, and then recovered to its original highest value. This is because the estimated MOS detects the network quality change quickly, so the available bandwidth can be adjusted immediately.
It can be seen that the video rate always takes a long time to adapt to the available bandwidth compared to the estimated MOS. This is because the video rate does not change immediately when the network quality changes. In fact, the player reacts not to the latest fragment download throughput but to a smoothed estimate of those measurements, which can be unrelated to the current available bandwidth conditions. In particular, in Figure 5, when the available bandwidth was decreased, the video rate deterioration could be detected only about 12.46 s later. Meanwhile, Figure 6 shows a short reaction time for the estimated MOS: it took only about 3.9 s to capture the deterioration of the estimated MOS. This means that, by using the optimal monitoring interval, the MOS-based method can detect video rate deterioration at least 8 s earlier (in terms of detection time $t_d$) than the video rate-based method. After the available bandwidth was controlled, the second scenario showed that the video rate remained unchanged or experienced only a short-term reduction (observed around t = 90 s). In contrast, in the first scenario, the video rate did not return to 2962 Kbps within several seconds; it took a large $t_r$ (around 15 s) to return, or did not return at all. This is because the playback buffer size is large enough to compensate for a negative "spike" in the available bandwidth. The small recovery time $t_r \le 4$ s of the video rate seen in the second scenario is meaningful for QoE management. In other words, the video rate is guaranteed to be maximized or kept stable at the desired level.
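The detection figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only timestamps reported in the text for Figure 5 and Figure 6; the variable names are illustrative.

```python
# Times (in seconds) at which the available bandwidth was reduced.
stimulus = [20.0, 60.0, 90.0]
# Times at which the MOS deterioration was detected and MOS returned to 5.
mos_detected = [23.90, 62.90, 92.90]
mos_recovered = [26.90, 65.90, 95.90]

# Detection delay of the MOS-based method for each stimulus.
mos_detection_delay = [round(d - s, 2) for s, d in zip(stimulus, mos_detected)]
# Time from detection until the estimated MOS is back at its highest value.
mos_recovery_delay = [round(r - d, 2) for d, r in zip(mos_detected, mos_recovered)]

# Video rate-based method: first drop observed at t = 32.46 s after the
# stimulus at t = 20 s (first scenario).
rate_detection_delay = round(32.46 - 20.0, 2)
gain = round(rate_detection_delay - max(mos_detection_delay), 2)

print("MOS detection delays (s):", mos_detection_delay)
print("MOS recovery delays (s):", mos_recovery_delay)
print("video rate detection delay (s):", rate_detection_delay)
print("MOS-based detection is earlier by (s):", gain)
```

The difference between 12.46 s and 3.9 s is 8.56 s, which is consistent with the statement that MOS-based detection is at least 8 s earlier.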

Conclusions
In this work, an optimal MOS monitoring interval was proposed. To sum up, a condition to maximize video rate was established by taking into account the influence of the playback buffer. In order to meet the condition, the monitoring interval was required to be equal to the size of a video chunk (in seconds). By applying this interval, a QoE management system can effectively maximize video rate during a streaming session. The effectiveness was demonstrated by early detection of video [...]

P. Xuan-Tan, E. Kamioka. DOI: 10.4236/jcc.2017.514002. Journal of Computer and Communications.

[...] fact, the perceived video quality is represented by QoE, the most important performance metric when video service providers aim to maximize the satisfaction of their users. In the last few years, researchers have proposed various QoE management models [2] [3] for monitoring and controlling QoE in HTTP adaptive streaming. In QoE monitoring, perceived quality is monitored by observing indicators. The indicators can be video rate, playback buffer or MOS. In HTTP adaptive streaming, video rate and playback buffer are typically obtained on a chunk-by-chunk basis. As such, they are always observed at long, irregular intervals. In other words, the observation interval depends on the time points at which the HAS player starts and finishes downloading chunks. As a result, once the network condition becomes worse, the control action is meaningless because the video rate has already decreased. Because it does not depend on a chunk-by-chunk basis, MOS becomes a promising monitoring [...]

Figure 1. ABR framework comprising three main components: resource estimation, request scheduling and adaptation module.

Figure 2. Buffering state and steady state in a streaming session.

Figure 4. Ratio of video rate deterioration within buffering state and steady state. Average CPU load was calculated across intervals.
[...] at t = 60 s, the video rate also reacted late. It decreased to 2056 Kbps at 71.99 [...]

[...] (2). Table 1 shows the sample dataset of the experiment with the two studied metrics, t_delay_buffer and t_delay_bitrate. The mean waiting times until the first negative adaptation of the playback buffer size and of the video rate are 5.76 s and 12.69 s, respectively. Moreover, during the experiment, it was interesting to find that the video rate usually decreases after the playback buffer has degraded at least twice. The results show that the playback buffer should be considered as a reference for deciding the monitoring interval of the estimated MOS. The playback buffer always reacts quickly to changes in the network condition. Thus, it can be used to predict video rate deterioration.

Table 1. Sample dataset with two metrics: t_delay_buffer and t_delay_bitrate.

[...] Recovery time $t_r$ of video rate, which represents the duration from when the control action is generated until the video rate recovers to the expected level.
[...], after the available bandwidth was reduced to 1024 Kbps at t = 20 s, the video rate decreased to 2056 Kbps at t = 32.46 s. The router was immediately controlled to increase the available bandwidth to 5120 Kbps. However, the video rate did not return to 2962 Kbps within several seconds. It stayed at 2056 Kbps for 15 s. When the available bandwidth was decreased [...]

Figure 5. Video rate requested by the user, available bandwidth and estimated MOS in the first scenario.

Figure 6. Video rate requested by the user, available bandwidth and estimated MOS in the second scenario.