Enabling IoT Network Slicing with Network Function Virtualization

Numerous Internet of Things (IoT) devices are being connected to networks to offer services. To cope with the large diversity and number of IoT services, operators must meet those needs with a more flexible and efficient network architecture. Network slicing in 5G promises a feasible solution to this issue through the network virtualization and programmability enabled by NFV (Network Functions Virtualization). In this research, we use virtualized IoT platforms as the Virtual Network Functions (VNFs) and customize NFV-enabled network slices with different QoS to support various kinds of IoT services for their best performance. We construct three different slicing systems: 1) a single slice system, 2) a multiple customized slices system and 3) a single but scalable network slice system to support IoT services. Our objective is to compare and evaluate these three systems in terms of their throughput, average response time and CPU utilization in order to identify the best system design. As validated by our experiments, the multiple slicing system outperforms both single slice systems, whether or not they are equipped with scalability.

IoT network slicing is enabled by NFV based on the MANO framework. We run virtualized IoT platforms as our VNFs and customize network slices through our NSD (Network Service Descriptor) to support IoT services of various QoS.
We propose to first customize each network slice with a different bandwidth to handle different types of IoT services. In addition, because the IoT platforms are virtualized, we can scale their instances out and in rapidly and dynamically to support variations in service load [8]. Hence, three different slicing systems are constructed for our research. The first system consists of only a single network slice for all IoT services, while the second one consists of three customized slices to handle each type of IoT traffic separately. The last system is similar to the first one but can scale VNFs out and in on the slice. To evaluate the performance of each system, we design a Traffic Generator to simulate three types of IoT services with different bandwidth requirements. We show the advantages and disadvantages of each system and articulate the resulting performance tradeoffs.
The rest of the paper is organized as follows: Section 2 introduces the background information on oneM2M, the ETSI NFV architectural framework and network slicing. Section 3 presents our system design and system workflow in OpenStack. Section 4 describes the three systems and compares their performance evaluations. Section 5 presents our system design, implementation and evaluation in Kubernetes. Finally, Section 6 concludes the paper and outlines future work.

Background
In this section, we explain the oneM2M IoT platform we used in our system, the NFV architectural framework and the concept of network slicing.

ETSI NFV Architectural Framework
NFV MANO (Management and Orchestration) [11] is a framework developed by ETSI (the European Telecommunications Standards Institute) for the management and orchestration of all virtualized resources, including compute, network and storage. It is also responsible for managing the life cycle of VNFs, such as instantiation, scaling, update and termination. In addition, it manages the policies of network services, the collection and transfer of performance measurements, and the allocation of infrastructure resources. The NFV MANO framework is adopted in our research to construct the network slicing environment.
As illustrated in Figure 2, NFV MANO consists of three major components.
• NFV Orchestrator (NFVO), which is in charge of the lifecycle of Network Services (NSs) and responsible for onboarding Network Service Descriptors (NSDs).
• VNF Manager (VNFM), which is responsible for the lifecycle management of the VNFs, including VNF scaling out/in and their performance and fault management.
• Virtualized Infrastructure Manager (VIM), which controls and manages the compute, storage and network resources of the NFV infrastructure.

Network Slice
According to 3GPP [1], a network slice is an end-to-end network architecture. It consists of multiple network slice subnets, each representing a different component of an end-to-end network, such as the access network, core network or transport network. NFV MANO, as the key enabling technology of network slicing, maps each network slice subnet to an NS in MANO. Each NS is defined by an NSD that consists of a set of attributes and several constituent descriptors, including VNF Descriptors (VNFDs), Virtual Link Descriptors (VLDs) and VNF Forwarding Graph Descriptors (VNFFGDs) [12]. The attributes of the NSD specify how NS instances should be deployed. Network slicing enables the operator to divide a physical network into multiple virtual and logically independent end-to-end networks. Each network slice is tailored to fulfill different service requirements, such as delay, bandwidth, security and reliability, to cope with diverse network application scenarios [13]. Mobile operators can use network slicing to provide customized 5G networks to various vertical services based on the specific needs of each [14].
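As an illustration of the NSD structure described above, the sketch below models an NSD and its three kinds of constituent descriptors as a plain data structure. The field names and values are simplified for exposition and do not follow the actual TOSCA/Tacker schema.

```python
# Illustrative model of an NSD; field names and values are hypothetical,
# not actual TOSCA/Tacker syntax.
nsd = {
    "name": "iot-video-slice-ns",   # hypothetical NS name for one slice
    "vnfds": ["om2m-in-vnfd"],      # VNF descriptors (the virtualized IoT platforms)
    "vlds": [{"name": "slice-vl", "bandwidth_kbps": 1000}],  # virtual link descriptors
    "vnffgds": [],                  # forwarding graph descriptors (none used here)
}

def constituent_descriptors(nsd):
    """Return the three kinds of constituent descriptors of an NSD."""
    return nsd["vnfds"], nsd["vlds"], nsd["vnffgds"]
```

Customizing a slice then amounts to editing such a descriptor, e.g. changing the virtual link bandwidth per IoT service type before onboarding the NSD to the NFVO.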
• OpenStack is an open-source cloud operating system for virtualizing and managing resources including compute, network and storage. It provides multiple management services, such as Nova for compute, Neutron for networking [20], Cinder for storage, Keystone for identity, Horizon for the dashboard and Heat for orchestration. Other orchestrators can serve as the NFVO and VNFM, such as the Open Network Automation Platform (ONAP) [21], Open Source MANO (OSM) [22] and Open Baton [23]. Since Tacker is an official OpenStack project, it is highly compatible with the OpenStack VIM compared with the other open-source options. Moreover, its design has the advantage of simplicity, allowing users to deploy and operate it easily. Therefore, we adopt Tacker as the NFVO and VNFM in the NFV MANO framework.
In our system, OM2M IN instances are deployed as the VNFs, and each NS is composed of OM2M IN instances and a Load Balancer, to be introduced next.
We construct the three slicing systems for our research experiments to support IoT services including: 1) a single slice system, 2) a multiple customized slices system and 3) a single but scalable network slice system. Our objective is to compare and evaluate these three systems in terms of their throughput, average response time and CPU utilization in order to identify the best system design.

System Architecture
We first explain the functional blocks of our three systems, then show the Network Service lifecycle management flows and the system workflows. The general architecture of our systems is illustrated in Figure 3 and our three systems are depicted in Figure 4.
Note that we design three new system components Master Node, Load Balancer and Traffic Generator on top of OpenStack and Tacker open sources in order to complete our systems.
• Master Node is incorporated in the VNFM to monitor the CPU status of the VNFs on each network slice in order to trigger scale-out or scale-in actions [24]. When the average CPU usage of the VNFs exceeds an overload threshold, it triggers the scale-out of the VNFs on the network slice. Conversely, when the CPU usage falls below an underload threshold, it triggers the scale-in of the VNFs.
• Load Balancer is designed to dispatch the incoming traffic fairly to each VNF [25]. It is used only in the single slice scalable system (see Figure 4(c)) for distributing traffic. We use RabbitMQ [26], an open-source message broker implementing the Advanced Message Queuing Protocol (AMQP) [27], to design our load balancer based on Remote Procedure Calls (RPCs). In our system, Traffic Generator sends HTTP requests to Load Balancer, which forwards them to a load-balancing queue. On the other side, each OM2M VNF acts as a server that consumes requests from the queue and replies with a response back to Load Balancer.
• Traffic Generator is a multi-threaded program that we design to simulate three types of IoT traffic. It can set the number of each kind of ASN device and the frequency of sending data. The three types of IoT traffic generated are video, adaptive lighting and smart parking; each is a stream of HTTP requests.
1) Video: This simulates a security surveillance service enabled by a video camera. It provides monitoring services for road traffic and crowd movement. This service has the highest bandwidth demand among all three types of traffic.
2) Adaptive lighting: This simulates an adaptive lighting service in which a smart street light pole monitors weather conditions and adapts the brightness of street lighting based on inputs from temperature, humidity, air pollution and light sensors.
3) Smart parking: This simulates a smart parking service that monitors the availability of parking spaces based on geomagnetic sensors embedded in parking areas. This service has the lowest bandwidth requirement among the three types of IoT services.
Figure 4 shows the differences among the network slices in the three systems under study. In our experiment, Traffic Generator simulates the same three types of traffic, but the three systems handle the traffic in different ways.
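The RPC-style fair dispatch that Load Balancer performs can be simulated with the standard library as follows. The real system uses RabbitMQ over AMQP with the reply-queue RPC pattern; the queue, worker names and reply format below are illustrative assumptions.

```python
import queue
import threading

# Shared load-balancing queue: Load Balancer publishes requests here and
# each OM2M VNF worker consumes from it, mimicking RabbitMQ fair dispatch
# (one unacknowledged request per worker at a time).
lb_queue = queue.Queue()

def vnf_worker(vnf_name):
    """Each OM2M VNF consumes one request at a time from the shared queue
    and replies through the per-request reply queue (the RPC reply pattern)."""
    while True:
        item = lb_queue.get()
        if item is None:                    # shutdown sentinel
            break
        body, reply_q = item
        reply_q.put(vnf_name.encode() + b":OK:" + body)

def rpc_call(body):
    """Load Balancer side: enqueue the request and block until a VNF replies."""
    reply_q = queue.Queue()
    lb_queue.put((body, reply_q))
    return reply_q.get(timeout=5)
```

In the RabbitMQ implementation the same roles map onto a broker queue with `prefetch_count=1` and per-client `reply_to`/`correlation_id` properties, so an idle VNF instance always takes the next pending request.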
As depicted in Figure 4(a), the single slice system architecture has only one network slice with one IoT platform as a VNF. This single VNF must handle all types of IoT traffic. Figure 4(b) then shows the multiple slicing system architecture, which has three types of network slices. Each network slice is provisioned with a different customized bandwidth for dealing with a particular type of IoT service. Finally, Figure 4(c) shows the design of the single slice scalable system architecture, where only one network slice is provisioned but this slice supports scalability. Similar to the single slice system, its network slice must handle all three types of IoT traffic. Unlike the single slice system, however, it is capable of scaling its VNF to multiple instances and is thus equipped with a Load Balancer to distribute IoT traffic evenly across the VNF instances. Because of Load Balancer, IoT traffic must go through an additional VNF, which may result in a longer response time than in the other systems. There are four phases to run a network slicing system.

Network Service Lifecycle Management Flows
In the preparation phase, we set up the environment by first registering an OpenStack VIM to Tacker.

System Workflow for Scaling
The workflow of our system for scaling is shown in Figure 6. This scaling mechanism is only used by the single slice scalable system in our experiment. Note that when the average CPU usage of the VNFs falls below the scale-in threshold, the scale-in action is triggered. However, if there is only one VNF left in the NS, the scale-in action will not be triggered. In addition, if an action is already being executed, the next scale-out or scale-in action will not be triggered until the previous one ends.
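The scaling rules above can be condensed into a small decision function. The 50% and 10% thresholds follow the experiment setup in Section 4; the function and variable names are ours.

```python
# Scaling decision rules for the single slice scalable system:
# scale out when average CPU exceeds the overload threshold, scale in when
# it drops below the underload threshold, but never below one VNF and never
# while another scaling action is still running.
SCALE_OUT_THRESHOLD = 50.0  # percent average CPU, per the experiment setup
SCALE_IN_THRESHOLD = 10.0

def scaling_decision(cpu_samples, vnf_count, action_in_progress):
    """Return 'out', 'in', or None given per-VNF CPU usage samples (%)."""
    if action_in_progress:           # scaling actions must not overlap
        return None
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg > SCALE_OUT_THRESHOLD:
        return "out"
    if avg < SCALE_IN_THRESHOLD and vnf_count > 1:  # keep at least one VNF
        return "in"
    return None
```

In the system, Master Node evaluates this decision periodically from the monitored CPU status and asks the VNFM to execute the resulting action.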

Implementation and Evaluation in OpenStack
In this section, we show our test environment setup and experimental results.
Three types of traffic are simulated through the Traffic Generator designed to evaluate the performance of each system. The evaluation metrics are throughput, average response time and CPU utilization.
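The multi-threaded structure of Traffic Generator can be sketched as follows. The per-type thread counts, payload sizes and send intervals below are illustrative placeholders, not the actual values in Table 4, and the `send` callback stands in for posting an HTTP request to a slice.

```python
import threading
import time

# Hypothetical per-application profiles; the real values come from Table 4.
TRAFFIC_PROFILES = {
    "video":             {"threads": 4, "payload_bytes": 4096, "interval_s": 0.1},
    "adaptive_lighting": {"threads": 2, "payload_bytes": 512,  "interval_s": 0.5},
    "smart_parking":     {"threads": 1, "payload_bytes": 64,   "interval_s": 1.0},
}

def generate(app, profile, send, duration_s):
    """One sender thread: emit requests of a fixed payload size at a
    fixed frequency until the stage duration elapses."""
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        send(app, b"x" * profile["payload_bytes"])
        time.sleep(profile["interval_s"])

def run(send, duration_s):
    """Spawn the configured number of sender threads per application type."""
    threads = [
        threading.Thread(target=generate, args=(app, prof, send, duration_s))
        for app, prof in TRAFFIC_PROFILES.items()
        for _ in range(prof["threads"])
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Tripling the traffic load in the second experiment stage then corresponds to tripling the `threads` value of each profile before calling `run`.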

Test Environment Setup
Our test environment consists of two servers. Both Tacker and OpenStack are running on these two servers configured as shown in Table 1. Table 2 shows the virtual resource allocation of each VNF in our environment.

Experimental Results
In our experiment, we use Traffic Generator to simulate three types of traffic, with the expected traffic throughput of each type shown in Table 3.
The required bandwidth is set to twice the expected traffic throughput to absorb temporary excess traffic. For the single slice system and the single slice scalable system, we set the bandwidth to 1400 Kbps. For the multiple slicing system, the bandwidth limits are 1000 Kbps, 300 Kbps and 100 Kbps respectively, totaling 1400 Kbps, the same as the other two systems for fairness of comparison. The configuration of Traffic Generator in Table 4 meets the expected traffic throughput in Table 3.
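The provisioning arithmetic above can be made explicit. The per-slice expected throughputs below are inferred by halving the stated bandwidth limits (1000/300/100 Kbps), not read directly from Table 3.

```python
# Provisioning rule from the text: each slice's bandwidth limit is twice its
# expected traffic throughput, to absorb temporary bursts. Expected values
# are inferred from the stated limits, not quoted from Table 3.
EXPECTED_THROUGHPUT_KBPS = {
    "video": 500,
    "adaptive_lighting": 150,
    "smart_parking": 50,
}

def slice_bandwidth_kbps(expected_kbps):
    """Bandwidth limit provisioned for a slice."""
    return 2 * expected_kbps

def total_budget_kbps(expected):
    """Total bandwidth across all slices; it must match the 1400 Kbps
    budget of the single-slice systems for a fair comparison."""
    return sum(slice_bandwidth_kbps(v) for v in expected.values())
```

This keeps the comparison fair: 2 × (500 + 150 + 50) = 1400 Kbps, the same budget given to each single-slice system.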
To test each system, there are three stages in our experiments. The whole process takes a total of 240 seconds. The payload size of each request sent by Traffic Generator is based on the settings defined in Table 4. For the single slice scalable system, we follow the workflow for scaling illustrated in Figure 6, and set the scale-out threshold to 50% and the scale-in threshold to 10% as in [29] [30].
• In the first stage, we follow the configuration in Table 4 to send data to each system for 30 seconds. The requests of each application are sent with different frequencies and payload sizes.
• In the second stage, we triple the number of user threads shown in Table 4 and send data for 120 seconds to simulate higher traffic. For the scalable system, the scale-out action is triggered.
• In the final stage, we return to the same configuration as the first stage for 90 seconds. During this stage, the scalable system triggers the scale-in action back to its original status.
Table 5 shows the throughput of each application type in each stage. The three systems achieve similar throughput and reach the expected traffic throughput values shown in Table 3. Figure 7 shows the average response times of all applications in each system, and Figure 8 shows the total CPU utilization of the three systems. Integrating the information from these two charts, we conclude that the multiple slicing system achieves the best response time at all stages. Also, its total CPU utilization is only slightly higher than that of the single slice system, which is an acceptable tradeoff. Overall, the performance of the multiple slicing system is better than those of the single slice systems, whether or not they are equipped with scalability.
Comparing the single slice system with the single slice scalable system, the average response time of the system with scalability is better overall than that of the system without it. In the first stage, since traffic in the single slice scalable system must go through Load Balancer, which is an additional VNF, its response time is longer than that of the single slice system. However, when the traffic load increases in the second stage, the response time of the single slice scalable system is similar to that of the single slice system. Moreover, the single slice scalable system performs even better in the final stage, because it can deal with increasing traffic loads better than the single slice system. However, the total CPU utilization of the single slice scalable system is always higher than those of the other two systems due to the overhead of Load Balancer and scalability.
The average response times of each application type in the single slice system over all testing stages are shown in Figure 9. The results show that the response times of the application types under the same system architecture differ only slightly; similar results are exhibited in the other two systems. Figure 10 shows that the multiple slicing system achieves the lowest response time regardless of application type. We can thus conclude that the system architecture, rather than the application type, has the deeper impact on system performance.
According to the above results, we speculate that implementing horizontal scalability across the multiple slicing system may improve its performance and stability, which will be our future work. The research in [7] showed that network slicing can improve IoT/M2M scalability and fulfill different QoS requirements, which is also confirmed by our experimental results. However, the network slicing in [8] is based on SDN, while ours is based on NFV, which has been adopted by the upcoming 3GPP 5G architecture.

System Design, Implementation and Evaluation in Kubernetes
In this section, we report our research results of building the NFV MANO framework with Tacker as NFVO/VNFM and Kubernetes [31] as VIM. In this design, OM2M IN instances are deployed as the containerized VNFs.
Kubernetes is an open-source system for automating application deployment, scaling, and management. It provides a platform for deploying, managing, and scaling containerized applications across clusters of hosts. It works with a variety of container tools, including Docker.
We construct only two slicing systems in this experiment: 1) a single slice system and 2) a multiple slicing system. At the end of this section, we compare and evaluate these two systems in terms of their average response time and CPU utilization.
Figure 11 shows the general architecture of the two slicing systems. Each functional block of our systems has been presented before, and Tacker, which is utilized as the NFVO and VNFM, has been introduced in Section 3. The only difference is that we now use Kubernetes instead of OpenStack as the VIM.

System Architecture
Traffic Generator, our own design, simulates the same three types of traffic as in our OpenStack experiments. As depicted in Figure 12(a), the single slice system has only one network slice with one IoT platform as a containerized VNF; this single slice deals with all types of IoT traffic. Figure 12(b) shows the architecture of the multiple slicing system, where each network slice handles a specific type of IoT service.

Test Environment Setup
Our test environment consists of two servers. Tacker and Kubernetes are each running on a server configured as shown in Table 6. Table 7 shows the virtual resource allocation of each containerized VNF in our environment.

Experimental Results
In this experiment, we use Traffic Generator to simulate three types of traffic and send the HTTP requests to each containerized VNF on the network slice.
For the single slice system, we send all three types of traffic to the OM2M IoT platform directly. For the multiple slicing system, each type of IoT traffic is sent to the containerized VNF on the corresponding network slice. The configuration of Traffic Generator is the same as the one used for OpenStack, as shown in Table 4 of Section 4. It generates the expected traffic throughput required for each application type in this experiment, as shown in Table 8.
To test each system, there are three stages in our experiment. The whole process takes a total of 90 seconds. The payload size of each request sent by Traffic Generator will be based on the settings defined in Table 4 of Section 4.
• In the first stage, we will follow the configuration in Table 4 to send data to each system for 30 seconds. The requests of each application will be sent with different frequency and payload size.
• In the second stage, we triple the number of user threads as shown in Table 4 and send data for 30 seconds. We simulate higher traffic in this stage.
• In the final stage, we return to the same configuration as the first stage for 30 seconds. During this stage, the systems approach stability.
Because Kubernetes has its own scaling functions and scalability policy, we only construct a single slice system and a multiple slicing system. Also, the duration of this experiment with Kubernetes as the VIM differs from the previous one with OpenStack as the VIM: since there was no need to test scalability, we shortened the total time of the experiment.
Figure 13 shows the average response times of all applications in each system. The multiple slicing system achieves better response time at all stages, although the response times of the two systems are similar in the first and third stages. When the traffic load increases in the second stage, the response time of the multiple slicing system is half that of the single slice system. On the other hand, as depicted in Figure 14, the CPU utilization of the multiple slicing system is always higher than that of the single slice system in all three stages because it has three network slices handling different services. Note that the CPU utilizations of both systems peak in the second stage due to the highest traffic load. Also, because the first stage is the warm-up stage, the CPU utilizations of both systems are higher than in the final stage, when the systems become stable.
Integrating the information from these two charts, we conclude that the performance of the multiple slicing system is better in general as its total CPU utilization is only slightly higher than that of the single slice system but it can achieve faster response time than the single slice system.

Conclusions and Future Work
In this paper, we propose three different slicing systems enabled by NFV based on the MANO framework: 1) a single slice system, 2) a multiple customized slices system and 3) a single but scalable network slice system to support IoT services. We utilize several open-source projects, including OpenStack, Tacker, Kubernetes, OM2M and RabbitMQ, to construct our systems. To support different kinds of IoT services, we customize each network slice with a specific QoS. Moreover, we design a Master Node to monitor the CPU usage of each VNF and scale VNFs on the slice out or in according to this information. Also, Load Balancer is designed for the single slice scalable system to dispatch traffic fairly.
In our experiments, we design Traffic Generator to simulate three types of IoT traffic: video, adaptive lighting and smart parking. The test traffic consists of three stages with different traffic loads. We measure the average response time and the CPU utilization of these three systems to identify the best system design. Comparing the results of the three systems, the multiple slicing system has the best performance among them. In addition, the single slice system with scalability is more stable than the one without scalability, at the cost of higher CPU utilization.
Combining the results of the two experiments, the multiple slicing system is the best system design. Although we only constructed the first two systems in our experiment with Kubernetes as the VIM, the results likewise show that the performance of the multiple slicing system is better than that of the single slice system.
In the future, we plan to construct a network slicing system with vertical scalability that adapts to changing QoS requirements dynamically. We also plan to experiment with horizontal scalability across multiple slices rather than just on a single slice. Moreover, constructing a hybrid system of horizontal and vertical scalability to meet more diverse requirements of IoT services is also a potential future research direction [32].