Non-Intrusive Load Identification Model Based on 3D Spatial Feature and Convolutional Neural Network
1. Introduction
With the continuous innovation of load identification technologies, non-intrusive load monitoring (NILM) systems are expected to be widely used on both the power management side and the residential user side. NILM was proposed by Hart as early as the 1980s, and its core is the disaggregation of load power consumption [1]. Load identification is one of the important components of NILM, and many methods have been proposed for it. Commonly used methods include mathematical methods [2] [3], machine learning [4] [5], and optimization algorithms [6] [7]. With the rapid development of deep learning, many researchers have applied it to non-intrusive load identification and achieved excellent results. Reference [8] proposes a load identification method based on a recurrent neural network (RNN) model, which memorizes historical input features, establishes the internal relationship between the input and the output, and realizes reliable identification of load feature tags. Reference [9] proposes a NILM-based energy management system for appliance-level load monitoring services together with a convolutional neural network based on differential input; the experimental results show that the proposed small network outperforms the other models compared. Reference [10] studies a residential electricity load identification algorithm based on a convolutional neural network (CNN). The algorithm replaces feature extraction from current vectors with feature extraction from images of the current during appliance operation, and a CNN model is built and trained on the operating currents of typical household appliance loads to identify them.
Compared with optimization algorithms and machine learning, neural networks are better at processing high-dimensional data features. Therefore, this paper proposes a non-intrusive load identification model based on a 3D convolutional neural network. Firstly, because the binary V-I trajectory image cannot reflect the power characteristics of the loads, a new feature is produced by mapping the power P to a third dimension, i.e., a third dimension is added to the binary V-I trajectory image to make it a 3D feature. The new 3D feature includes current, voltage and power characteristics and has better observability than the binary V-I trajectory image. Then, using the feature extraction capability of the 3D convolutional neural network, the new feature is flattened into a one-dimensional feature, that is, the three-dimensional spatial feature is transformed into a one-dimensional array, which is convenient for the subsequent load identification. Finally, experimental results on the public data set PLAID show that the new 3D feature has better observability and that the proposed model has higher identification performance than other classification models.
2. Processing of Imbalance Data Sets and Identifiable Feature Selection
2.1. Introduction of Data Sets Used
This paper uses the public data set PLAID to train and evaluate the model [11]. The data set records current and voltage measurements for 11 different types of electrical loads from 55 households in Pittsburgh, Pennsylvania, USA. It contains 1074 samples from 235 individual appliances, sampled at 30 kHz.
Because of the large sampling randomness in PLAID, the samples are imbalanced. The majority classes, i.e., compact fluorescent lamps, hair dryers, and laptops, have as many as 175, 156, and 172 samples respectively, while the minority classes, i.e., refrigerators, heaters and washing machines, have only 38, 35, and 26 samples respectively. For the majority classes, the model has enough features to learn from, but the minority classes lack a comparable number of features, which causes deviations in the identification accuracy of the model. To address this problem, an oversampling technique is used to process the imbalanced data set, taking the class with the most samples as the standard.
Oversampling is commonly used to equalize imbalanced data sets, but in practice there are noise samples near the majority and minority samples. If the minority samples are oversampled in this case, more noise samples are generated, which misleads the classifier and reduces the classification accuracy. The difficulty of oversampling is how to handle noisy data so that the distribution of the new data is similar to that of the original data, thus improving the classification accuracy.
2.2. SVM SMOTE Algorithm
Traditional SMOTE suffers from marginalized data distributions and blurred classification boundaries. Focusing on the boundary region achieves better classification performance, because boundary samples are essential for estimating the best classification boundary. Therefore, during oversampling only the minority samples along the classification boundary need to be used for synthesis, rather than all minority samples. This paper employs the synthetic minority over-sampling technique (SMOTE) based on the support vector machine (SVM), in which the boundary region is approximated by the support vectors obtained after the SVM classifier is trained on the initial training set [12].
SVM SMOTE selects different decision mechanisms to synthesize minority samples according to the density of the majority samples around each minority support vector. Suppose that $k$ of the $K$ nearest neighbors of a minority sample $x_i$ are majority samples. If $k = K$, then $x_i$ is considered a noise sample and another sample needs to be selected. If $0 \le k < K/2$, the extrapolation method is applied to $x_i$. If $K/2 \le k < K$, the interpolation method is applied to $x_i$.
In essence, the SVM SMOTE algorithm oversamples on the basis of the support vectors. To improve the classification accuracy, minority samples are generated near the classification boundary defined by the support vectors according to the decision mechanism and are expanded into areas where the density of majority samples is low. The distribution of the sample data before and after SVM SMOTE processing is listed in Table 1.
In Table 1, label 1 denotes Air Conditioner, label 2 Compact Fluorescent Lamp, label 3 Fan, label 4 Fridge, label 5 Hairdryer, label 6 Heater, label 7 Incandescent Light Bulb, label 8 Laptop, label 9 Microwave, label 10 Vacuum, and label 11 Washing Machine.
After the imbalanced data is processed by the SVM SMOTE method, the minority samples are expanded and the total number of samples increases from 1074 to 1925, with 175 samples for each type of electrical appliance.
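The decision mechanism described in this section can be sketched in plain Python. This is a minimal illustration and not the implementation used in the paper: the function names are hypothetical, the support vectors are assumed to have been found already, and the K/2 threshold separating extrapolation from interpolation is an assumption based on the usual description of SVM-based borderline oversampling.

```python
import math
import random


def majority_neighbor_count(x, samples, labels, minority_label, K):
    """Count majority-class samples among the K nearest neighbours of x."""
    # Distance from x to every other sample, paired with that sample's label.
    others = [(math.dist(x, s), lab)
              for s, lab in zip(samples, labels) if s is not x]
    others.sort(key=lambda pair: pair[0])
    return sum(1 for _, lab in others[:K] if lab != minority_label)


def decide(k, K):
    """Map the majority count k among K neighbours to a synthesis strategy."""
    if k == K:
        return "noise"        # every neighbour is a majority sample
    if k < K / 2:             # assumed threshold, see the lead-in
        return "extrapolate"  # few majority neighbours: expand outwards
    return "interpolate"      # many majority neighbours: stay between minority samples


def synthesize(x, neighbor, strategy, rng):
    """Create one synthetic sample from x and one of its minority neighbours."""
    rho = rng.random()  # random step length in [0, 1)
    if strategy == "extrapolate":
        # Move away from the neighbour, beyond x.
        return [xi + rho * (xi - ni) for xi, ni in zip(x, neighbor)]
    # Move from x towards the neighbour.
    return [xi + rho * (ni - xi) for xi, ni in zip(x, neighbor)]
```

A sample for which `decide` returns `"noise"` would be skipped, and the remaining support vectors would be passed to `synthesize` until every class reaches the size of the largest class.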
2.3. Feature Selection
2.3.1. Binary Voltage-Current Trajectory
The binary V-I trajectory feature maps the initial V-I trajectory to a matrix of fixed size that largely preserves the features of the initial V-I trajectory image. Power is one of the important parameters that reflect load characteristics; however, the V-I trajectory image cannot reflect the power characteristics of the appliances. To improve the accuracy of load identification, we add the power feature P to the binary V-I trajectory feature.
2.3.2. Power Characteristic
Since the binary voltage-current trajectory image is built from high-frequency sampled voltage and current data, the active power of the household loads during steady-state operation has to be obtained from the high-frequency discrete samples [13]. In this paper, the fast Fourier transform (FFT) is employed to extract the power characteristics of the appliances [14].
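In this paper, the number of samples of the voltage and current signals in one steady-state period is $N$. Suppose the kth-order frequency-domain signals of the voltage and current after the FFT are $U(k)$ and $I(k)$, with moduli $|U(k)|$ and $|I(k)|$ and phase angles $\arg U(k)$ and $\arg I(k)$. The effective values and phase angles of the voltage and current at each order are

$$U_k = \frac{\sqrt{2}\,|U(k)|}{N}, \quad I_k = \frac{\sqrt{2}\,|I(k)|}{N}, \quad \alpha_k = \arg U(k),\ \ \beta_k = \arg I(k) \tag{1}$$

where $U_1$ and $I_1$ are the effective values of the fundamental voltage and current respectively, $U_k$ and $I_k$ are the effective values of the kth harmonic voltage and current respectively, $\alpha_k$ and $\beta_k$ are the phase angles of the kth harmonic voltage and current respectively, and $U(k)$ and $I(k)$ are computed from the sampled voltage and current signals $u(n)$ and $i(n)$.

Table 1. Distribution of the samples before and after SVM SMOTE.

The time-domain voltage and current signals can be expressed respectively as

$$u(t) = \sum_{k \ge 1} \sqrt{2}\,U_k \cos(k\omega t + \alpha_k) \tag{2}$$

$$i(t) = \sum_{k \ge 1} \sqrt{2}\,I_k \cos(k\omega t + \beta_k) \tag{3}$$

where $\omega$ is the angular frequency.

The average active power is

$$P = \sum_{k \ge 1} U_k I_k \cos(\alpha_k - \beta_k) \tag{4}$$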
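As a rough check of the FFT-based power extraction, the following standard-library sketch computes the average active power of one steady-state cycle as the sum, over harmonics, of the RMS voltage times the RMS current times the cosine of their phase difference. The function names are hypothetical, a plain discrete Fourier transform stands in for the FFT, and the signals are assumed to contain no DC offset and no content at the Nyquist frequency.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform of a real sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def harmonic_active_power(u, i):
    """Average active power of one cycle from harmonic RMS values and phases."""
    N = len(u)
    U, I = dft(u), dft(i)
    power = 0.0
    # Harmonics strictly below the Nyquist bin; the DC bin (k = 0) is skipped.
    for k in range(1, (N + 1) // 2):
        u_rms = math.sqrt(2) * abs(U[k]) / N   # RMS value of the k-th voltage harmonic
        i_rms = math.sqrt(2) * abs(I[k]) / N   # RMS value of the k-th current harmonic
        phase_diff = cmath.phase(U[k]) - cmath.phase(I[k])
        power += u_rms * i_rms * math.cos(phase_diff)
    return power
```

For one cycle of a sinusoidal voltage and a lagging current, the result should agree with the mean of the instantaneous product `u[n] * i[n]`, which is a convenient sanity check on real measurements.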
2.3.3. New 3D Feature
Inspired by the binary V-I trajectory, a new feature is produced by mapping the power feature P to the third dimension, i.e., we change the initial binary V-I trajectory into a 3D object.
The new 3D feature is obtained as follows.
1) Collect the high-frequency voltage and current waveforms of the device during one steady-state cycle, together with the power characteristics of the device obtained by FFT processing. It is assumed that there are p sampling points in a period and that the 3D feature is a spatial shape of size $p \times p \times p$.
2) The voltage, current and power values in the steady-state period are linearly converted to integers in $[0, p-1]$. The conversion formulas for the voltage, current and power at each sampling point are as follows:

$$V'_m = \left\lfloor \frac{V_m - V_{\min}}{V_{\max} - V_{\min}}\,(p-1) \right\rfloor, \quad I'_m = \left\lfloor \frac{I_m - I_{\min}}{I_{\max} - I_{\min}}\,(p-1) \right\rfloor, \quad P'_m = \left\lfloor \frac{P_m - P_{\min}}{P_{\max} - P_{\min}}\,(p-1) \right\rfloor \tag{5}$$

where $m = 1, 2, \ldots, p$ and p is the one-dimensional size of the spatial shape. $I_m$, $V_m$ and $P_m$ are the current, voltage and power values of the mth sampling point of the original data respectively. $I'_m$, $V'_m$ and $P'_m$ are the converted current, voltage and power values of the mth sampling point respectively. $I_{\min}$, $V_{\min}$ and $P_{\min}$ are the minimum values of current, voltage and power in a steady-state cycle respectively. $I_{\max}$, $V_{\max}$ and $P_{\max}$ are the maximum values of current, voltage and power in a steady-state cycle respectively. $\lfloor \cdot \rfloor$ is the rounding-down symbol.
3) Produce a spatial shape of size $p \times p \times p$ with all elements equal to 0, go through the sampling points of the steady-state cycle from the first to the last, and set the element at position $(I'_m, V'_m, P'_m)$ to 1 to obtain the new feature of size $p \times p \times p$. Figure 1 illustrates how the new feature is generated.
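The construction of the 3D feature can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, each quantity is scaled to an integer index in [0, p - 1] so that it is a valid array position, the axes are ordered as (current, voltage, power), and the per-sample power is supplied by the caller.

```python
import math


def scale_to_index(x, lo, hi, p):
    """Linearly map x in [lo, hi] to an integer index in [0, p - 1]."""
    if hi == lo:
        return 0  # constant signal: every sample falls in the first cell
    return math.floor((x - lo) / (hi - lo) * (p - 1))


def build_3d_feature(current, voltage, power, p):
    """Return a p x p x p nested list with 1 at every visited (I, V, P) cell."""
    grid = [[[0] * p for _ in range(p)] for _ in range(p)]
    i_lo, i_hi = min(current), max(current)
    v_lo, v_hi = min(voltage), max(voltage)
    p_lo, p_hi = min(power), max(power)
    # Visit the sampling points of the cycle from first to last.
    for i, v, w in zip(current, voltage, power):
        a = scale_to_index(i, i_lo, i_hi, p)
        b = scale_to_index(v, v_lo, v_hi, p)
        c = scale_to_index(w, p_lo, p_hi, p)
        grid[a][b][c] = 1
    return grid
```

Several sampling points can fall into the same cell, so the number of ones in the result is at most the number of sampling points.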
3. Three-Dimensional Convolutional Neural Network (3D-CNN)
3.1. The Overview of 3D-CNN
Figure 1. The diagram for generating the new features.
The huge advancements achieved by applying CNNs to 3D objects [15] [16] and other fields inspire us to work in this direction. A CNN can directly encode the spatial structure of the input, that is, the planes and angles of a 3D object associated with different directions and positions. In addition, a CNN can stack multiple layers to build hierarchies of complex features about 3D regions and ultimately provide a global label for the 3D input. Moreover, well-trained CNN models can easily be deployed to hardware platforms, where inference requires only feed-forward propagation, which is very efficient for classification.
3.2. 3D CNN Configuration
To train the 3D CNN on the newly produced feature, we designed the network architecture shown in Figure 2. The input to the network is the $28 \times 28 \times 28$ 3D feature described in Section 2.3.3. The network has 2 convolutional layers, 2 max pooling layers, 2 fully connected layers and a softmax layer that produces the output, as well as a dropout layer and a flatten layer. The convolutional layers use a stride of 1 with "same" padding, and the max pooling layers use a stride of 2. In this configuration, the first convolutional layer produces 12 feature blocks of size $28^3$ and the second convolutional layer produces 24 feature blocks of size $14^3$. Likewise, the first pooling layer yields 12 feature blocks of size $14^3$ and the second pooling layer yields 24 feature blocks of size $7^3$. The 2 fully connected layers then reduce the features to 11 units, and the final output gives the probability of each class for the given input. Adaptive moment estimation (Adam) [17] is used as the optimization algorithm for training the 3D-CNN model, with a learning rate of 0.0001.
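The feature block sizes quoted above follow from simple shape arithmetic, which the short sketch below reproduces. It is only a bookkeeping aid, not a trainable network: the function names are hypothetical, and the pooling window of 2 with no padding is an assumption chosen because it halves each axis at a stride of 2.

```python
def conv_same(size, stride=1):
    """Output length along one axis of a 'same'-padded convolution."""
    return -(-size // stride)  # ceil(size / stride)


def pool_valid(size, window=2, stride=2):
    """Output length along one axis of an unpadded max pooling layer."""
    return (size - window) // stride + 1


def trace_shapes(input_size=28, channels=(12, 24)):
    """List (layer name, feature blocks, edge length) for each conv/pool pair."""
    shapes, size = [], input_size
    for index, ch in enumerate(channels, start=1):
        size = conv_same(size)
        shapes.append((f"conv{index}", ch, size))
        size = pool_valid(size)
        shapes.append((f"pool{index}", ch, size))
    return shapes


def flattened_length(shapes):
    """Number of values handed to the first fully connected layer."""
    _, ch, size = shapes[-1]
    return ch * size ** 3
```

With the default arguments the trace gives edge lengths 28, 14, 14 and 7, so the flatten layer would pass 24 × 7³ = 8232 values to the fully connected layers.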
4. Experimental Verification and Result Analysis
4.1. Evaluation Metrics
In this paper, accuracy and the confusion matrix are used to evaluate the identification results. Accuracy is the proportion of correctly classified samples among all samples in the test set:

$$\mathrm{Accuracy} = \frac{N_{\mathrm{correct}}}{N_{\mathrm{total}}} \tag{6}$$

where $N_{\mathrm{total}}$ is the total number of samples in the test set and $N_{\mathrm{correct}}$ is the number of correctly identified samples.
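Both metrics are straightforward to compute from the true and predicted labels. The sketch below is a minimal standard-library version with hypothetical function names; it assumes integer class labels running from 1 to the number of classes, as in the label list of Table 1.

```python
def accuracy(y_true, y_pred):
    """Share of test samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels (labels 1..n_classes)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t - 1][p - 1] += 1
    return matrix
```

The main diagonal of the matrix holds the correctly identified samples of each class, so its sum divided by the total number of samples equals the accuracy.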
4.2. Analysis of Experimental Results
The confusion matrix is widely used because it intuitively reflects the classification performance of a model: in the figure, the darker the main diagonal, the higher the identification accuracy. The confusion matrices obtained with different features are shown in Figure 3. The accuracy of load identification is 0.762 with the power feature, 0.834 with the binary V-I trajectory feature, and 0.949 with the 3D feature. The figure shows that the accuracy with the 3D feature is higher than with either of the two single features, because the 3D feature combines the binary V-I trajectory feature and the power feature and is therefore more identifiable. With the 3D feature, the model also identifies all samples of classes 2, 6, 10, and 11 correctly.
To further explore the load identification ability of each classification algorithm and the effectiveness of the 3D feature, experiments are conducted with different classification models and different load features, and the results are shown in Table 2. Besides machine learning models such as extreme gradient boosting (XGBoost), SVM and K-nearest neighbor (KNN), a long short-term memory network (LSTM) is used for comparison. In addition to the above three features, the current and voltage features are also included. Because the machine learning models cannot process image and spatial features directly, the high-dimensional features are first flattened into a one-dimensional array by the convolutional neural network. As can be seen from Table 2, all classification models except LSTM show high recognition accuracy for the 3D feature, because LSTM generalizes well on one-dimensional sequence features but is not good at processing image and spatial features. Moreover, CNN has superior observability for binary V-I trajectory images and 3D features; for the latter in particular, the identification accuracy reaches the highest value of 0.949.
Figure 3. Confusion matrix when using different features.
Table 2. Identification performance of different classification models for different features.
5. Conclusions
The development of non-intrusive load monitoring and the wide application of deep learning have promoted improvements in load identification models and algorithms. This paper proposes a non-intrusive load identification model based on a 3D convolutional neural network. The conclusions are as follows.
1) Imbalanced samples cause deviations in the identification accuracy of the model. Therefore, an oversampling technique based on the SVM SMOTE algorithm is applied to the imbalanced samples, taking the class with the most samples as the standard. The minority samples are expanded and the number of samples increases from 1074 to 1925, which improves the load identification performance of the model by balancing the minority and majority classes.
2) The binary V-I trajectory feature maps the initial V-I trajectory to a matrix of fixed size that largely preserves the features of the initial V-I trajectory image. This not only reduces the data dimension but also discards some redundant data, which makes the differences between the characteristics of different appliances easier to see. Power is one of the important parameters that reflect load characteristics, but the V-I trajectory image cannot reflect the power characteristics of the appliances. To improve the accuracy of load identification, we add the power feature P to the binary V-I trajectory feature as a third dimension, turning the binary V-I trajectory image into a 3D feature. To examine whether the 3D feature makes the load identification models generalize better, comparison experiments with other one-dimensional features and the binary V-I trajectory image were performed, and the results show that the new feature has better observability.
3) CNN has good generalization ability for high-dimensional features, because it extracts effective characteristics from them through its convolution and pooling layers and finally flattens them into one-dimensional features through the flatten layer, which benefits the subsequent load identification. The results show that CNN achieves better accuracy on high-dimensional features than the other classification models.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61572416), Natural Science Foundation of Hunan Province (2020JJ6009), Open subject of The State Key Laboratory of Heavy Duty AC Drive Electric Locomotive Systems Integration and Open subject of The State Key Laboratory of Disaster Prevention and Mitigation for Power Grid Transmission and Transformation Equipment.