Randomized weights neural networks achieve fast learning speed and good generalization performance with a single-hidden-layer structure. The input weights of the hidden layer are generated randomly, the hidden layer's outputs are calculated by passing the inputs through an activation function, and the output weights are computed with the pseudo inverse. Mutual information quantitatively measures the mutual dependence of two variables based on probability theory. In this paper, the hidden layer outputs that are closely related to the predicted variable are selected with a simple mutual information based feature selection method: the hidden nodes with high mutual information values are retained as a new, smaller hidden layer. The new hidden layer's output weights are then learned with the pseudo inverse method. The proposed method is compared with the original randomized algorithm on the concrete compressive strength benchmark dataset.

Machine learning (ML)-based data analysis has been a hot topic in many disciplines. The most widely used prediction model construction methods are back propagation neural networks (BPNN) and support vector machines (SVM) [

Mutual information (MI) quantitatively measures the mutual dependence of two variables based on probability theory and information theory. Thus, it has been used widely in feature selection. MI is more comprehensive than other common feature selection methods for selecting optimal input variables [

Motivated by the above problems, a modified randomized weights neural network based on MI is proposed in this paper. First, the input variables and the randomly chosen input weights are fed into an activation function to produce the outputs of the hidden layer. Then, MI values between these hidden layer outputs and the predicted variable are calculated, and the outputs with MI values higher than a pre-set threshold are selected. Finally, the pseudo inverse method is used to compute the weights between the selected hidden layer outputs and the predicted variable. Thus, the randomization of the input weights is controlled to some degree. A simulation based on the concrete compressive strength benchmark dataset is used to validate the proposed method.

Suppose that SLFNs with $L$ hidden nodes approximate $N$ training samples $\{(\mathbf{x}_j, t_j)\}_{j=1}^{N}$ with zero error, i.e.,

$$\sum_{i=1}^{L} \beta_i g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = t_j, \quad j = 1, \ldots, N, \qquad (1)$$

where $\mathbf{w}_i$ and $b_i$ are the randomly generated input weight vector and bias of the $i$-th hidden node, $\beta_i$ is its output weight, and $g(\cdot)$ is the activation function.

Then, Equation (1) can be rewritten as:

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T}, \qquad (2)$$

where $\mathbf{H} \in \mathbb{R}^{N \times L}$ is the hidden layer output matrix with entries $H_{ji} = g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i)$, $\boldsymbol{\beta} = [\beta_1, \ldots, \beta_L]^{T}$, and $\mathbf{T} = [t_1, \ldots, t_N]^{T}$.

Theoretically, SLFNs are able to approximate any continuous target function with enough hidden layer nodes even when the input weights are chosen randomly. Given a training set $\{(\mathbf{x}_j, t_j)\}_{j=1}^{N}$, the output weights $\boldsymbol{\beta}$ are obtained by minimizing $\|\mathbf{H}\boldsymbol{\beta} - \mathbf{T}\|$.

The solution can be analytically determined by the expression below:

$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T}, \qquad (3)$$

where $\mathbf{H}^{\dagger}$ is the Moore-Penrose generalized inverse of $\mathbf{H}$.

The reason for using the Moore-Penrose generalized inverse is that the matrix $\mathbf{H}$ is in general not square (the number of training samples rarely equals the number of hidden nodes), so an ordinary inverse does not exist.

In particular, when $\mathbf{H}^{T}\mathbf{H}$ is nonsingular, $\mathbf{H}^{\dagger} = (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}$; and when $\mathbf{H}\mathbf{H}^{T}$ is nonsingular, $\mathbf{H}^{\dagger} = \mathbf{H}^{T}(\mathbf{H}\mathbf{H}^{T})^{-1}$.
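As a concrete illustration, the randomized training procedure above can be sketched in a few lines of NumPy. This is a toy example, not the paper's experiment: the sigmoid activation, the target function, and the sizes are our own assumptions, while the symbols (W, b, H, beta) follow the text.

```python
import numpy as np

rng = np.random.default_rng(42)
N, d, L = 200, 3, 30            # samples, input dimension, hidden nodes

X = rng.uniform(-1, 1, (N, d))
t = np.sin(X).sum(axis=1)       # made-up smooth target

W = rng.uniform(-1, 1, (d, L))  # random input weights (never trained)
b = rng.uniform(-1, 1, L)       # random biases

# Hidden layer output matrix H, Eq. (2): H[j, i] = g(w_i . x_j + b_i)
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))

# Output weights via the Moore-Penrose generalized inverse, Eq. (3)
beta = np.linalg.pinv(H) @ t

rmse = np.sqrt(np.mean((H @ beta - t) ** 2))
print(rmse)  # small training error on this smooth toy target
```

Note that only `beta` is learned; the randomness of `W` and `b` is what makes the method fast, since training reduces to one least-squares solve.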

Information entropy quantifies the uncertainty of a random variable and scales the amount of information shared by variables. Thus, it has been widely used in many fields. The entropy can be represented as:

$$H(X) = -\sum_{x \in X} p(x) \log p(x),$$

where $p(x)$ is the probability distribution of the random variable $X$.
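The entropy formula can be sketched directly (the helper name `entropy` is ours, not the paper's; entropy is computed in nats here):

```python
import numpy as np

def entropy(p):
    """Shannon entropy -sum p(x) log p(x) of a discrete distribution, in nats."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # by convention 0 * log 0 = 0
    return float(-np.sum(p * np.log(p)))

print(entropy([0.5, 0.5]))  # fair coin: log(2) ≈ 0.6931 nats
print(entropy([1.0]))       # deterministic outcome: 0, no uncertainty
```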

Mutual information (MI) can measure the mutual dependence of two variables, and is defined as:

$$I(X; Y) = \sum_{x \in X}\sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)p(y)},$$

where $p(x, y)$ is the joint probability distribution of $X$ and $Y$, and $p(x)$ and $p(y)$ are the marginal distributions.

For continuous random variables, the summations are replaced by integrals:

$$I(X; Y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)p(y)}\, dx\, dy.$$
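In practice these densities are unknown and MI must be estimated from samples. A common simple estimator is histogram binning; the sketch below is our own illustrative helper (the bin count is an assumption), not a method from the paper:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based estimate of I(X; Y) in nats for two 1-D samples."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                  # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)    # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)    # marginal p(y)
    mask = pxy > 0                         # skip empty cells (0 log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
print(mutual_information(x, x))                       # large: x determines itself
print(mutual_information(x, rng.normal(size=5000)))   # near 0: independent
```

The estimate for independent samples is slightly above zero due to finite-sample bias, which is one reason choosing the selection threshold is nontrivial.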

The mutual information feature selection (MIFS) algorithm can be described as follows: calculate the MI value between each input feature and the output variable, select the input features with the largest MI values while penalizing candidate features that have large MI values with the already selected features, and obtain the best input feature subset using a greedy search [

A simpler method based on MI is: 1) calculate the MI value between each input feature and the output variable; 2) set a threshold value of the MI based on prior knowledge; 3) select the features whose MI values are higher than the threshold. How to select the optimal threshold value remains an open question.
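The three steps above can be sketched as follows; the histogram MI estimator, the helper names, and the threshold value are illustrative assumptions:

```python
import numpy as np

def hist_mi(x, y, bins=8):
    """Crude histogram estimate of I(x; y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    m = pxy > 0
    return float(np.sum(pxy[m] * np.log(pxy[m] / (px @ py)[m])))

def select_by_mi(F, y, threshold):
    """1) score each column of F by MI with y; 2) compare against the
    pre-set threshold; 3) return indices of the columns that pass."""
    scores = np.array([hist_mi(F[:, j], y) for j in range(F.shape[1])])
    return np.flatnonzero(scores > threshold), scores

rng = np.random.default_rng(0)
y = rng.normal(size=3000)
F = np.column_stack([y + 0.1 * rng.normal(size=3000),  # informative feature
                     rng.normal(size=3000)])           # pure-noise feature
kept, _ = select_by_mi(F, y, threshold=0.3)
print(kept)  # [0] — only the informative column survives
```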

The proposed MI-based modified randomized weights neural network model is shown in

As shown in

Given a pre-set threshold value $\theta$ of the MI, the hidden layer outputs whose MI values with the predicted variable exceed $\theta$ are selected.

We denote these selected hidden layer outputs as the matrix $\mathbf{H}_{\mathrm{sel}} \in \mathbb{R}^{N \times L_{\mathrm{sel}}}$,

where $L_{\mathrm{sel}} \le L$ is the number of selected hidden nodes.

Therefore, the output weights of the reduced network are computed as $\boldsymbol{\beta}_{\mathrm{sel}} = \mathbf{H}_{\mathrm{sel}}^{\dagger}\mathbf{T}$.
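Putting the pieces together, the proposed selection scheme can be sketched end-to-end as follows. The activation function, the histogram MI estimator, the threshold value, and the toy target are all illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def hist_mi(x, y, bins=8):
    """Crude histogram estimate of I(x; y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    m = pxy > 0
    return float(np.sum(pxy[m] * np.log(pxy[m] / (px @ py)[m])))

rng = np.random.default_rng(7)
N, d, L, theta = 400, 4, 100, 0.1   # theta: assumed pre-set MI threshold

X = rng.uniform(-1, 1, (N, d))
t = X[:, 0] ** 2 + np.sin(X[:, 1])  # made-up target

# Step 1: random input weights/biases and the hidden layer output matrix H.
W, b = rng.uniform(-1, 1, (d, L)), rng.uniform(-1, 1, L)
H = np.tanh(X @ W + b)

# Step 2: keep only hidden nodes whose output has MI > theta with t.
mi = np.array([hist_mi(H[:, i], t) for i in range(L)])
sel = np.flatnonzero(mi > theta)
H_sel = H[:, sel]

# Step 3: output weights of the reduced network via the pseudo inverse.
beta_sel = np.linalg.pinv(H_sel) @ t
rmse = np.sqrt(np.mean((H_sel @ beta_sel - t) ** 2))
print(len(sel), rmse)  # size of the reduced hidden layer and its training RMSE
```

Raising `theta` shrinks the hidden layer further at the cost of accuracy, which is the trade-off the experiments below explore.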

Considering the problem of selecting the learning parameters, the MI-based randomized weights algorithm can be represented as the following optimization problem:

Some intelligent optimization methods can be used to address this problem.

Concrete compressive strength data were obtained from the experimental studies of the group led by I. C. Yeh at Chung Hua University in Taiwan [

Given that L = 300, the MI values between the hidden layer's outputs and the predicted variable are shown in

The original randomized weights algorithm and the MI-based modified version are compared for different numbers of hidden nodes and different MI threshold values. To average out the randomization of the initial weights, the mean root mean square error (MRMSE) over 100 repeated runs is used to estimate each model's prediction accuracy. Statistical results are shown in

For the MI-based modified method, each cell gives (MRMSE, mean number of selected hidden nodes) at the pre-set MI threshold value θ:

| | Original method (MRMSE) | θ = 0.1 | θ = 0.2 | θ = 0.3 | θ = 0.4 | θ = 0.5 | θ = 0.6 | θ = 0.7 | |
|---|---|---|---|---|---|---|---|---|---|
| L = 10 | 12.37 | (12.41, 10) | (12.27, 9.8) | (12.57, 9.1) | (13.97, 6.6) | (14.76, 4.9) | (16.42, 3.1) | -- | 0.2703 |
| L = 20 | 10.36 | (10.25, 20) | (10.31, 19.2) | (10.73, 15.4) | (11.73, 9.85) | (13.60, 6.03) | (15.04, 3.75) | -- | 0.3319 |
| L = 30 | 9.713 | (9.675, 29.9) | (9.702, 28.1) | (10.16, 21.3) | (10.95, 13.3) | (12.33, 8.07) | (14.77, 4.69) | -- | 0.3572 |
| L = 40 | 9.518 | (9.444, 39.9) | (9.621, 37.2) | (9.787, 27.2) | (10.61, 16.2) | (11.78, 9.11) | (13.89, 5.09) | -- | 0.3707 |
| L = 50 | 9.755 | (9.799, 49.9) | (9.638, 46.2) | (9.591, 33.2) | (10.21, 19.3) | (11.21, 11.6) | (13.19, 6.37) | -- | 0.3755 |
| L = 60 | 10.17 | (10.12, 59.8) | (9.772, 53.9) | (9.486, 36.2) | (10.16, 19.7) | (11.26, 10.8) | (13.02, 6.36) | -- | 0.4042 |
| L = 70 | 10.38 | (10.71, 69.8) | (10.14, 62.9) | (9.453, 41.8) | (9.973, 23.1) | (11.02, 12.83) | (12.14, 7.19) | -- | 0.4050 |
| L = 80 | 11.18 | (11.33, 79.8) | (10.62, 70.5) | (9.625, 44.8) | (9.738, 24.1) | (10.79, 13.0) | (12.68, 7.15) | -- | 0.4185 |
| L = 90 | 12.48 | (12.22, 89.7) | (11.24, 80.2) | (9.634, 53.3) | (9.736, 27.93) | (10.56, 15.28) | (11.50, 8.96) | (13.72, 4.82) | 0.4113 |
| L = 100 | 13.30 | (13.06, 100) | (12.88, 99.0) | (12.20, 89.6) | (10.47, 70.5) | (9.885, 48.5) | (9.614, 32.5) | (10.00, 22.6) | 0.4186 |
| L = 200 | 535.1 | (488.6, 199) | (47.26, 1.65) | (12.38, 94.0) | (9.678, 46.92) | (10.01, 23.87) | (10.95, 12.51) | (12.46, 5.99) | 0.4612 |
| L = 300 | 167.5 | (166.4, 2.98) | (253.1, 240) | (20.24, 128) | (10.69, 15.53) | (9.646, 31.5) | (10.51, 15.4) | (12.05, 7.33) | 0.4798 |
| L = 400 | 132.2 | (135.3, 397) | (156.7, 314) | (58.05, 167) | (11.66, 81.5) | (9.874, 41.2) | (10.24, 19.9) | (11.26, 9.53) | 0.4854 |
| L = 500 | 121.4 | (118.1, 496) | (379.7, 382) | (512.9, 197) | (12.56, 93.6) | (9.957, 45.1) | (10.16, 21.6) | (10.95, 11.0) | 0.5070 |

This paper proposes a new mutual information based randomized weights neural network. The input weights of the hidden layer are produced randomly, as in the normal randomized algorithm, but not all of the hidden layer's outputs are used to compute the output weights. A simple MI-based feature selection method selects the hidden layer outputs, and only these selected outputs are used to compute the output weights with the pseudo inverse method. The concrete compressive strength benchmark dataset is used to validate the method. Future research will address theoretical analysis and validate this idea on more benchmark datasets.

The research was sponsored by the China Postdoctoral Science Foundation (2013M532118, 2015T81082), the National Natural Science Foundation of China (61573364, 61273177), the State Key Laboratory of Synthetical Automation for Process Industries, and the China National 863 Projects (2015AA043802).

Jian Tang, Zhiwei Wu, Meiying Jia, Zhuo Liu (2015) Mutual Information-Based Modified Randomized Weights Neural Networks. Journal of Computer and Communications, 3, 191-197. doi: 10.4236/jcc.2015.311030