Recently, regression artificial neural networks have been used to model various systems that have high dimensionality with nonlinear relations. The system under study must have a sufficient dataset available to train the neural network. The aim of this work is to apply and experiment with the effects of various options on a feed-forward artificial neural network (ANN) used to obtain a regression model that predicts the electrical output power (EP) of a combined cycle power plant based on 4 inputs. The dataset is obtained from an open online source. The work shows and explains the stochastic behavior of the regression neural network and examines the effect of the number of neurons in the hidden layer. It also shows higher performance for larger training dataset sizes; on the other hand, it shows differing effects of a larger number of input variables. In addition, two different training functions are applied and compared. Lastly, a simple statistical study of the error between real values and values estimated by the ANN is conducted, which shows the reliability of the model. This paper provides a quick reference to the effects of the main parameters of regression neural networks.

Electricity has been one of the main essential resources for human activities. Power plants have been established to provide human communities with the needed amount of electricity. The power provided by power plants fluctuates through the year for many reasons, including the environmental conditions. The accurate analysis of thermodynamic power plants using mathematical models requires a large number of parameters and assumptions in order to represent the actual system's unpredictability [

Artificial neural networks (ANNs) were originally proposed in the mid-20th century as a computational model of the human brain. Their use was limited by the computational power available at the time and by some unsolved theoretical problems. However, they have been increasingly studied and applied with the recent availability of greater computational power and of large datasets [

Researchers have used ANNs to model many various engineering systems [

One kind of power plant is the combined cycle power plant (CCPP), which is composed of gas turbines (GT), steam turbines (ST) and heat recovery steam generators (HRSG) (

In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle [

by routing the waste heat from the gas turbine to the nearby steam turbine, which generates extra power [

1) Fuel burns in the gas turbine, spinning the turbine blades and driving an electricity generator.

2) Heat Recovery Steam Generator (HRSG) captures exhaust heat from the gas turbine. The HRSG creates steam from the gas turbine exhaust heat and delivers it to the steam turbine.

3) Steam turbine uses the steam delivered by the heat recovery system to generate additional electricity by driving an electricity generator.

Gas turbine load is sensitive to the ambient conditions, mainly ambient temperature (AT), atmospheric pressure (AP), and relative humidity (RH). The steam turbine load, on the other hand, is sensitive to the exhaust steam pressure (or vacuum, V) [

Combined cycle power plants (CCPPs) have a higher fuel conversion efficiency compared to the conventional power plants, i.e. consuming less fuel to produce the same amount of electricity, which results in lower power price and less emission to the environment [

Artificial neural networks (ANNs) or connectionist systems are a computational model used in computer science and other research disciplines, which is based on a large collection of simple neural units (artificial neurons), loosely analogous to the observed behavior of a biological brain's axons. Each neural unit is connected with many others, and links can enhance or inhibit the activation state of adjoining neural units. Each individual neural unit computes its output using a summation function. There may be a threshold or limiting function on each connection and on the unit itself, such that the signal must surpass the limit before propagating to other neurons. These systems are self-learning and trained, rather than explicitly programmed, and excel in areas where the solution or feature detection is difficult to express in a traditional computer program [
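The summation-and-activation behavior of a single neural unit described above can be sketched as follows (an illustrative toy example, not the toolbox's implementation; the weights, bias and input values are arbitrary):

```python
import math

def neuron_output(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a bias,
    passed through a squashing (activation) function."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(s)  # tanh squashes the signal into (-1, 1)

# Example: a neuron with two inputs and arbitrary weights
y = neuron_output([0.5, -1.0], [0.8, 0.2], bias=0.1)
```

The activation function here is tanh only because that is the transfer function used later in this paper's hidden layer; other squashing functions work the same way.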

A major advantage of ANNs is that they are non-parametric models, whereas most statistical methods are parametric and require a stronger statistical background. In addition, ANNs easily handle highly nonlinear modelling (their main advantage). However, an ANN is a black-box learning approach: it cannot interpret the relationship between input and output and cannot deal with uncertainties [

Neural networks are good at fitting functions. In fact, there is proof that a fairly simple neural network can fit any practical function [

Neural Network Toolbox™ provides algorithms, functions, and apps to create, train, visualize, and simulate neural networks. It includes algorithms and tools for regression, pattern recognition, classification, clustering, deep learning, time series and dynamic systems, and many others which cover the usage of ANN models [

1) Collect data

2) Create the network

3) Configure the network

4) Initialize the weights and biases

5) Train the network

6) Validate the network

7) Use the network

Some of these steps can be done automatically using default values and settings in the toolbox; however, the user can set every detail manually. The Neural Network Toolbox offers four levels of design, i.e. four different levels at which the Neural Network Toolbox™ software can be used.

The first level is represented by the GUIs. These provide a quick way to access the power of the toolbox for many problems of function fitting, pattern recognition, clustering and time series analysis. In addition, a MATLAB code script (.m file) can be generated with the desired level of detail, reproducing the settings used in the network study.

The second level of toolbox use is through basic command-line operations. The command-line functions use simple argument lists with intelligent default settings for function parameters. (Users can override all of the default settings, for increased functionality.)

A third level of toolbox use is customization of the toolbox. This advanced capability allows the user to create custom neural networks, while still having access to the full functionality of the toolbox.

The fourth level of toolbox usage is the ability to modify any of the code files contained in the toolbox. Every computational component is written in MATLAB® code and is fully accessible.

Regression (Fit Data) in the Neural Network Toolbox in Matlab can be accessed using a GUI or command-line functions [

1) Neural Network tool (nntool), the general neural network tool, offers full control of the settings. Using this GUI, the user can design any type of neural network, not only the regression ANN.

2) Neural Fitting tool (nftool), which leads the user through solving a data fitting problem with a two-layer feed-forward network trained with Levenberg-Marquardt or scaled conjugate gradient back-propagation. It has a limited set of options. The user can select data from the MATLAB® workspace or use one of the example datasets. After training the network, its performance is evaluated using mean squared error and regression analysis. Further, the results can be analyzed using visualization tools such as a regression fit or a histogram of the errors. The user can then evaluate the performance of the network on a test set.

The aim of this work is to apply and experiment with the effects of various options on a feed-forward artificial neural network (ANN) used to obtain a regression model that predicts the electrical output power (EP) of a combined cycle power plant based on 4 inputs. More specifically, this work uses the MATLAB neural networks toolbox to study the stochastic behavior of the regression neural network, the effect of the number of neurons in the hidden layer, the effect of the data subset size used for training, the effect of the number of input variables, the results of different training functions, data preprocessing, and a statistical study of the error.

In this study, the MATLAB neural networks toolbox is used; the database is obtained freely from [

1) Test: refers to the test on the whole dataset (9568 observations), which gives more realistic results.

2) Performance: mean squared error (MSE).

The main scheme of this study is conducting comparisons between the resulting networks using various combinations of options. The comparison is always between the performances of the networks on the whole dataset (the Test dataset). The following subsections describe the data, show how the training sub-dataset is obtained, illustrate which features (inputs) are studied, discuss data normalization and show the selection of the neural network structure size.
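The performance metric used throughout, MSE, is simple enough to state as a short sketch (the target and output values below are made up for illustration):

```python
def mse(targets, outputs):
    """Mean squared error: the average of the squared differences
    between target values and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

# Toy example with made-up target and predicted values (in MW)
error = mse([450.0, 460.0], [451.0, 458.0])  # (1 + 4) / 2 = 2.5
```

A lower MSE on the full 9568-point Test set is the criterion used to compare networks in the experiments that follow.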

Dataset is obtained from online site [

Feature | Symbol | Type | minimum | maximum | unit |
---|---|---|---|---|---|

Ambient temperature | AT | Input | 1.81 | 37.11 | ˚C |

Ambient Pressure | AP | Input | 992.89 | 1033.30 | mbar |

Relative Humidity | RH | Input | 25.56 | 100.16 | % |

Exhaust Vacuum | V | Input | 25.36 | 81.56 | cm Hg |

Net hourly electrical energy output | EP | Output | 420.26 | 495.76 | MW |

Since the main goal of this work is to apply and test regression with a neural network, the whole dataset is not needed. Only a smaller subset is systematically picked from the dataset to train, validate and initially test the network. The Matlab Neural Network toolbox divides the dataset into train, validation and test subsets, with default percentages of 70%, 15% and 15%. Since we have a huge dataset on which the final test can be run, we reduce the test subset to 0%, making the training subset 75% and the validation subset 25%. Finally, the test is performed with all the data points from the original dataset and compared across different subset sizes.
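The systematic subset selection and the 75/25/0 split described above can be sketched as follows; the sizes and proportions follow the text, while the even-step sampling scheme is an assumption for illustration:

```python
def pick_subset(dataset, subset_size):
    """Systematically pick subset_size points spread evenly over the dataset."""
    step = len(dataset) // subset_size
    return dataset[::step][:subset_size]

def split_train_val(subset, train_frac=0.75):
    """Divide into train/validation subsets; the test fraction is 0
    because the full original dataset serves as the final test set."""
    n_train = int(len(subset) * train_frac)
    return subset[:n_train], subset[n_train:]

full = list(range(9568))           # stand-in for the 9568 observations
sub = pick_subset(full, 50)        # e.g. a 50-point working subset
train, val = split_train_val(sub)  # 75% train, 25% validation
```

In the toolbox itself this division is controlled by the network's data-division settings rather than done by hand; the sketch only mirrors the proportions used in this study.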

One way to improve the training of networks is to linearly normalize inputs and outputs to a certain range. The standard normalization maps a feature to the range (−1, 1), which is the default in the Matlab Neural Network toolbox; both inputs and outputs are normalized by default. Other ranges may be used, e.g. (0.01, 0.99). Another normalization practice is to map a feature to a range with a specified mean and variance; typically the mean would be 0 and the standard deviation would be 1. Since the Matlab Neural Network toolbox performs this step for us, we do not have to worry about it. However, mapping to the range (0.01, 0.99) is also applied and the results are compared with the use of non-normalized data.
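The two normalization styles mentioned, min-max mapping to a range (what the toolbox's mapminmax does) and zero-mean/unit-deviation mapping (mapstd-style), can be sketched as follows; the temperature values are just sample points from the dataset's AT range:

```python
import statistics

def map_minmax(xs, lo=-1.0, hi=1.0):
    """Linearly map values to [lo, hi]; (-1, 1) is the toolbox default."""
    xmin, xmax = min(xs), max(xs)
    return [lo + (hi - lo) * (x - xmin) / (xmax - xmin) for x in xs]

def map_std(xs):
    """Map values to zero mean and unit standard deviation (mapstd-style)."""
    mu = statistics.mean(xs)
    sigma = statistics.stdev(xs)
    return [(x - mu) / sigma for x in xs]

temps = [1.81, 20.0, 37.11]   # sample ambient temperatures spanning the AT range
scaled = map_minmax(temps)    # endpoints map exactly to -1 and 1
```

Note that when the target (EP) is normalized, the reported MSE is on the normalized scale, which is why a pre-normalized dataset can show a misleadingly tiny performance number.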

As shown in

The strength of the linear relations between the variables is shown by the pairwise R values below. It can be seen that AT and V are strongly linearly related to each other and to the output PE, while AP and RH have weak linear relations to all other variables and to the output.

Although it is obvious that the governing variables are AT and V, the effect of the absence and presence of each variable is studied.
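The pairwise R values reported in this study are Pearson correlation coefficients; a minimal sketch of their computation (the sample values below are made up, chosen to give an exactly linear negative relation):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfectly negative linear relation gives R = -1 (and R^2 = 1),
# the extreme case of the strong negative AT-PE relation in the data
r = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
```

Squaring R gives the R^{2} values tabulated alongside, i.e. the fraction of output variance linearly explained by each input.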

We use the tan-sigmoid transfer function in the hidden layer and a linear output layer. This is the standard network for function approximation [
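A forward pass through this standard structure (tan-sigmoid hidden layer, linear output layer) can be sketched as follows; the weights and biases here are arbitrary placeholders, not trained values, and the hidden layer is shrunk to 2 neurons for brevity:

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Two-layer feed-forward net: tansig hidden layer, linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Linear output layer: a weighted sum of hidden activations plus bias
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# 4 inputs (AT, V, AP, RH after normalization), 2 hidden neurons
x = [0.1, -0.3, 0.5, 0.2]
w_hidden = [[0.2, -0.1, 0.4, 0.3], [-0.5, 0.2, 0.1, -0.2]]
b_hidden = [0.0, 0.1]
y = forward(x, w_hidden, b_hidden, w_out=[0.7, -0.4], b_out=0.05)
```

Training (by trainlm or trainbr) adjusts exactly these weights and biases to minimize the MSE between y and the target EP.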

Here we examine two algorithms: the Levenberg-Marquardt algorithm (trainlm, the default) and the Bayesian regularization algorithm (trainbr).

Input | R | R^{2} | description |
---|---|---|---|

AT | −0.9481 | 0.8989 | Negative & strong |

V | −0.8698 | 0.7565 | Negative & strong |

AP | 0.5184 | 0.2688 | Positive & weak |

RH | 0.3898 | 0.1519 | Positive & weak |

R | AT | V | AP | RH | PE |
---|---|---|---|---|---|

AT | 1 | 0.84 | −0.51 | −0.54 | −0.95 |

V | 0.84 | 1 | −0.41 | −0.31 | −0.87 |

AP | −0.51 | −0.41 | 1 | 0.1 | 0.52 |

RH | −0.54 | −0.31 | 0.1 | 1 | 0.39 |

PE | −0.95 | −0.87 | 0.52 | 0.39 | 1 |

R^{2} | AT | V | AP | RH | PE |
---|---|---|---|---|---|

AT | 1 | 0.71 | 0.26 | 0.29 | 0.9 |

V | 0.71 | 1 | 0.17 | 0.1 | 0.76 |

AP | 0.26 | 0.17 | 1 | 0.01 | 0.27 |

RH | 0.29 | 0.1 | 0.01 | 1 | 0.15 |

PE | 0.9 | 0.76 | 0.27 | 0.15 | 1 |

The number of neurons in the hidden layer will depend on the function to be approximated. This is something that cannot generally be known before training. Levenberg-Marquardt algorithm needs the number of neurons (hidden layer size) to be given to the algorithm. However, the effect of the hidden layer size will be examined in this study by applying a variety of hidden layer sizes.

Here are the results of varying the different options on the neural networks:

Training the same network with the same settings and the same dataset gives a different output on each run, because of:

1) The randomness of the initial weights and bias at every training run of the neural network.

2) The randomness of dividing dataset into train, validate and test sets.
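The first source of randomness can be reproduced in a small sketch: unless the random seed is fixed, each run starts from different initial weights and therefore converges to a different network (the weight range and count below are arbitrary):

```python
import random

def init_weights(n, seed=None):
    """Draw n random initial weights; fixing the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

run1 = init_weights(10)            # unseeded: differs between runs
run2 = init_weights(10)
same1 = init_weights(10, seed=42)  # seeded: identical on every run
same2 = init_weights(10, seed=42)
```

This is why each configuration in the experiments below is trained multiple times rather than judged from a single run.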

Results are shown at

To examine the effect of the hidden layer size, the network is trained with the settings shown at

From

Dataset size | 50 |
---|---|

Variables used | AT, V, AP and RH |

Hidden layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt (trainlm) |

Run | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

1 | 2.71 | 97.78 | 1 | 0.8 | 67.79 | 0.88 | Validation | 12 | 6 |

2 | 22.47 | 41.17 | 0.98 | 0.9 | 65.75 | 0.89 | Validation | 11 | 5 |

3 | 21.7 | 117.89 | 0.97 | 0.84 | 138.66 | 0.84 | Validation | 12 | 6 |

4 | 70.82 | 72.25 | 0.95 | 0.89 | 123.49 | 0.9 | Validation | 12 | 6 |

5 | 44.96 | 70.81 | 0.94 | 0.91 | 140.11 | 0.78 | Validation | 10 | 4 |

6 | 16.66 | 96.47 | 0.98 | 0.88 | 94.97 | 0.86 | Validation | 10 | 4 |

7 | 11.87 | 117.59 | 0.98 | 0.94 | 59.11 | 0.9 | Validation | 11 | 5 |

8 | 6.03 | 106.22 | 0.99 | 0.92 | 32.35 | 0.94 | Validation | 16 | 10 |

9 | 15.65 | 47.82 | 0.98 | 0.94 | 52.78 | 0.92 | Validation | 13 | 7 |

10 | 40.16 | 66.53 | 0.95 | 0.86 | 68.51 | 0.9 | Validation | 9 | 3 |

Dataset size | 50 |
---|---|

Variables used | AT, V, AP and RH |

Hidden layer size (#neurons) | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 70, 100] |

Training Function | Levenberg-Marquardt (trainlm) |

Hidden Layer Size | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

1 | 34.38 | 30.09 | 0.95 | 0.97 | 27.21 | 0.97 | Max mu. | 23 | 22 |

2 | 27.98 | 39.5 | 0.96 | 0.93 | 24.76 | 0.96 | Validation | 9 | 3 |

3 | 22.1 | 43.33 | 0.97 | 0.94 | 23.56 | 0.96 | Validation | 18 | 12 |

4 | 21.38 | 67.48 | 0.97 | 0.93 | 24.12 | 0.96 | Validation | 13 | 7 |

5 | 18.98 | 68.08 | 0.97 | 0.95 | 25.59 | 0.96 | Validation | 9 | 3 |

6 | 27.78 | 33.05 | 0.95 | 0.98 | 26.53 | 0.95 | Validation | 11 | 5 |

7 | 25.9 | 67.61 | 0.97 | 0.92 | 46.96 | 0.95 | Validation | 9 | 3 |

8 | 12.4 | 82.42 | 0.98 | 0.85 | 36.1 | 0.94 | Validation | 19 | 13 |

9 | 27 | 58.42 | 0.96 | 0.95 | 38.19 | 0.95 | Validation | 10 | 4 |

10 | 16.74 | 70.42 | 0.98 | 0.91 | 38.33 | 0.94 | Validation | 10 | 4 |

15 | 50.09 | 40.39 | 0.96 | 0.93 | 93.12 | 0.92 | Validation | 9 | 3 |

20 | 2.56 | 113.75 | 1 | 0.87 | 148.28 | 0.82 | Validation | 13 | 7 |

25 | 17.9 | 94.46 | 0.98 | 0.92 | 104.4 | 0.88 | Validation | 9 | 3 |

30 | 616.45 | 460.98 | 0.85 | 0.85 | 817.02 | 0.85 | Validation | 7 | 1 |

40 | 0.14 | 366.38 | 1 | 0.76 | 195.48 | 0.76 | Validation | 10 | 4 |

50 | 49.72 | 365.31 | 0.96 | 0.84 | 551.68 | 0.76 | Validation | 6 | 2 |

70 | 1.88 | 102.07 | 1 | 0.89 | 256.06 | 0.66 | Min gradient | 6 | 2 |

100 | 3.6 | 1589.4 | 1 | 0.22 | 925.24 | 0.63 | Min gradient | 10 | 4 |

Here we examine different sizes of the training dataset (which is actually the combined train-and-validate dataset). Settings for this experiment are shown at

As shown at

Dataset size | [30, 40, 50, 60, 100, 150, 200, 250, 300] |
---|---|

Variables used | AT, V, AP and RH |

Hidden layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt (trainlm) |

Dataset Size | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

30 | 42.13 | 68.15 | 0.94 | 0.96 | 45.49 | 0.92 | Validation | 8 | 2 |

40 | 15.91 | 91.37 | 0.99 | 0.81 | 47.73 | 0.94 | Validation | 8 | 2 |

50 | 14.54 | 108.62 | 0.98 | 0.85 | 35.85 | 0.94 | Validation | 11 | 5 |

60 | 25.87 | 15.17 | 0.96 | 0.97 | 24.79 | 0.96 | Validation | 11 | 5 |

100 | 21.77 | 18.57 | 0.97 | 0.97 | 24.55 | 0.96 | Validation | 11 | 5 |

150 | 20.06 | 17.45 | 0.97 | 0.97 | 20.66 | 0.96 | Validation | 11 | 5 |

200 | 18.82 | 15.28 | 0.97 | 0.97 | 19.14 | 0.97 | Validation | 13 | 7 |

250 | 19.68 | 40.41 | 0.97 | 0.93 | 20.73 | 0.96 | Validation | 9 | 3 |

300 | 17.4 | 25.72 | 0.97 | 0.96 | 19 | 0.97 | Validation | 10 | 4 |

Each variable has a certain effect on the output; some have a huge effect (the main variables) while others may have little effect, if any. Here we examine the four variables (AT, V, AP and RH), which make 15 different combinations. Each is repeated 10 times to overcome the randomness problem discussed in Section 3.1. These settings are shown at

From the results shown at

In all previous sections, we used the Levenberg-Marquardt algorithm (trainlm) function. Here we examine and compare another well-known training function, Bayesian regularization (trainbr), which is an improvement on the former. Settings are shown at

From the result

Here the dataset is normalized to the range (0.01, 0.99) and the quality of the resulting network is compared to a network trained on the non-pre-normalized dataset, which is normalized internally by the Matlab Neural Network toolbox. The toolbox has two options for normalization. The first is the standard normalization to the range (−1, 1), using the function (mapminmax), which is the default in the toolbox. The second is normalization to a range with a specified mean (typically 0) and standard deviation (typically 1), using the function (mapstd). Settings for this experiment are shown at

From the result

Dataset size | 50 |
---|---|

Variables used | AT and/or V and/or AP and/or RH = 15 different combinations |

Hidden layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt (trainlm) |

Variables | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

AT | 28.88 | 25.95 | 0.95 | 0.96 | 29 | 0.95 | Validation | 11 | 5 |

AT, V | 11.93 | 56.69 | 0.98 | 0.96 | 27.35 | 0.95 | Validation | 8 | 2 |

AT, V, AP | 6.75 | 65.79 | 0.99 | 0.9 | 35.65 | 0.94 | Validation | 11 | 5 |

AT, V, AP, RH | 21.02 | 41.98 | 0.97 | 0.96 | 34.31 | 0.94 | Validation | 9 | 3 |

AT, V, RH | 13.18 | 74.47 | 0.98 | 0.94 | 26.88 | 0.96 | Validation | 12 | 6 |

AT, AP | 21.46 | 71.32 | 0.97 | 0.91 | 33.22 | 0.95 | Validation | 10 | 4 |

AT, AP, RH | 6.1 | 71.37 | 0.99 | 0.89 | 31.86 | 0.95 | Validation | 12 | 6 |

AT, RH | 38.72 | 20.2 | 0.95 | 0.95 | 31.38 | 0.95 | Validation | 11 | 5 |

V | 61.72 | 39.81 | 0.91 | 0.93 | 69.41 | 0.88 | Validation | 10 | 4 |

V, AP | 18.46 | 136.06 | 0.96 | 0.85 | 66 | 0.88 | Validation | 15 | 9 |

V, AP, RH | 35.73 | 88.24 | 0.94 | 0.9 | 69.86 | 0.87 | Validation | 12 | 6 |

V, RH | 70.55 | 81.22 | 0.9 | 0.94 | 67.49 | 0.88 | Validation | 12 | 6 |

AP | 241.5 | 215.07 | 0.48 | 0.44 | 222.95 | 0.49 | Validation | 10 | 4 |

AP, RH | 167.39 | 150.46 | 0.7 | 0.67 | 212.23 | 0.56 | Validation | 9 | 3 |

RH | 223.65 | 281.37 | 0.55 | 0.43 | 281.45 | 0.35 | Validation | 9 | 3 |

Dataset size | 50 |
---|---|

Variables used: | AT, V, AP and RH |

Hidden layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt (Trainlm); Bayesian regularization (Trainbr) |

Used Function | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

Trainlm | 39.79 | 12.86 | 0.95 | 0.98 | 35.84 | 0.95 | Validation | 8 | 2 |

Trainbr | 29.38 | - | 0.96 | - | 25.04 | 0.96 | Max mu. | 99 | 37 |

Dataset size | 50 |
---|---|

Variables used | AT, V, AP and RH |

Normalization method | 1. mapminmax normalization (default), |

2. mapstd normalization, | |

3. normalized to the range (0.01, 0.99) | |

Hidden layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt algorithm (trainlm) |

Dataset used | Training Performance | Validation Performance | Training Regression coefficient | Validation Regression coefficient | Test Performance | Test Regression coefficient | Stopping Criteria | #Epochs | Best Epoch |
---|---|---|---|---|---|---|---|---|---|

mapminmax | 26.11 | 20.02 | 0.96 | 0.98 | 30.33 | 0.96 | Validation | 8 | 2 |

mapstd | 24.81 | 26.04 | 0.96 | 0.97 | 29.56 | 0.96 | Validation | 8 | 2 |

Normalized to range (0.01, 0.99) | 0.01 | 0 | 0.96 | 0.97 | 0 | 0.96 | Validation | 9 | 3 |

Here we consider the two groups of resulting outputs.

1) Training & Validation group: outputs produced by the network for the input data used in training and validation. This group gives a sense of the validity of the model.

2) Test (complete dataset) group: outputs produced by the network for the test input data, which in this study is the complete dataset. This group gives a success measure for the network.
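The per-group error statistics compared below (mean, standard deviation, minimum and maximum of target minus output) can be sketched as follows, with made-up values rather than the paper's actual results:

```python
import statistics

def error_stats(targets, outputs):
    """Summary statistics of the prediction error (target - output)."""
    errors = [t - o for t, o in zip(targets, outputs)]
    return {
        "mean": statistics.mean(errors),
        "std": statistics.stdev(errors),
        "min": min(errors),
        "max": max(errors),
    }

# Toy values in MW, for illustration only
stats = error_stats([450.0, 460.0, 470.0], [449.0, 462.0, 469.5])
```

A mean near zero with similar standard deviations in both groups is what indicates, later in this section, that the model generalizes from the 250-point subset to the full dataset.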

Comparisons are based on a network with the settings and sub-dataset size shown at

General comparison is presented visually in

Furthermore, comparison of error for the two groups is also shown in

dataset size | 250 |
---|---|

Variables used | AT, V, AP and RH |

Hidden Layer size (#neurons) | 10 |

Training Function | Levenberg-Marquardt (Trainlm) |

Group | Train & Validation | Test (Complete Dataset) |
---|---|---|

Group size | 250 | 9568 |

Performance MSE (MW)^{2} | 17.942 | 18.682 |

Mean error µ (MW) | 0.1108 | 0.4564 |

Standard deviation of error σ (MW) | 4.2429 | 4.2825 |

Min error (Negative) (MW) | −25.7167 | −43.1750 |

Max error (Positive) (MW) | 10.9414 | 20.6876 |

To compare the number of instances vs. error between the two groups, it is more convenient to compare the normalized number of instances, i.e. the percentage of the group. This is shown in
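Normalizing counts to group percentages, so that the 250-point and 9568-point groups become directly comparable, can be sketched as follows (the error values and bin edges are made up):

```python
def percent_histogram(errors, bin_edges):
    """Count errors per bin and convert counts to percent of the group,
    so groups of very different sizes can be compared directly."""
    counts = [0] * (len(bin_edges) - 1)
    for e in errors:
        for i in range(len(counts)):
            if bin_edges[i] <= e < bin_edges[i + 1]:
                counts[i] += 1
                break
    return [100.0 * c / len(errors) for c in counts]

# Toy errors in MW, with bins of width 5 MW around zero
pct = percent_histogram([-6.0, -1.0, 0.5, 2.0, 7.0], [-10, -5, 0, 5, 10])
```

Each bar then reads as "x percent of this group's predictions fall in this error band", regardless of group size.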

If the error sign (positive or negative) is neglected, since we want to describe how close the group results are to their target values, we can make the same chart with absolute values. This is presented in

In this case, the train-and-validation group has a larger percentage of its instances close to zero error. It provides us with the same information we extracted from

As an experiment, 20 data points are selected randomly and tested; the results and errors are shown in

A regression artificial neural network (ANN) is used to model the electrical output power (EP) of a combined cycle power plant based on four inputs. Data are collected from published work freely available online. The MATLAB neural networks toolbox is used to program the ANN model. The ANN model is applied and studied by experimenting with the effects of various settings on the neural network performance. In total, seven experiments are conducted.

Results show the randomness of the ANN model performance each time it is trained; this is because of the randomness of the initial values of the weights and biases. It is also observed that increasing the number of neurons in the hidden layer does not necessarily lead to increased model quality; in fact, the number of neurons has an oscillating effect on the model performance. Increasing the dataset size (more data points for the same variables) provides better networks, to some extent. Increasing the number of input variables does not always lead to better network quality; some variables reduce the quality of the model when introduced, while others increase it. This has to be studied through the correlations between the variables themselves and between the variables and the output. In addition, different training functions are compared for the same settings and dataset; in this work, Bayesian regularization performed better than the Levenberg-Marquardt algorithm. The dataset normalization methods provided by the toolbox are also examined.

Lastly, the results are compared with the target output values for the train-and-validation group and for the test group, which is the complete dataset. The comparison shows that the results are very close to the target outputs for both groups. In addition, it shows the normal distribution of the error within each group, with a mean value near zero. The standard deviations of the error in the two groups are almost equal.

The authors declare no conflicts of interest regarding the publication of this paper.

Elfaki, E.A. and Ahmed, A.H. (2018) Prediction of Electrical Output Power of Combined Cycle Power Plant Using Regression ANN Model. Journal of Power and Energy Engineering, 6, 17-38. https://doi.org/10.4236/jpee.2018.612002