Artificial Neural Network Model for Friction Factor Prediction

Friction factor estimation is essential in pipe flow calculations. The Colebrook equation, the reference standard for this estimation, is implicit in the friction factor, f, so f can only be obtained by iterative solution. Consequently, explicit approximations of the Colebrook equation have been developed using analytical approaches. A shift in paradigm is the application of artificial intelligence to fluid flow. This study investigated the use of an artificial neural network, an artificial intelligence technique, for predicting the friction factor. A network with a 2-30-30-1 topology was trained using the Levenberg-Marquardt back propagation algorithm. The training data consisted of 60,000 sets of Reynolds number and relative roughness values, which were transformed to logarithmic scales. The performance evaluation of the model gave a mean square error of 2.456 × $10^{-15}$ and a relative error of not more than 0.004%. These error indices are lower than those of previously developed neural network models and of the vast majority of the non-neural-network explicit analytical approximations of the Colebrook equation.


Introduction
A fully developed fluid flow in a circular pipe is usually accompanied by a head (energy) loss. This loss results from friction, or shear resistance, between the fluid layers and the pipe wall. The Darcy-Weisbach formula [1], given by Equation (1), is commonly used to estimate the head loss from the velocity of the fluid and the resistance due to friction:

$$h_f = f \frac{L}{D} \frac{v^2}{2g} \qquad (1)$$

where $h_f$ is the head loss, $L$ and $D$ are the pipe length and internal diameter, $v$ is the mean flow velocity and $g$ is the gravitational acceleration. Evaluating Equation (1) requires the friction factor, f, to be known. In the laminar flow regime, for both smooth and rough pipes, f is a function of the Reynolds number (Re) only. For turbulent flow in smooth pipes, f is also a function of Re only and is evaluated using the Prandtl-von Karman relation [1] expressed in Equation (2):

$$\frac{1}{\sqrt{f}} = 2\log_{10}\left(Re\sqrt{f}\right) - 0.8 = -2\log_{10}\left(\frac{2.51}{Re\sqrt{f}}\right) \qquad (2)$$

while for turbulent flow in rough pipes, the von Karman relation [1] given in Equation (3) holds, where ε/D is the relative roughness:

$$\frac{1}{\sqrt{f}} = -2\log_{10}\left(\frac{\varepsilon}{3.71D}\right) \qquad (3)$$

Colebrook [2] combined the above equations to arrive at his famous equation, given in Equation (4):

$$\frac{1}{\sqrt{f}} = -2\log_{10}\left(\frac{\varepsilon}{3.71D} + \frac{2.51}{Re\sqrt{f}}\right) \qquad (4)$$

Brkić [3] reported that this combination violates the mathematical laws of logarithms. Nevertheless, the Colebrook equation has remained the reference standard for friction factor prediction.
Friction factor, f, has to be known and can be determined from the Colebrook equation. This equation is, however, implicit in relation to the friction factor. Therefore, its resolution requires an iterative solution.
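The iterative solution mentioned above can be sketched as a fixed-point iteration on x = 1/sqrt(f). This is only an illustration under stated assumptions: the function name, starting value, tolerance and the roughness constant 3.71 (some texts use 3.7) are choices made here, not details taken from the study.

```python
import math


def colebrook_friction_factor(re, rel_rough, tol=1e-12, max_iter=100):
    """Solve 1/sqrt(f) = -2*log10(rel_rough/3.71 + 2.51/(re*sqrt(f))) for f.

    Hypothetical helper: iterates on x = 1/sqrt(f), so the equation becomes
    x = -2*log10(rel_rough/3.71 + 2.51*x/re).
    """
    x = 7.0  # assumed starting value, roughly f = 0.02
    for _ in range(max_iter):
        x_new = -2.0 * math.log10(rel_rough / 3.71 + 2.51 * x / re)
        if abs(x_new - x) < tol:
            return 1.0 / (x_new * x_new)
        x = x_new
    raise RuntimeError("Colebrook iteration did not converge")


if __name__ == "__main__":
    # Example point inside the turbulent range considered in this paper.
    print(colebrook_friction_factor(1.0e5, 1.0e-4))
```

Each evaluation needs several passes through the loop, which is the cost that explicit approximations and ANN models try to avoid when the equation has to be solved many times.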
For calculations involving small datasets, the iteratively solved Colebrook equation will generally suffice [4]. In industrial and engineering applications, however, large flow networks are needed to transport material continuously between processing units. To simulate long pipes and pipe networks, the Colebrook equation must be solved a very large number of times [5], which makes an iterative solution time consuming. The resolution of the implicit Colebrook equation has therefore interested researchers over the past seven decades, as the numerous explicit approximations in the literature attest. These approximations were developed using various conventional analytical approaches. They vary in accuracy and complexity, with the more complex ones generally being more accurate [6]. The performance of these approximations relative to the Colebrook equation is well discussed in the literature [3], [6]-[9].
Besides the conventional analytical approach to solving engineering problems, artificial intelligence techniques can be applied, as they have recently matured to the point of offering practical benefits in many applications. Artificial intelligence refers to the replication of human behavior and reasoning in machines and software. One such technique is the artificial neural network (ANN), a data-driven modeling tool that is able to capture and represent complex, non-linear input/output relationships from a set of examples [10]. Artificial neural networks are now applied to many real-world problems, such as function approximation, system modeling (where physical processes are not well understood or are highly complex) and pattern recognition, and they can generalize when making decisions about imprecise input data [11].
The application of artificial neural networks to friction factor prediction in fluid flow has been reported [10] [12] [13]. The performance of the resulting models is, however, low when compared with their non-ANN-based explicit counterparts. These studies used at most 3000 data points to develop their models, which may have contributed to the low performance reported. The Colebrook equation is non-linear and applies over a wide range, i.e. 4000 ≤ Re ≤ $10^{8}$; a small dataset is therefore insufficient for a network to learn the input-output relationship across the whole range and predict the friction factor with high accuracy. Brkić and Cojbasić [14] developed an ANN model for predicting the friction factor using a total of 90,000 data points and a 2-50-1 network topology. Its maximum relative error of 0.07% was in close agreement with those of the more accurate non-ANN-based explicit friction factor models. As the relationships between inputs and outputs are enriched, approximation capacity improves [11].
This paper aims to develop an artificial neural network model for predicting the friction factor using a multi-layer perceptron network. The remaining sections are organized as follows. Section 2 provides an overview of artificial neural networks. Section 3 presents the development of the proposed model and evaluates its performance against selected ANN and non-ANN-based explicit models. The final section draws conclusions from the results obtained in this study.

Artificial Neural Network: Overview
An artificial neural network is a generalized model of the biological nervous system [15]; in essence, it is an attempt to simulate the human brain. It is a modeling tool that uses interconnected neurons to learn complex, non-linear input-output relationships from a given set of examples and to reproduce them. It requires no prior knowledge of the mechanisms or principles underlying the process to be modelled.
Typically, a neural network consists of three distinct kinds of layer: input, hidden and output. The ability of an ANN to learn and approximate relationships between input and output depends largely on the size and complexity of the problem. The multi-layer perceptron (MLP) is the most common type of ANN; in it, data processing can extend over more than one hidden layer, each consisting of neurons.
The ability of an ANN to learn an input-output relationship depends largely on the amount of data used in training the network. A sufficiently large dataset enables the network to learn accurately the input-output relationship of a given process [9]. The results obtained by Shaya and Sablani [12], Yazdi and Bardi [13], Fadare and Ofidhe [10] and Brkić and Cojbasić [14] lend credence to this.
Prior to training, the data are pre-processed to bring them to the same order of magnitude. Several normalization techniques exist, the most common being min-max, z-score and scaling. Min-max and its variants have been applied in [10] and [13]. In the Colebrook equation, the friction factor f has a logarithmic relationship with Re and ε/D; hence the adopted practice of taking logarithmic transforms of the input parameters [12] [14].
When training neural networks, calculus-based search techniques such as back propagation are often used. Such techniques are possible because the search space has been transformed (normalized) so that it is smooth [16]. When the search has succeeded, the neural network is said to have been trained, i.e. a suitable combination of connecting weights has been found. Details of the workings of ANNs abound in the literature.

Generation of Input-Output Dataset
The dataset used for developing the ANN model was obtained from the iterative solution of the implicit Colebrook equation using Microsoft Excel. It consisted of 60,000 values each of Reynolds number (4000 ≤ Re ≤ $10^{8}$), relative roughness ($10^{-6}$ ≤ ε/D ≤ 0.05) and friction factor, f. Re and ε/D are the inputs to the network, and f is the target (the output variable) to be predicted. This large dataset was used because a network may not learn the input-output relationship correctly from only a few data points [14]. Although training may take very long for a large dataset, this does not affect the speed with which the trained network predicts.
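A dataset of this kind could be generated along the following lines. The study used Microsoft Excel and does not state how the 60,000 points were distributed over the ranges, so the log-uniform random sampling, the seed, the fixed iteration count and all names below are assumptions made purely for illustration.

```python
import math
import random


def colebrook_f(re, rel_rough, iters=50):
    """Friction factor from a fixed number of fixed-point passes on 1/sqrt(f)."""
    x = 7.0  # assumed starting value
    for _ in range(iters):
        x = -2.0 * math.log10(rel_rough / 3.71 + 2.51 * x / re)
    return 1.0 / (x * x)


def make_dataset(n, seed=0):
    """Return n rows of (Re, eps/D, f) sampled log-uniformly over the paper's ranges."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        re = 10.0 ** rng.uniform(math.log10(4000.0), 8.0)         # 4000 <= Re <= 1e8
        rel_rough = 10.0 ** rng.uniform(-6.0, math.log10(0.05))   # 1e-6 <= eps/D <= 0.05
        rows.append((re, rel_rough, colebrook_f(re, rel_rough)))
    return rows


if __name__ == "__main__":
    data = make_dataset(60000)
    print(len(data), data[0])
```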

Data Pre-Processing
Normalization brings the input parameters to within the same order of magnitude. Several normalization techniques exist, including z-score, min-max and scaling; min-max and its variants have been applied in references [10] and [13]. The approach used in this study is the same as that used in [14]: the inputs were brought to a logarithmic scale by taking the base-10 logarithm of Re, $\log_{10}(Re)$, and the negative base-10 logarithm of ε/D, $-\log_{10}(\varepsilon/D)$, which brings the inputs to a range of 1.3 to 8. In the Colebrook equation, the friction factor depends on the Reynolds number and the relative roughness through logarithmic terms; hence, most researchers have adopted the logarithmic forms of these inputs.
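The quoted range of the transformed inputs can be checked directly from the limits of Re and ε/D. The short sketch below applies the two transforms at the extreme values; the function name is hypothetical.

```python
import math


def to_log_inputs(re, rel_rough):
    """Map (Re, eps/D) to the network inputs (log10(Re), -log10(eps/D))."""
    return math.log10(re), -math.log10(rel_rough)


if __name__ == "__main__":
    lo_re, hi_rough = to_log_inputs(4000.0, 0.05)    # smallest Re, roughest pipe
    hi_re, lo_rough = to_log_inputs(1.0e8, 1.0e-6)   # largest Re, smoothest pipe
    print(round(lo_re, 3), round(hi_re, 3))          # log10(Re) spans about 3.6 to 8
    print(round(hi_rough, 3), round(lo_rough, 3))    # -log10(eps/D) spans about 1.3 to 6
```

Taken together, the two inputs lie between about 1.3 and 8, so both are of the same order of magnitude before training.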

The Proposed ANN Model
This work used a feed-forward (FF) multi-layer perceptron (MLP) network. Given the non-linearity of the Colebrook equation, an MLP network with non-linear transfer functions can minimize the mean square error used as the performance measure [13]. The network has two hidden layers, each with a predefined number of neurons. A typical neural network structure with two hidden layers is shown in Figure 1. Re and ε/D are the inputs to the network, while f is the output. The "tansig" function was used as the non-linear transfer function in the hidden layers, while the "purelin" function was used for the output layer.
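To make the structure concrete, the forward pass of a 2-30-30-1 network can be sketched as follows. The weights here are random placeholders, not the trained values from this study, and the helper names are hypothetical. MATLAB's "tansig" is mathematically the hyperbolic tangent and "purelin" is the identity, which is what the sketch uses.

```python
import math
import random

LAYER_SIZES = [2, 30, 30, 1]  # inputs, two hidden layers, output


def init_params(sizes, seed=0):
    """Random placeholder weights and biases for each layer (not trained values)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        params.append((weights, biases))
    return params


def forward(params, inputs):
    """tanh (tansig) in the hidden layers, identity (purelin) at the output."""
    a = list(inputs)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * x for w, x in zip(row, a)) + b for row, b in zip(weights, biases)]
        a = z if i == last else [math.tanh(v) for v in z]
    return a[0]


def count_params(params):
    """Total number of adjustable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


if __name__ == "__main__":
    params = init_params(LAYER_SIZES)
    print(count_params(params))  # 1051 adjustable parameters
    print(forward(params, [math.log10(1.0e5), -math.log10(1.0e-4)]))
```

Once trained, predicting f requires only this single pass through the layers, with no iteration.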

Training the Network
The network was trained using the Levenberg-Marquardt back propagation algorithm. Back propagation algorithms are well suited to normalized data because the resulting search space is smooth, which makes training easier [11]. By default, the MATLAB software partitions the dataset into three sets: training (70%), test (15%) and validation (15%). The training set was used to do the training. Training stops when a preset error index (such as the mean square error, MSE) is reached or when the number of epochs reaches its limit, which is 1000 by default. For this study, the epoch limit was set at 3000. Several network topologies with two hidden layers were trained. For the 2-30-30-1 topology, training stopped after only 51 epochs, having reached an MSE of 2.456 × $10^{-15}$.
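As an illustration of the partitioning step, the sketch below splits 60,000 sample indices at random, assuming a 70/15/15 split for training, validation and test (the three ratios must sum to 100%). The function name and seed are hypothetical, and MATLAB's own partitioning routine is not reproduced here.

```python
import random


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly partition range(n) into training, validation and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    # Whatever remains after the training and validation sets is the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


if __name__ == "__main__":
    train_idx, val_idx, test_idx = split_indices(60000)
    print(len(train_idx), len(val_idx), len(test_idx))  # 42000 9000 9000
```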

Performance Evaluation of the Model
The default performance index used by MATLAB for training the ANN model is mean square error (MSE).
This index distributes the total error over the data points considered. However, it does not give any information about the performance of each individual data point. Figure 2 shows the performance of the training, testing and validation datasets compared with the Colebrook equation. The curves for these datasets are well superimposed. This is an indication that the ANN model is able to generalize with high accuracy.
To ascertain the robustness of this model, the relative error given in Equation (6) was also determined.

$$\text{Relative error} = \frac{\left|f_{\text{Colebrook}} - f_{\text{ANN}}\right|}{f_{\text{Colebrook}}} \times 100 \qquad (6)$$

Table 1 gives a detailed error index of the ANN friction factor model of this study compared with existing ANN friction factor models. Based on relative error, the model of this study has an error not exceeding 0.004%. This is a high accuracy, and the model ranks best among the ANN friction factor models compared. Table 1 also shows that models trained with large datasets outperform those trained with small ones. In the study by Brkić and Cojbasić [14], a total of 90,000 data points were used to develop an ANN model with a single hidden layer of 50 neurons, resulting in a relative error of not more than 0.07%. In this study, 60,000 data points and two hidden layers of 30 neurons each were used. The maximum relative error of this model is about 94% lower than that of the model of Brkić and Cojbasić [14], which compensates for the additional neurons and hidden layer used in this study.
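The error indices discussed here can be written out as a short sketch: the percentage relative error against the Colebrook value, the mean square error over a set of points, and the arithmetic behind the 94% comparison of maximum relative errors (0.07% versus 0.004%). The function names are hypothetical, and the absolute value in the relative error is an assumption consistent with reporting a maximum error.

```python
def relative_error_percent(f_colebrook, f_ann):
    """Percentage deviation of the ANN prediction from the Colebrook value."""
    return abs(f_colebrook - f_ann) / f_colebrook * 100.0


def mean_square_error(targets, predictions):
    """Average squared deviation over all data points."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def error_reduction_percent(old_error, new_error):
    """How much smaller new_error is than old_error, as a percentage of old_error."""
    return (old_error - new_error) / old_error * 100.0


if __name__ == "__main__":
    # (0.07 - 0.004) / 0.07 is about 0.943, i.e. roughly a 94% lower maximum error.
    print(round(error_reduction_percent(0.07, 0.004), 1))
```

Because the MSE averages over all points, it can be very small even when a few points deviate more than the rest, which is why the maximum relative error is reported alongside it.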
The performance of the proposed ANN model was also compared with those of the most accurate non-ANN-based explicit models, using maximal relative and absolute error criteria, as shown in Table 2. The present ANN model ranks among the four most accurate friction factor models and can therefore conveniently be used in preference to other explicit approximations of the Colebrook equation.
Notes to Table 2: a Equation (2) in [17]; b Equation (3) in [17]; c Equation (2) in [18]; d Equation (6) in [18]; e Equation (2) in [19]; f Equation (4) in [19].

Conclusion
A new ANN friction factor model has been developed. The multi-layer perceptron neural network, with two hidden layers of 30 neurons each, has a maximum relative error of 0.004% when compared with the Colebrook equation. It outperforms the existing ANN friction factor models and compares well with the most accurate non-ANN-based explicit models. Thus, based on the accuracy obtained in this study, the use of ANN for friction factor prediction has proved very successful. The accuracy of this model, however, may not be optimal. The performance of an ANN is sensitive to parameters such as the network topology, learning rate, and weights and biases. The optimal combination of these parameters can be found using evolutionary techniques such as genetic algorithms, which have good global search ability. This would conserve memory and reduce the number of unnecessary parameters.