1. Introduction
Multiparticle production is an essential feature of high-energy proton-proton collisions. Hadron-hadron (hh) observables such as the charged-particle multiplicity $n_{ch}$ and the pseudorapidity density $dn_{ch}/d\eta$ are key to characterizing the properties of the matter created in proton-proton (pp) collisions [1] [2], where $\eta = -\ln \tan(\theta/2)$ is the pseudorapidity and $\theta$ is the polar angle with respect to the beam axis. The dependence of these observables on the collision energy (center-of-mass energy $\sqrt{s}$) and on the collision geometry is a key tool for understanding the underlying particle production mechanism [3] [4] [5]. The investigation of these observables has been used to improve, or reject, models of particle production, which are often available as Monte Carlo event generators [6] [7] [8] [9]. The charged-particle multiplicity is among the simplest observables for understanding multiparticle production in high-energy hadron collisions [6] [7] [8] [9].
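As a small illustration of the pseudorapidity, conventionally defined as η = -ln tan(θ/2) with θ the polar angle to the beam axis, the following Python sketch converts a polar angle into η. The function name `pseudorapidity` is hypothetical and chosen only for this example.

```python
import math

def pseudorapidity(theta):
    """Return eta = -ln(tan(theta / 2)) for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))

# A particle emitted perpendicular to the beam has eta close to 0;
# smaller angles (closer to the beam axis) give larger eta.
for degrees in (90, 45, 10, 1):
    print(degrees, round(pseudorapidity(math.radians(degrees)), 3))
```

Angles above 90 degrees give negative η, so the pseudorapidity density is a distribution over both hemispheres of the detector.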
The charged-particle multiplicity distributions $P(n_{ch})$ provide an indispensable tool for investigating the dynamics of multiparticle production processes. Their measurement forms an important part of hh collision experiments. New experimental information on multiparticle production has been reported in the recent past [6]. Consequently, considerable effort has been devoted to analyzing and organizing the experimental data using various theoretical and phenomenological schemes [5] - [12].
Several models, both empirical and deterministic, attempt to describe the multiplicity distributions [5] [6] [7]. The first step towards a successful understanding of the multiplicity distributions was taken by Feynman in 1969 [3]. At higher energies, deviations from those first models were observed, and the data were described using a single Negative Binomial Distribution (NBD) [4], which successfully describes $P(n_{ch})$ in full phase space up to $\sqrt{s}$ = 540 GeV as well as in different pseudorapidity intervals [4]. For $\sqrt{s}$ = 900 GeV, however, $P(n_{ch})$ deviates from the NBD for large pseudorapidity intervals; Giovannini and Ugoccioni [1] [2] describe the measured $P(n_{ch})$ at $\sqrt{s}$ = 900 GeV by a combination of two weighted NBDs [4]. This issue nevertheless remains an open question for both theoretical and experimental physicists.
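To make the NBD descriptions above concrete, the following Python sketch evaluates a single NBD and a weighted sum of two NBDs. It uses the common parametrization in terms of the mean multiplicity and a shape parameter k. The function names and the numerical parameters are illustrative assumptions, not fitted values from the cited works.

```python
import math

def nbd(n, mean, k):
    """Negative binomial probability of multiplicity n (mean `mean`, shape k > 0)."""
    log_p = (math.lgamma(n + k) - math.lgamma(n + 1) - math.lgamma(k)
             + n * math.log(mean / (mean + k))
             + k * math.log(k / (mean + k)))
    return math.exp(log_p)

def two_component_nbd(n, alpha, mean1, k1, mean2, k2):
    """Weighted sum of two NBDs; alpha in [0, 1] is the weight of the first one."""
    return alpha * nbd(n, mean1, k1) + (1.0 - alpha) * nbd(n, mean2, k2)

# Illustrative parameters only (assumed for this example, not taken from data).
params = (0.7, 20.0, 3.0, 50.0, 8.0)
total = sum(two_component_nbd(n, *params) for n in range(1000))
mean = sum(n * two_component_nbd(n, *params) for n in range(1000))
print(round(total, 6), round(mean, 3))
```

The second component adds a broader shoulder at high multiplicity, which is the kind of structure a single NBD cannot reproduce.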
To address this problem, we have developed a black-box modeling methodology based on the artificial neural network (ANN) approach [13]. ANN black-box models are powerful and promising tools for modeling complex systems. An ANN model is, in general, well suited to simulating the nonlinear behavior of charged-particle multiplicity distributions in proton-proton interactions, because neural network models are formulated in terms of nonlinear functions and have a flexible mathematical structure. In recent years, ANN models have found an increasing number of applications in high energy physics (HEP) [14] - [19].
The most commonly used neural network model is the back-propagation (BP) network [13], a multi-layer feed-forward network trained with the error back-propagation algorithm. A BP network can be used to learn a mathematical function that maps the inputs of a model to its outputs [13]. The disadvantage of multi-layer feed-forward networks trained by error back-propagation is that the best number of hidden layers and units varies from task to task and so must be determined manually by trial and error. One approach to determining a good network size automatically is to start with a minimal network and then add hidden units and connections as required, as in Cascade Correlation Neural Networks (CCNN) [20] - [25]. CCNNs have several advantages over conventional ANNs: they are self-organizing (i.e. built automatically), they have lower computational cost and complexity (few parameters need adjusting), and training is very fast.
The objective of this paper is to develop a mathematical model based on the CCNN approach to calculate and predict the charged-particle multiplicity distributions $P(n_{ch})$ and the energy dependence of the average multiplicity for pp and $p\bar{p}$ inelastic scattering. The CCNN learns from experimental data for full phase space collected by several collaborations [26] - [34], and discovers $P(n_{ch})$ as a nonlinear response function represented by the network parameters. $P(n_{ch})$ is then calculated and predicted by the discovered nonlinear function that represents the CCNN model, and the energy dependence of the average multiplicity $\langle n_{ch} \rangle$ is likewise calculated and predicted for a wide range of energies. The obtained results are compared with those from theoretical models such as the Dynamical Gluon Mass (DGM) model [35] [36].
The paper is organized as follows. Details of the CCNN black-box model for the pp and $p\bar{p}$ multiplicity distributions are described in Section 2. The results obtained are presented in Section 3. Finally, the main conclusions of this study are given in Section 4.
2. CCNN Black-Box Model for $P(n_{ch})$ and $\langle n_{ch} \rangle$
Developing a mathematical model that accurately describes the behavior of a complex physical problem is a challenging task. Neural networks are a very promising tool for empirical “black-box” modeling of complex systems without going into mathematical details. An artificial neural network (ANN) is a nonlinear empirical model inspired by biological neural networks [13]. ANN black-box models do not need detailed prior knowledge of the structure of the nonlinear system under investigation or of the interactions between its important variables. Therefore, the ANN is a powerful and promising tool for complex system modeling. An ANN can be trained with the Cascade-Correlation (CC) learning method to “learn” the complex dynamic behavior of physical systems. A CCNN acts as a black box and learns to predict the value of specific output variables given sufficient input information. The cascade correlation neural network is capable of global function approximation, i.e. it represents a function over the whole data set [20] - [25].
In this paper, we explore the use of the CCNN for developing mathematical black-box models from experimental data on pp and $p\bar{p}$ collisions. In the following subsection, we give a brief introduction to the CCNN approach.
2.1. Overview on CCNN Approach
Artificial neural networks (ANNs) are classified as intelligent computing systems because of their ability to learn. All artificial neural networks were inspired by the human brain. ANNs consist of interconnected artificial neurons, termed nodes. Each neuron has a group of inputs, outputs and a transfer function. The mathematical model of a neuron can be described by the equation
$y = f\left(\sum_{k} w_k x_k\right)$ (1)
where $y$ is the output value, $x_k$ is the kth input, $w_k$ is the weight of the connection related to the kth input and $f$ is the transfer function, which is usually the radial basis function or the sigmoid function [24] [25].
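The neuron model described above, a transfer function applied to the weighted sum of the inputs, can be sketched in a few lines of Python. The names `neuron_output`, `sigmoid` and `gaussian` are hypothetical, and the Gaussian form exp(-a^2) is only one assumed example of a radial basis function.

```python
import math

def sigmoid(a):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-a))

def gaussian(a):
    """A simple Gaussian (radial-basis-type) transfer function, assumed form."""
    return math.exp(-a * a)

def neuron_output(inputs, weights, transfer=sigmoid):
    """Apply the transfer function to the weighted sum of the inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(w * x for w, x in zip(weights, inputs))
    return transfer(activation)

# Two inputs whose weighted sum is zero: 0.5 * 1.0 + (-0.25) * 2.0 = 0.
print(neuron_output([1.0, 2.0], [0.5, -0.25]))
print(neuron_output([1.0, 2.0], [0.5, -0.25], transfer=gaussian))
```

A bias term, if needed, can be represented as an extra input fixed at 1 with its own weight.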
It is known that a feed-forward neural network (FFANN) with one hidden layer is a universal function approximator, so it can approximate any continuous nonlinear function to arbitrary precision, given enough hidden units. Furthermore, any FFANN can be trained (in a supervised way) by the BP algorithm. The BP algorithm calculates the gradient of the network error with respect to the synaptic weights [13].
The main problem in ANNs is designing a network with the appropriate number of hidden layers and units to learn a given concept. If a network has too few hidden units, it will not have the computational power to learn the concept well. Given too many hidden units, it will over-fit the training dataset and generalize poorly to new examples that are not included in the training data. The CC approach, which constructs the neural network from the bottom up, was proposed by Fahlman and Lebiere (1990) [24] in order to solve the low convergence speed of traditional BP, the local minima problem, the step-size problem and the moving target problem, and to avoid having to define the number of hidden nodes in advance.
The cascade-correlation architecture supports a variety of learning algorithms. One of the most robust back-propagation variants, called “Quickprop”, was published by Fahlman (1998) [25].
The learning algorithm begins with a minimal network (input and output units without any hidden unit). The output-layer weights are adjusted by the gradient descent algorithm. The error of the network is then measured; if the network’s performance is not satisfactory, a candidate unit is generated and trained.
This candidate neuron is trained by maximizing the magnitude of the correlation C between the candidate’s output and the residual error term to be minimized. Gradient descent is used to minimize the network’s output error, while gradient ascent is employed to maximize the correlation.
Once a neuron is finally added to the network (activated), its input connections are frozen and do not change anymore. The network (input, output and hidden units) is then trained until its residual error is minimized. This process of optimizing the output weights, creating a hidden neuron, optimizing the hidden neuron weights, connecting it to the output neurons, and adjusting the output neuron weights is repeated until an acceptably small error is produced or a maximum number of nodes is reached. The following lines summarize the main steps of the CCNN algorithm.
The Cascade Correlation algorithm cycles through two phases: an output phase, in which the weights entering the output units are trained in order to reduce the network error, and an input phase, in which the weights entering candidate recruits are trained in order to correlate with the network error [23] [24] [25]. The connection weights are adjusted in the two phases to minimize the network error and to maximize the correlation, respectively:
In the first phase:
・ Initialize the CCNN network (2 layers).
・ Calculate the actual output.
・ Adjust the output weights using quick propagation (QuickProp) until no further progress is made.
・ Minimize the error by gradient descent, $E = \frac{1}{2} \sum_{o,p} (y_{op} - t_{op})^2$, where $y_{op}$ is the observed value of output $o$ for training pattern $p$ and $t_{op}$ is the desired output value.
In the second phase:
・ Add a candidate unit.
・ Initialize it (weights and learning constant).
・ Calculate its output.
・ Train the candidate to maximize C by gradient ascent (using QuickProp).
・ Calculate the correlation between the candidate unit and the residual error of the network, $C = \sum_{o} \left| \sum_{p} (V_p - \bar{V})(E_{p,o} - \bar{E}_o) \right|$, where $\bar{V}$ and $\bar{E}_o$ are the average values of the candidate hidden unit’s output $V_p$ and of the network’s residual output error $E_{p,o}$ over all the training samples.
・ When C reaches its maximum, freeze the candidate’s weights.
・ Add the candidate to the main net.
The QuickProp algorithm is used to compute and update the network weights $w$ at iteration $t$:
$w(t+1) = w(t) + \Delta w(t)$ (2)
$S(t) = \frac{\partial E}{\partial w(t)}$ (3)
where S is the derivative of the function being optimized (E in the output phase, which should be minimized; C in the input phase, which should be maximized). The weight change is computed by:
$\Delta w(t) = \frac{S(t)}{S(t-1) - S(t)} \Delta w(t-1)$ (4)
・ The first phase is started again to train the main net output.
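The QuickProp weight update used in both phases can be sketched as follows. This is a simplified Python illustration of the secant-style step for a single weight, not the code used in this work. The function names, the learning rate and the growth limit of 1.75 are assumptions made for the example, and the full algorithm contains refinements omitted here.

```python
def quickprop_step(slope, prev_slope, prev_step, learning_rate=0.1, max_growth=1.75):
    """Return the change of one weight when minimizing; slope is dE/dw now."""
    # No usable history: fall back to a plain gradient-descent step.
    if prev_step == 0.0 or slope == prev_slope:
        return -learning_rate * slope
    # Slope kept its sign and did not shrink: the secant estimate would be
    # infinite or point uphill, so take a bounded step in the same direction.
    if slope * prev_slope > 0.0 and abs(slope) >= abs(prev_slope):
        return max_growth * prev_step
    # Secant step towards the minimum of the parabola through the two slopes.
    step = slope / (prev_slope - slope) * prev_step
    limit = max_growth * abs(prev_step)
    return max(-limit, min(limit, step))

def minimize(gradient, w, iterations=10):
    """Repeatedly apply quickprop_step to a single weight w."""
    prev_slope, prev_step = 0.0, 0.0
    for _ in range(iterations):
        slope = gradient(w)
        step = quickprop_step(slope, prev_slope, prev_step)
        w += step
        prev_slope, prev_step = slope, step
    return w

# Toy error E(w) = (w - 3)^2 with gradient 2 * (w - 3).
print(minimize(lambda w: 2.0 * (w - 3.0), 0.0))
```

For the input phase, where the correlation C is maximized, the same routine can be applied to the gradient of -C.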
These two phases are repeated until either the training patterns have been learned to a predefined level of acceptance or a preset maximum number of hidden units has been added, whichever occurs first. For more details see Refs. [22] [23] [24] [25]. The following subsection discusses the development of the CCNN model based on experimental data collected from many hadron collider experiments [26] - [34].
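The two quantities that drive the phases, the sum-squared output error and the candidate correlation, can be computed as in the Python sketch below. It follows the usual cascade-correlation definitions; the function names and the data layout (one row per training pattern, one column per output unit) are assumptions for illustration.

```python
def sum_squared_error(outputs, targets):
    """E = 1/2 * sum over patterns and output units of (y - t)^2."""
    return 0.5 * sum((y - t) ** 2
                     for y_row, t_row in zip(outputs, targets)
                     for y, t in zip(y_row, t_row))

def candidate_correlation(candidate_outputs, residual_errors):
    """C = sum over outputs of |sum over patterns (V - mean V) * (E - mean E)|."""
    n_patterns = len(candidate_outputs)
    v_mean = sum(candidate_outputs) / n_patterns
    total = 0.0
    for o in range(len(residual_errors[0])):
        e_mean = sum(row[o] for row in residual_errors) / n_patterns
        covariance = sum((v - v_mean) * (row[o] - e_mean)
                         for v, row in zip(candidate_outputs, residual_errors))
        total += abs(covariance)
    return total

# A candidate whose output tracks the residual error scores high,
# while a constant candidate scores zero.
errors = [[0.0], [1.0], [2.0]]
print(candidate_correlation([0.0, 1.0, 2.0], errors))
print(candidate_correlation([1.0, 1.0, 1.0], errors))
```

Because of the absolute value, a candidate that is anti-correlated with the residual error is just as useful as a correlated one: the sign is absorbed by its output weight.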
2.2. $P(n_{ch})$ and $\langle n_{ch} \rangle$ Model Development
The objective of this paper is to develop a mathematical model based on the CCNN approach to calculate and predict $P(n_{ch})$ and $\langle n_{ch} \rangle$ for pp and $p\bar{p}$ scattering. The mathematical model is based on numerous experiments conducted at different laboratories [26] - [34], and on a neural network approximation method that is employed to predict or extrapolate the experimental results.
To train and test the proposed model, the CCNN program code was developed in the MATLAB language (The MathWorks Inc., USA). A CCNN is prone to over-fitting the training data, in which case the accuracy is high on the training data but low on the testing data. To prevent over-fitting, the CCNN model is validated as it grows using 3-fold cross-validation.
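The 3-fold cross-validation can be pictured with the following Python sketch, which splits the sample indices into three folds and uses each fold once for validation. It is written in Python for illustration rather than in the MATLAB code used for this work; the function name `k_fold_indices`, the shuffling and the interleaved assignment of samples to folds are assumptions, since the exact splitting procedure is not specified here.

```python
import random

def k_fold_indices(n_samples, k=3, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Interleaved assignment keeps the fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, validation

for training, validation in k_fold_indices(10):
    print(len(training), len(validation))
```

While the network grows, the validation error averaged over the folds can be monitored, and growth stopped once that error no longer improves.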
To assess the performance of the CCNN model, we examined the performance indices ($R^2$ and RMSE) until no significant improvement occurred. Once the training is complete, the CCNN model has learned to approximate $P(n_{ch})$ and $\langle n_{ch} \rangle$, i.e. to reproduce, interpolate and extrapolate data that are not included in the training data.
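The two performance indices can be computed as in the short Python sketch below. It assumes the standard definitions: RMSE as the root of the mean squared residual, and R2 as the coefficient of determination. The exact definitions used for the reported values are not stated in the text, so these forms and the function names are assumptions.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0, 4.0]
print(rmse(observed, observed), r_squared(observed, observed))
```

A model that always predicts the mean of the observations gives an R2 of zero, so values close to one indicate that most of the variance in the data is reproduced.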
3. Results and Discussion
In this section, we apply the CCNN model to calculate and predict $P(n_{ch})$ and $\langle n_{ch} \rangle$ using the available experimental data [26] - [34].
For the present CCNN, we obtained $R^2$ = 0.998 and RMSE = 0.000137. The following network training parameters were used: minimum neurons in hidden layer: 2; maximum neurons in hidden layer: 200; hidden neuron kernel function: Gaussian; output neuron kernel function: linear; over-fitting protection control: 3-fold cross-validation.
In this regard, we have modeled $P(n_{ch})$ over a wide range of available experimental data for center-of-mass energies $\sqrt{s}$ = 30.4 GeV, 44.5 GeV, 52.6 GeV, 62.2 GeV, 200 GeV, 300 GeV, 540 GeV, 900 GeV, 1000 GeV, 1800 GeV, and 7 TeV (from ISR energies in the 1970s to the highest LHC energy, for pp and $p\bar{p}$ scattering). We have compared the obtained results with recently published experimental, empirical and phenomenological results. We also provide predictions of $P(n_{ch})$ in pp collisions at the full LHC energy (14 TeV).
Figures 1(a)-1(c) show our calculated and predicted results for $P(n_{ch})$ as a function of $n_{ch}$ and $\sqrt{s}$. The figure also compares our calculated and predicted $P(n_{ch})$ values with other theoretical and experimental values [25] - [34]. In this comparison, our model results agree closely with the experimental data and the theoretical ones. Figures 1(a)-1(c) demonstrate that the predicted $P(n_{ch})$ spectra are very close to the actual values (experimental data), which indicates that the CCNN can be used as an effective tool for modeling $P(n_{ch})$ as a function of $n_{ch}$ and $\sqrt{s}$.
Figure 1. (a)-(c) Multiplicity distribution $P(n_{ch})$ of charged particles in pp and $p\bar{p}$ inelastic collisions at 30.4 GeV ≤ $\sqrt{s}$ ≤ 7 TeV. Figure 1(c) (bottom-right panel): prediction of $P(n_{ch})$ for pp at $\sqrt{s}$ = 14 TeV. Solid line: DGM model; *: CCNN model (our model); o: experimental data of the five collaborations [26] - [34].
Figure 1(c) (bottom-right panel) shows the prediction of the multiplicity distribution of the produced particles at the LHC energy ($\sqrt{s}$ = 14 TeV), which is compared with the distributions obtained by other models [26] - [34]. According to our CCNN model, the predicted $P(n_{ch})$ at 14 TeV has the same trend as the theoretical one [35] [36]. In addition, the energy dependence of $\langle n_{ch} \rangle$ was modeled: it was calculated over a wide range of $\sqrt{s}$ (from 30 GeV to 7 TeV) and predicted at the highest LHC energy (14 TeV).
The probability of particle production decreases with increasing $\sqrt{s}$, and the distribution shifts towards higher $n_{ch}$. We also notice that the width of the distribution broadens with increasing $\sqrt{s}$, as shown in Figure 2. This figure shows the multiplicity distributions $P(n_{ch})$ of charged particles in pp and $p\bar{p}$ collisions in full phase space from 30 GeV to 14 TeV.
Figure 2. Multiplicity distributions $P(n_{ch})$ of charged particles in pp and $p\bar{p}$ collisions from 30 GeV to 14 TeV (full phase space).
Based on the proposed CCNN model, the energy dependence of the average charged multiplicity $\langle n_{ch} \rangle$ in pp collisions is calculated and compared with the corresponding experimental and theoretical results. Figure 3 shows the energy dependence of the average charged multiplicity in pp collisions for $\sqrt{s}$ ranging from 30 GeV to 14 TeV. From Figure 3 we also notice that $\langle n_{ch} \rangle$ increases with increasing $\sqrt{s}$, which is the same trend as in the experiments [26] - [34]. The results of the present work open the route to applying modern soft-computing procedures such as neural networks to modeling in HEP.
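Given a multiplicity distribution, the average multiplicity is its first moment, and the width can be summarized by the dispersion. The Python sketch below computes both from a tabulated distribution. The function name, the dictionary layout and the toy numbers are illustrative assumptions, not values from the data or from the CCNN model.

```python
import math

def multiplicity_moments(distribution):
    """Return (mean, dispersion) of a distribution given as {n: P(n)}."""
    norm = sum(distribution.values())
    mean = sum(n * p for n, p in distribution.items()) / norm
    second = sum(n * n * p for n, p in distribution.items()) / norm
    # Guard against tiny negative values caused by rounding.
    return mean, math.sqrt(max(second - mean * mean, 0.0))

# Toy distribution used only to show the calculation.
toy = {2: 0.25, 4: 0.5, 6: 0.25}
mean, dispersion = multiplicity_moments(toy)
print(mean, round(dispersion, 4))
```

Applying such a calculation to the distribution at each energy gives one point per energy on a curve of average multiplicity versus center-of-mass energy.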
4. Conclusions
The charged-particle multiplicity is among the simplest observables that provide important insights into the mechanisms of particle production. In the present work we have used a CCNN for modeling the multiplicity distribution of charged particles produced in pp and $p\bar{p}$ interactions at center-of-mass energies $\sqrt{s}$ = 30.4 GeV, 44.5 GeV, 52.6 GeV, 62.2 GeV, 200 GeV, 300 GeV, 540 GeV, 900 GeV, 1000 GeV, 1800 GeV, and 7 TeV. In this regard, we have developed CCNN mathematical black-box models to calculate and predict the multiplicity distribution of charged particles $P(n_{ch})$ produced in proton-proton collisions as a function of $n_{ch}$ and $\sqrt{s}$, as well as the energy dependence of the average multiplicity $\langle n_{ch} \rangle$. The results indicate that the proposed CCNN model shows a good correspondence between the experimental data and our calculated results according to the statistical performance indices. We have also compared our results for $P(n_{ch})$ and $\langle n_{ch} \rangle$ with models based on the Monte Carlo approach, which successfully explain the multiplicity distribution. In addition, the predictions for $P(n_{ch})$ and $\langle n_{ch} \rangle$ of charged particles in pp interactions at $\sqrt{s}$ = 14 TeV are found to be in agreement with the Dynamical Gluon Mass model [35] [36]. The obtained results support the reliability of our model and will encourage physicists to apply other ANN techniques to calculate and predict other quantities in the investigation of multiparticle production.
Figure 3. Comparison between our calculated and predicted values for the energy dependence of the average charged multiplicity and the corresponding experimental and theoretical data [25] - [34].
Acknowledgements
The author would like to thank Prof. Dr. M. Y. El-Bakry, Faculty of Education, and Prof. Dr. S. Y. El-Bakry, Faculty of Science, Ain Shams University, for useful discussions on the problems considered in this article.