On the Choice of Parameter Values for Simulation Based Experiments on Mobile Ad Hoc Networks

Over the last 10 years, research on Mobile Ad Hoc Networks (MANETs) has attracted considerable attention. Because the field is relatively new, most researchers have relied on simulation as the first choice for performance measurement and evaluation. To contribute to the credibility and reliability of research results on Ad Hoc networks, this paper addresses the choice of parameter values used as simulator input in such experiments by surveying a sample of recently published papers on Ad Hoc Networks. Results show that in the majority of the reviewed papers, the simulation environment is not carefully planned, as reflected by the diversity and quality of the parameters used.


Introduction
Choosing the parameter values is one of the most essential steps of the experiment setup and design stage of any research [1]. The "correctness" of the values selected is a relatively broad term. For instance, a correct parameter could be defined as the most realistic one. On the other hand, a correct parameter set could be the group of parameters that accomplishes the goal of the simulation at the minimum cost [2].
In the field of MANETs, the sparse and not fully explored space of parameter values can easily lead a researcher to assign incorrect values to the parameters used as input to the simulator. Moreover, the interrelations between the different factors that affect network performance are not fully understood, and the full list of factors has not yet been established [3]. For example, when choosing the network size expressed in number of nodes, the researcher cannot state with certainty that the selected number of nodes is suitable for the problem addressed. Furthermore, the selected number of nodes will not necessarily affect the simulation results and, even if it does, the effect is neither predictable nor clearly related to the other simulation parameters.
Justifying the choices of parameters made is an extra step that the researcher needs to take. Manaseer [4] has provided some justification for the choice of simulation time and transmission range in relation to network area.

Related Work
There have been many surveys in the last decade. Cavin et al. [2] reported some issues related to the accuracy of simulations used in studying a flooding algorithm in MANETs, and pointed out some issues related to the parameters. In 2005, Kurkowski et al. [1] surveyed a population of papers published between 2000 and 2005. Many pitfalls were reported, and they are repeated in this work, which comes 10 years after that survey. According to [1], 15% of the surveyed papers had repeated work and only 12% mentioned the simulator version used. The survey also reported that only 7% of the surveyed papers addressed special issues such as initialization bias or random number generator issues.
The major observation at this point is that the recommendations and conclusions of survey papers, such as this one, are not being implemented when new research is conducted. Therefore, these surveys should continue.

Realistic vs. Hypothetical
Simulation as a tool has the advantage of testing a wide range of scenarios [5], including ones that do not exist in real life. However, this ability is accompanied by major downsides. In many cases, researchers tend to choose a setup that serves testing a new protocol without taking real-life scenarios into account. It is relatively easy to build a non-realistic scenario, represented by the chosen parameter set, that has no real-life counterpart. Although such a scenario can be used to test any protocol and help researchers measure the performance of a newly proposed solution, a new solution is ultimately of use only if it can be applied in real-life applications.

Relation between Parameters and Research Problem
One of the most common mistakes in simulation-based experiments is addressing irrelevant parameters [6]. Generating different scenarios by altering the values of irrelevant parameters, which do not affect the problem for which the simulation is designed, is a main source of redundancy in the total number of simulation runs. Therefore, it is important to address only the parameters that are either directly or indirectly related to the research problem.
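As a rough illustration of how an irrelevant parameter inflates the number of runs, the sketch below counts the scenarios of a full-factorial design, in which every combination of parameter values is simulated. The full-factorial assumption, the parameter names, the values, and the choice of packet size as the "irrelevant" parameter are all hypothetical and are not taken from the surveyed papers.

```python
from itertools import product


def count_runs(parameter_values, repetitions=1):
    """Number of simulation runs in a full-factorial design."""
    total = repetitions
    for values in parameter_values.values():
        total *= len(values)
    return total


def enumerate_scenarios(parameter_values):
    """Yield one dict per scenario, covering every combination of values."""
    names = list(parameter_values)
    for combo in product(*(parameter_values[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical parameter sets, for illustration only.
relevant = {"num_nodes": [25, 50, 100], "max_speed_mps": [1, 5, 20]}
# Assume packet size does not affect the research question being studied.
with_irrelevant = dict(relevant, packet_size_bytes=[256, 512, 1024, 2048])

runs_relevant = count_runs(relevant, repetitions=10)         # 3 * 3 * 10
runs_inflated = count_runs(with_irrelevant, repetitions=10)  # 3 * 3 * 4 * 10
```

Under these assumptions, a single irrelevant parameter with four values multiplies the number of runs by four without adding information about the research problem.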

Interrelation between Parameters
Computer networks have a large number of factors that might affect overall performance. It is necessary to understand the interaction between these factors and to take it into account when designing the simulation scenarios. Moreover, special care should be taken when varying more than one parameter to ensure that the values chosen do not contradict the relation, if any, that binds the factors together [4].
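One way to make such a relation explicit is to check a scenario before running it. The following sketch relates number of nodes, network area, and transmission range through the expected number of neighbours per node. It assumes uniformly distributed nodes on a rectangular area and ignores border effects; the function names and both warning thresholds are our own illustrative assumptions, not rules taken from [4].

```python
import math


def expected_neighbors(num_nodes, width_m, height_m, tx_range_m):
    """Approximate mean number of neighbours per node.

    Assumes uniform node placement and ignores border effects.
    """
    area = width_m * height_m
    coverage = math.pi * tx_range_m ** 2
    fraction = min(1.0, coverage / area)  # share of the area one node covers
    return (num_nodes - 1) * fraction


def check_scenario(num_nodes, width_m, height_m, tx_range_m):
    """Return a list of warnings for parameter combinations that conflict."""
    warnings = []
    # Hypothetical rule: a range longer than the diagonal makes the network single-hop.
    if tx_range_m >= math.hypot(width_m, height_m):
        warnings.append("transmission range covers the whole area (single-hop network)")
    # Hypothetical rule: fewer than one expected neighbour suggests a partitioned network.
    if expected_neighbors(num_nodes, width_m, height_m, tx_range_m) < 1.0:
        warnings.append("fewer than one expected neighbour (likely partitioned network)")
    return warnings


# Illustrative scenario: 50 nodes, 1000 m x 1000 m area, 250 m range.
neighbors = expected_neighbors(50, 1000, 1000, 250)
issues = check_scenario(50, 1000, 1000, 250)
```

Reporting a derived quantity such as the expected number of neighbours alongside the raw parameters gives readers a direct justification for the chosen density.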

Performance Measurement
In most of the available simulators, results are recorded in trace files [7]. Most trace files provide an adequate level of detail. However, it is essential to ensure that the details provided in the trace files enable the extraction of the intended performance measurement criteria, which, in turn, have to be decided before conducting the simulation.
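To illustrate why the metrics must be decided before the trace format is fixed, the sketch below extracts packet delivery ratio and average end-to-end delay from a trace. The four-field line format (`event time node packet_id`, with `s` for sent and `r` for received) is a simplified, hypothetical format invented for this example; it is not the format of NS2 or of any other real simulator.

```python
def summarize_trace(lines):
    """Compute delivery ratio and mean delay from a hypothetical trace format.

    Each line is expected to be: <event> <time_s> <node_id> <packet_id>
    where <event> is 's' (sent by source) or 'r' (received by destination).
    """
    sent = {}      # packet id -> first send time
    received = {}  # packet id -> first receive time
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip lines that do not match the assumed format
        event, time_s, _node, packet_id = fields
        if event == "s":
            sent.setdefault(packet_id, float(time_s))
        elif event == "r" and packet_id in sent:
            received.setdefault(packet_id, float(time_s))

    delays = [received[p] - sent[p] for p in received]
    delivery_ratio = len(received) / len(sent) if sent else 0.0
    mean_delay = sum(delays) / len(delays) if delays else None
    return delivery_ratio, mean_delay


# Hypothetical trace, for illustration only.
trace = [
    "s 1.00 0 p1",
    "s 1.50 0 p2",
    "r 1.20 3 p1",
    "s 2.00 0 p3",
    "r 2.30 3 p3",
    "malformed line",
]
ratio, delay = summarize_trace(trace)
```

If the trace did not record send times per packet, the delay metric could not be recovered afterwards, which is why the measurement criteria need to be fixed in advance.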

Justification of Choices Made
Finally, to gain a higher level of credibility for any simulation-based research, it is necessary to provide an adequate level of justification for the values chosen in the different scenarios used with the simulation. It is also a common mistake to follow a trial-and-error method to find the scenarios that yield the highest level of performance. This approach can easily lead to presenting results for scenarios that do not actually reflect any realistic applications.

Survey
In this section, a collection of published research articles is surveyed to investigate the points mentioned above, in order to provide future research with a preliminary guide to producing more reliable findings.

Description
In order to perform this survey, more than 190 research articles have been reviewed and summarized. The articles were published over the last five years in peer-reviewed journals. The full list of papers is available upon request. The review process focused on the simulation parameters used and the way the researchers presented them. The papers have been classified into two groups: the first group contains the papers with no parameters provided; for the second group, the parameters have been summarized and filtered to select the set of most commonly addressed parameters.

Average Number of Parameters Used
The scenarios described in the papers of the study population use widely varying topologies, in which the number of parameters ranges from a few parameters to a long, redundant list. Figure 1 shows the average number of scenario parameters reported by the papers against the number of occurrences.
As seen in the figure, the majority of papers report between 6 and 10 parameters when describing the scenarios. Figure 1 shows the number of parameters only, not the number of values per parameter. A greater number of parameters indicates that the paper addresses a very specific case for implementing the proposed work. However, it is worth mentioning that listing a small number of parameters weakens future comparisons with the proposed work, while an extremely large list of parameters causes confusion when the audience tries to understand the exact focus and scope of the proposed work. Using a small number of parameters indicates that the authors focus on the proposed technique and its representation more than on its implementation. An unsuitable number of parameters indicates a lack of understanding of the simulation environment and its requirements.

Most Used Parameters (Top 10)
The literature does not provide a manual that prescribes a predefined list of parameters for simulations. According to the sample used in this survey, researchers use some parameters more frequently than others. Figure 2 displays the top 10 parameters used along with their usage percentages.
It is worth mentioning that the focus on a certain parameter does not necessarily mean that this parameter is suitable for use with most scenarios. It might simply mean that researchers copy the values used in the literature.
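A ranking of this kind can be computed from the parameter lists extracted from each paper. The sketch below is one possible way to do so and is not the procedure used in this survey; the function name, the parameter names, and the sample input are hypothetical. Papers that list no parameters are kept in the denominator, which is an assumption of this sketch.

```python
from collections import Counter


def parameter_usage(papers, top_n=10):
    """Return (parameter, percentage of papers) pairs, most used first.

    `papers` is a list of parameter-name lists, one list per paper.
    """
    counts = Counter()
    for params in papers:
        counts.update(set(params))  # count each parameter once per paper
    total = len(papers)
    return [(name, 100.0 * n / total) for name, n in counts.most_common(top_n)]


# Hypothetical input, for illustration only.
papers = [
    ["num_nodes", "area", "sim_time"],
    ["num_nodes", "area"],
    ["num_nodes", "tx_range", "num_nodes"],  # duplicate mention counted once
    [],                                      # paper reporting no parameters
]
usage = parameter_usage(papers, top_n=2)
```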

Number of Values Per Parameter
Due to the sparse nature of MANETs, researchers tend to test a wide range of values per parameter. Table 1 shows that the number of distinct values is limited only for the parameters that have a limited space of possible values. An example of such a parameter is the antenna type, for which the range of values is limited by the known types of antennas in real life. The main effect of using a wide range of values per parameter is the loss of research focus. An extremely wide range of scenarios makes it difficult to analyze the results of the simulator. A large range of parameter values indicates that the authors are unaware of the scenarios for which the proposed technique was developed.

The Suitability of Values Chosen
Among the choices made, there are two types of constraints that should be applied in any simulation-based research: the correlations between the parameters and the suitability of these parameters for use in real-life scenarios. For example, for two of the three most commonly used parameters, number of nodes and network area, the relation is never clearly stated as a justification. As seen in Figure 3, more than 50% of the surveyed papers use a ratio of 10% or less. This means that the density of the network has to be justified, for example by the transmission range. Moreover, the figure also shows some non-realistic values, where some scenarios have a ratio of 250%. It can also be seen that approximately 25% of the papers studied have not considered the relation in the first place. This unplanned choice of values weakens the ability to predict the performance of the proposed mechanisms.
It is worth mentioning that, in the simulator usage presented in Figure 4, the second place is filled by research papers that never mention the simulator choice at all. Choosing the simulation tool is a step that is affected by many factors [5]. For instance, the cost of a simulator can remove it from the list. Moreover, some researchers tend to use uncommon simulators or even self-developed simulators. Although this can be caused by a lack of understanding of the common simulators, it is sometimes triggered by the researcher's need to simulate specific small-scale improvements. Moreover, NS is widespread because it is an open-source and free simulator. Much research is conducted in institutes where financial support is a major issue.

Conclusions
In this paper, we have surveyed a number of research papers. The results show that although many of the pitfalls reported here were reported in earlier surveys, they still exist. Moreover, it is clear that the experiment setup is still not at an optimal level. The reasons for this lack of optimality need further study. However, the lack of knowledge about the exact set of parameters and their values is clear in the findings of this survey.
This work shows that the nature of MANETs is still not fully explored. The exact list of effective parameters is not fully known, leading to a tendency to simulate a large number of scenarios to compensate for quality with quantity. In addition, the relations between the simulation environment parameters are not clear. This leads to generating multiple scenarios to ensure coverage of these ambiguous relations.

Figure 1. Occurrences of the parameter list size.

Figure 3.

Figure 4 presents the percentages of usage for the major simulators. The most common simulator is NS2.

Table 1. Different values per parameter.