An Essay on the Prerequisites for the Probability Theory

Probability calculus and statistics permeate nearly every discipline and professional sector, yet no theory underpinning this wide field has reached universal consensus so far. The interpretations of probability present irreconcilable traits, so the concept of probability remains substantially unclear. Purpose of this work: The present paper intends to demonstrate that the different models of probability constitute the surface problem, which conceals another hidden and more fundamental question. Method: We show that authors disagree not only about the concept of probability P but also about the precise object qualified by P, which has priority from the logical point of view. Clearly, the element X measured by P(X) influences the meaning of P. As a consequence of the conflicting opinions, theorists tend toward a compromise: they use the outcome or result of an experiment as the argument X of P(X) and represent X as a subset of the event space. This paper suggests replacing the outcome-subset with the event-triad E, which provides comprehensive mathematical support. Results: The last section shows that the triadic model is formally consistent with the conventional theories and can integrate the conflicting views on probability. This unifying result can help mathematicians go beyond the present theoretical deadlock. In summary, this paper advocates a more explicit notation system for probability and points out how probability can be ambiguous without rigorous specification of the sample space and the experiment in general.


Introduction
Blaise Pascal claimed that probability could give substance to a new and original mathematical field. Since then, various interesting theories have been put forward, but none has reached general agreement. The basic equations of the probability calculus prove to be agile instruments, yet thinkers face significant foundational issues. In this respect, statistics and probability resemble computer science. Software programmers, even young developers, implement original applications for mobile phones, big data, etc., although the principles of computing have not been firmly established. In an analogous manner, experts are able to solve many problems using effective equations, but the foundations of probability and statistics remain rather obscure [1].
Modern works revolve around two principal references:
(A) Objective theories focus on something physical and independent of people's minds.
(B) Epistemic theories refer to human knowledge and acquired information. They can be subdivided into the group revolving around deductive reasoning (B1) and the group revolving around the concept of credence (B2).
Let us recall the main traits of these conceptualizations, which will be used later.
(A) Von Mises, Reichenbach and others claim that it is possible to speak about probability only in reference to a collective [2]. The collective is an unlimited sequence of repeated events that satisfies the axioms of global regularity and local irregularity. From the frequentist perspective, the probability of a single case is nonsense. For example, the "probability of death", when it refers to a single person, has no meaning at all.
Popper is a philosopher who assumes the viewpoint of physicists, inasmuch as he was motivated by the desire to support quantum mechanics. He locates probability in the material world rather than in the human mind or in logical abstractions. Popper conceives of probability as a physical tendency of random events [3].
(B1) For Keynes, probability is a rational relation that links the hypothesis to the conclusion [4]. The inferential model refers to a judgment and not to a fact.
Keynes holds that if you flip a coin, the probability of the result heads depends on your cognition and on the information you possess.
Carnap understands probability as qualifying the logical relationship between two statements [5] and, precisely, as the degree of confirmation of a conclusion on the basis of evidence. For example, paleo-archeologists deduce the likely patterns of primitives' behaviors from discovered artifacts and human remains.
(B2) De Finetti and Ramsey are recognized as the founders of the subjective school. For them, probability is the degree of personal belief in the occurrence of a future event. Obviously, one must estimate probability by adopting coherent rules, and thus Ramsey [6] and De Finetti [7] devised the betting criterion. They fix probability in terms of the rates at which an individual wagers money on the future event. Probability would not necessarily be the same for another bettor with different knowledge, yet any bettor must follow a rule of coherence.
The Bayesians too regard probability as a reasonable expectation reflecting personal credence. Bayesian statistics centers on updating the probability of a hypothesis as more evidence and information become available. In particular, the prior distribution combined with the observed data produces the posterior probability distribution, which is the conditional distribution of an uncertain event given the data. In a subsequent stage, the posterior distribution can be used as the prior of further inquiries. Bayes' theorem is the cornerstone of this methodology.
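The updating cycle described above can be illustrated with a minimal sketch. The function name `bayes_update` and the numerical values (a prior of 1/2, likelihoods of 4/5 and 2/5) are assumptions chosen only for illustration; they do not come from the cited authors.

```python
from fractions import Fraction


def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | data) from P(H), P(data | H) and P(data | not H)."""
    # Total probability of the data under both hypotheses.
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    # Bayes' theorem.
    return prior * likelihood_if_true / evidence


# Hypothetical numbers: a neutral prior and data that are twice as
# likely when the hypothesis H holds.
prior = Fraction(1, 2)
first_posterior = bayes_update(prior, Fraction(4, 5), Fraction(2, 5))

# The posterior of the first stage serves as the prior of the second.
second_posterior = bayes_update(first_posterior, Fraction(4, 5), Fraction(2, 5))

print(first_posterior, second_posterior)
```

Exact fractions are used so that the two stages can be checked by hand against Bayes' theorem.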
The present paper begins with the analysis of the conflicting viewpoints mentioned above and has the goal of going beyond these discordant stances. This task turns out to be very demanding, and we confine ourselves to the prerequisites of a comprehensive probability theory.

What Is Chance?
Thinkers disagree about the concept P(X) and, what is even worse, they dissent about the precise object X qualified by P. That is to say, the different interpretations of probability constitute the surface question, which conceals another hidden and more fundamental problem [8]. The theories clash about what should be measured, as the authors put forward the following different specifications: for Laplace, P relates the favorable cases to the total cases; for von Mises, P assesses a long sequence of results; for De Finetti, P is a personal betting quotient made coherent by the Dutch book argument; for Ramsey, P is a personal betting quotient maximizing the expected utility; for Keynes, P is a degree of rational belief; for Savage, P is a personal credence; for Popper, P is a physical tendency; for Carnap, P is the empirical evidence given to a statement; for Kolmogorov, P qualifies a subset of elementary events treated as outcomes. The disparity of ideas is so great that it hinders the cooperation of researchers and professionals. The probability domain could have become a Tower of Babel had scholars not brought the various positions closer by assuming the random "event" and "result" as the generic arguments of probability. These terms have been progressively introduced as a tradeoff amongst the various schools of thought. They should serve as a wide umbrella that covers the different viewpoints listed above. However, the obstacles are not overcome but only displaced to the problems arising from the pair event/result [9]. The literature, for example, claims: "An event is a set of outcomes" [10] [11] and also: "An outcome is an elementary or atomic or simple event". These expressions show that probability theorists consider the words "event" and "outcome" as synonyms. Consequently, the two terms should signify equivalent objects and circumstances.
This jargon disagrees with scientific usage. Operational research, computer science, environmental science and other disciplines use the words "event" and "outcome" to denote two distinct entities. The first refers to a fact or a phenomenon, an occurrence of any complexity, while the second is the output or conclusion of the event. The latter denotes the end-product of the former. It may be said that the event E is the entire process while the outcome e is the ending part of E. The two nouns stand for entities that are connected but differ in nature and essence. For example, the following items are defined as events:
1) Flipping a coin.
2) Rolling a die.
3) Picking a card from a deck.
These events bring forth the following outcomes, in the same order:
1) Heads or tails.
2) A number in the interval from 1 to 6.
3) A card in diamonds, spades, hearts, or clubs.
We therefore reject the imprecise jargon of probabilists and adopt the precise terminology. In this paper "event" will always signify an occurrence, and "outcome" will consistently denote the result or output of the event.
Nevertheless, the problems do not end here.

What Exactly Does Probability Calculate?
Theorists normally refer probability to the random outcome e, treating it as a function of this kind

$$P = P(e) \qquad (1)$$

The next three sections verify whether this is true.

The Classical Equation
First, we check how and when (1) is consistent with the classical formula

$$P = \frac{Q}{N} \qquad (2)$$

where Q and N are the favorable and the grand total cases, respectively. Do the variables Q and N refer exclusively to the result?
1) The grand total N qualifies all the possible occurrences, and one reasonably concludes that the denominator regards the global event E and not e

$$N = N(E) \qquad (3)$$

2) The number of the favorable cases refers to the specific outcome e that pertains to E, hence the numerator is to be written this way

$$Q = Q(e) \qquad (4)$$

3) Equiprobability is the requisite of (2), and no one can assess whether the equiprobability condition is true or false on the basis of the single result. This requisite implies control of the overall phenomenon, and a mathematician must make the complete inventory of the experiment E.
Advances in Pure Mathematics
Putting (3) and (4) in (2) we obtain the complete argument of probability

$$P = \frac{Q(e)}{N(E)} = P(E, e) \qquad (5)$$

The object calculated by (2) is the overall event E emitting its proper outcome e. Equation (5) stands as the fundamental tool to obtain probability, while the summation and the multiplication rules supply probabilities computed from other probabilities. Equation (5) proves that (1) is not false but incomplete, and this remark agrees with Kolmogorov, who supposed the possible incompleteness of his theory [12].
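The reading of the classical formula given above can be sketched in code: the total N is counted on the complete inventory of the event E, while the favorable count Q is obtained from the specific outcome e. This is only an illustrative sketch; the names `classical_probability`, `die_roll` and `deck` are assumptions, and equiprobable cases are assumed throughout.

```python
from fractions import Fraction


def classical_probability(event_cases, is_favorable):
    """Classical ratio Q/N over the full inventory of an event.

    event_cases  -- every equally likely case of the event E
    is_favorable -- predicate describing the outcome e of interest
    """
    cases = list(event_cases)
    n_total = len(cases)  # N depends on the whole event E
    q_favorable = sum(1 for case in cases if is_favorable(case))  # Q depends on e
    return Fraction(q_favorable, n_total)


# Event: rolling a die. Outcome: an even face.
die_roll = range(1, 7)
p_even = classical_probability(die_roll, lambda face: face % 2 == 0)

# Event: picking a card from a 52-card deck. Outcome: a heart.
suits = ("diamonds", "spades", "hearts", "clubs")
deck = [(rank, suit) for suit in suits for rank in range(1, 14)]
p_heart = classical_probability(deck, lambda card: card[1] == "hearts")

print(p_even, p_heart)
```

The function cannot be called with the outcome alone: the inventory of E must be supplied, which mirrors the point that the outcome by itself does not determine the value of P.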

Paradoxes
The second stage of our analysis goes through some problems that seem paradoxical.
I) Joseph Bertrand posed the following query [13]: "Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at ran-dom. What is the probability that the chord is longer than a side of the triangle?" The problem can be solved by three methods that provide different values.
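A short Monte Carlo sketch can make the three diverging values concrete. It assumes the three classical ways of choosing the chord (two random endpoints, a random point on a random radius, a random midpoint in the disc), whose well-known answers are 1/3, 1/2 and 1/4; how these correspond to the events formalized in this paper is not stated here and is left open. All function names are illustrative.

```python
import math
import random

# Side of the equilateral triangle inscribed in the unit circle.
SIDE = math.sqrt(3.0)


def chord_from_endpoints(rng):
    """Chord between two points chosen uniformly on the circumference."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((a - b) / 2.0))


def chord_from_radius(rng):
    """Chord whose midpoint lies at a uniform distance along a radius."""
    d = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_from_midpoint(rng):
    """Chord whose midpoint is uniform inside the disc (rejection sampling)."""
    while True:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        r2 = x * x + y * y
        if r2 <= 1.0:
            return 2.0 * math.sqrt(1.0 - r2)


def estimate(method, trials=50_000, seed=0):
    """Fraction of sampled chords longer than the triangle side."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if method(rng) > SIDE)
    return hits / trials


for method in (chord_from_endpoints, chord_from_radius, chord_from_midpoint):
    print(method.__name__, round(estimate(method), 3))
```

The outcome "chord longer than the side" is the same in the three runs; only the process that generates the chord changes, and the estimated value changes with it.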
Past and contemporary authors (cf., for example, [14]) hold that the diverging solutions depend on the adopted "strategy" or, more precisely, that each value of P depends on the way the chord is placed inside the circle. All this is unacceptable if (1) is true, because the problem defines a precise outcome to calculate and thus one should obtain only one value. On the other hand, assuming (5) ...
In summary, definition (5) formally underpins the calculus of the proper probability values (6) and (7).
II) Martin Gardner put forward this problem [16]: "Mr. Smith has two children. Question #1: The older child is a girl. What is the probability that both children are girls? Question #2: At least one of them is a girl. What is the probability that both children are girls?" In order of age, the two children could be (G, G), (G, B), (B, G), (B, B).
P. Rocchi
In question #1, only the first two cases of the list are allowed: (G, G), (G, B). Assuming each case equally likely, the probability of both children being girls is 1/2 for question #1 and 1/3 for question #2, which admits the three cases containing at least one girl. The "boy and girl paradox" would not require demanding calculations were it not for the problem statement, which presents a confusing context. When most people are presented with the first question, they misinterpret it as being the second question. In addition, some details have been disregarded, e.g. are the children's genders equally likely? Are there twins? Could a child be gay?
Gardner's problem shows how probability depends on the event space: one obtains different answers if one changes the event space.
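The two answers to Gardner's problem can be checked by enumerating the event space explicitly. The sketch below assumes four equally likely ordered pairs (older child, younger child); the helper name `prob_both_girls` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Event space: ordered pairs (older child, younger child).
families = list(product("GB", repeat=2))


def prob_both_girls(condition):
    """P(both girls) restricted to the families allowed by `condition`."""
    allowed = [family for family in families if condition(family)]
    favorable = [family for family in allowed if family == ("G", "G")]
    return Fraction(len(favorable), len(allowed))


# Question #1: the older child is a girl.
q1 = prob_both_girls(lambda family: family[0] == "G")

# Question #2: at least one child is a girl.
q2 = prob_both_girls(lambda family: "G" in family)

print(q1, q2)
```

The target outcome (G, G) is identical in both questions; the difference in the result comes entirely from the set of allowed cases.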
The model (1) cannot explain the cases I) and II) since it omits the event E. It seems reasonable to conclude that a comprehensive theory has to include the determination of the event besides the final result.

Distributions
The probability distribution associates a value of P with each observable value of the random variable. The distribution function specifies the relative likelihoods of all possible outcomes and qualifies the global phenomenon E under observation. In the development of the distribution function f over the outcomes e of E, two classical conditions must be satisfied:
1) Probability must be nonnegative for each value of the random variable.
2) The sum of the probabilities must equal one.
Conditions 1) and 2) regard the global situation E and demonstrate how the probability distribution spreads through the entire domain of the variable; it gives an account of the intended phenomenon E. Two examples clarify this concept.
Example: A couple plans to have 3 children. Suppose the children's genders are equally likely. Then Table 1 exhibits the probability distribution of the genders.
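The distribution of this example can be enumerated directly. The sketch below assumes independent births and equally likely genders, lists the eight ordered combinations, and verifies the two classical conditions (nonnegativity and unit sum); the variable names are illustrative and the layout is not meant to reproduce Table 1.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# All ordered gender combinations for three children.
combinations = list(product("GB", repeat=3))

# Equally likely combinations: each receives probability 1/8.
p_combination = {combo: Fraction(1, len(combinations)) for combo in combinations}

# Derived distribution of the number of girls in the family.
p_girls = defaultdict(Fraction)
for combo, p in p_combination.items():
    p_girls[combo.count("G")] += p

# The two classical conditions on a probability distribution.
nonnegative = all(p >= 0 for p in p_combination.values())
total = sum(p_combination.values())

for girls in sorted(p_girls):
    print(girls, p_girls[girls])
print(nonnegative, total)
```

Both conditions are checked over the whole dictionary, that is, over the family as a whole and not over any single combination.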
Each value of P refers to a single combination, while the table describes the family with its potential children, which is the exact argument of the probability distribution.
Example: The gamma distribution is often adopted to fit the rain rate x of the weather W in a certain area

$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad x > 0$$

where α and β are the shape and scale parameters, and Γ is the usual gamma function. Daily precipitation rates over the entire terrestrial globe have been available for decades. Series of 3-monthly, 6-monthly, 12-monthly and 24-monthly averaged precipitation are built for any local geographical domain, and long-term recorded data are fitted to a probability distribution. Notably, the gamma distribution describes the meteorological situation W in accordance with (6).
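As a numerical illustration of the gamma model, the sketch below evaluates the standard gamma density with shape α and scale β and checks by midpoint integration that it sums to one over its domain. The parameter values (α = 2, β = 1.5) are hypothetical and are not fitted to any precipitation record; the function names are illustrative.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape `alpha` and scale `beta`, zero for x <= 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# Hypothetical parameters, chosen only for illustration.
alpha, beta = 2.0, 1.5

# The density is nonnegative and integrates (numerically) to one.
total = integrate(lambda x: gamma_pdf(x, alpha, beta), 0.0, 60.0)
print(round(total, 6))
```

The normalization check concerns the whole range of rain rates, which is the sense in which the distribution describes the overall situation W rather than a single reading x.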

Conclusion of the First Part
In conclusion, Sections 3.1, 3.2 and 3.3 demonstrate, in order, that:
• The classical formula calculates the random event with its proper result.
• A problem statement that does not properly specify the random event turns out to be ambiguous even when the expected outcome is clearly described.
• The probability distribution depicts the overall event through the spectrum of single results.
We reasonably conclude that (1) is not wrong, but incomplete. Definition (1) does not hamper the calculus of applications as long as practitioners are backed by verbal annotations describing the intended phenomenon. For example, it is sufficient to mention the specific game of chance, and professionals become able to calculate equation (2) even if mathematicians have provided incomplete definitions.
Theorists cannot play the same trick used by professionals. They cannot use verbal comments to override the limits of (1), since they progress through a rigorous inferential process from the primitive notions onward. If a premise is misleading or incomplete, it hampers the logical development of the theory. Definition (1) suffices to support numerical calculations but cannot underpin theoretical conclusions, since it prevents deductive reasoning.
It is important to underline that theorizing is not the same as calculating applications: the latter seeks mathematical-numerical results, while the former pursues mathematical-logical explanations through deduction. As an example, let us examine the case we discussed in Section 2.2. Definition (5) enables us to fix the events $E_1, E_2, E_3, E_4$ and to conclude that the Bertrand problem has four distinct solutions formalized by (6) and (7). Modern authors define probability by means of the partial argument (1) and recognize that the problem remains unsolved [17], since the four solutions contradict the assumption (1). Conventional theories fail and call I) and II) "paradoxes" because they are unable to give rational explanations.
We cannot extensively discuss how theories are much more demanding than the applied calculus from the logical viewpoint. This topic goes beyond the scope of the present paper, and we confine ourselves to mentioning the illuminating contributions of [18] [19] [20] besides the classical work of Kuhn [21].
In conclusion, the argument (E, e) makes the concept of probability thoroughly explicit. In point of logic, the concept of event (with its proper outcome) comes first and the notion of probability second; hence the former is the prerequisite of theoretical inquiries, on which the present work focuses.

The Event Is the Primitive Notion
The generic concept of event turns out to be self-explanatory, and this fits the rule that any mathematical theory must be grounded on intuitive notions. Dictionaries exhibit this quality, as in the following entry.
Definition 1: An event is something that happens or might happen. (11)
[source: Vocabulary.com]
The event is an occurrence of any kind: material or mental, simple or intricate, placed in the past, today or in the future, made by people or by inanimate elements, etc. The reader could object that this spontaneous idea offers only a generic view. Indeed, intuitive verbal expressions often lack the precision necessary to set up a scientific construction. Hence, we start with the self-explanatory entry (11) and conduct a conceptual analysis, which will yield the formal model suitable for the probabilistic perspective. In this manner, the description of the object qualified by the probability P will be thoroughly developed.
Expression (11) presents an event as something that takes place and will pass, no matter whether it is regular or occasional, speedy or slow, certain or impossible, etc. An event occurs on a time scale and centers on the state of some system, practical or abstract. Usually events entail a passage marked by a starting point and a conclusion. An event begins with an antecedent and closes with a consequent, which is the result or outcome. The pair antecedent/consequent can be the input/output or the initial/final state of something or somebody, etc. The change placed between the first term and the final term is the core of the event and is usually called process, operation and so forth. In accordance with the entry (11), we conclude that an event is a dynamical occurrence equipped with three principal components:
1) The antecedent,
2) The consequent (result or outcome),
3) The process which relates the first to the second.
The elements 1), 2) and 3) govern the understanding of the event from a viewpoint that conforms to the probability logic focusing on the outcomes. In addition, they fit a broad assortment of studies conducted in operational research, management science, cybernetics, electronics, computer science, etc. We mention the input-process-output (IPO) paradigm introduced in electrical engineering with the Mealy and Moore models around the mid-fifties. The IPO scheme next migrated into software engineering and later expanded into an assortment of contexts including psychology [22], education [23], industry [24], computer science [25], biology, environmental science, etc.
The elements 1), 2) and 3) suggest adopting a triad or triadic structure to formalize the event in general [26]. A triad is not a triple since a triple is any set with three elements, while a triad is a system of three connected elements or components. To elaborate the mathematical model of the event, we introduce the mathematical structure called fundamental triad or named set.
Definition 2: The basic fundamental triad (or basic named set) $\mathbf{X}$ is:

$$\mathbf{X} = (X, f, N) \qquad (12)$$

where:
• X is the support of $\mathbf{X}$, denoted by $S(\mathbf{X})$,
• N is the component of names (reflector) or set of names of $\mathbf{X}$, denoted by $N(\mathbf{X})$,
• f is the naming correspondence of $\mathbf{X}$.
An example is a set of people X, when N is a set of their names and f is the correspondence connecting people and their names. Any ordinary set is also a special case of named sets, namely, it is a single-named set in which all elements have the same name. Figure 1 visualizes the basic fundamental triad.
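The people-and-names example can be written down as a small data structure. This is only a sketch of one possible encoding, not Burgin's formal apparatus: the class `NamedSet`, the helper `single_named` and the sample identifiers are all hypothetical, and the naming correspondence is represented as a set of (element, name) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedSet:
    """A triad (X, f, N): support, naming correspondence, set of names."""
    support: frozenset   # X
    naming: frozenset    # f, stored as (element, name) pairs
    names: frozenset     # N

    def __post_init__(self):
        # The correspondence may only connect support elements to names.
        for element, name in self.naming:
            if element not in self.support or name not in self.names:
                raise ValueError(f"invalid pair: {(element, name)!r}")

    def names_of(self, element):
        """All names that f assigns to `element`."""
        return {name for x, name in self.naming if x == element}


def single_named(elements, name="element"):
    """An ordinary set seen as a named set whose elements share one name."""
    support = frozenset(elements)
    naming = frozenset((x, name) for x in support)
    return NamedSet(support, naming, frozenset({name}))


# Hypothetical people (identified by badge numbers) and their names.
people = NamedSet(
    support=frozenset({101, 102, 103}),
    naming=frozenset({(101, "Ann"), (102, "Bob"), (103, "Ann")}),
    names=frozenset({"Ann", "Bob"}),
)
plain = single_named({1, 2, 3})

print(people.names_of(103), plain.names_of(2))
```

Storing f as pairs rather than as a function allows an element to carry several names, which keeps the sketch closer to a correspondence than to a mapping.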
Many structures in mathematics are special cases of named sets [27], such as functions, binary relations, graphs and hypergraphs, homomorphisms, operators, vectors, tensor fields, homeomorphisms, fuzzy sets and multisets, morphisms and functors in categories [28], and Boolean valued sets [29]. Mark Burgin started his inquiries in 1982 and developed Named Set Theory as the unified foundation for mathematics [30], with applications in numerous areas [26].

Use of the Event Triad
The basic fundamental triad complies with the remarks of Sections 2 and 3, and we use E to represent the event in formal terms

$$E = (i, p, e) \qquad (13)$$

The event triad E is consistent with (1) because e is a subset in (13). The event triad fits the structural analysis developed in Section 3.1: the intermediate component p (point 3) processes the initial state or conditions i (point 1) into the result or outcome e (point 2). Intuitively, the component p links i with e, and the event E comes into being; that is, the components detail the dynamical nature of the event, which we have described in words in Section 3.
The intuitive definition (11) holds that the event has the property of occurring. In accordance with the literature, we establish that probability is the parameter that qualifies this property:

$$P = P(E) \qquad (14)$$

Expression (14) asserts that probability calculates the overall occurrence together with its result e; it complies with (5) and goes beyond the simplified definition (1). Probability is a normalized quantity

$$0 \le P(E) \le 1 \qquad (15)$$

Using Definition 3, we conclude that, once the preliminary conditions i are established, E can occur always, never or randomly. For example, when p "systematically" connects the antecedent with the result, we have the certain event, whose probability is one. Take this case: "An urn contains 10 red marbles; a marble is drawn at random and is red". The triad model shows the antecedent consisting of the urn with the red marbles and the drawing process that systematically brings forth the result, so the probability equals one

E = (Urn with 10 red marbles, Extraction, One red marble)

A syllogism is a kind of logical deduction that starts with two or more propositions assumed to be true and arrives at a certain conclusion. The triad makes explicit the epistemic event, as in the following case

E = ("All men are mortal" and "Greeks are men", Logical deduction, "All Greeks are mortal")

If the elements i and p do not surely establish the result, the event E is aleatory. For example: "An urn contains five red marbles and five black; a marble is drawn at random and is red"

E = (Urn with 5 red marbles and 5 black marbles, Extraction, One red marble)

The extraction does not systematically supply the output; the event is random and the probability is a decimal value between zero and one.
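The two urn examples above can be encoded as triads and evaluated mechanically. The sketch is a deliberately simple reading of the triad (antecedent, process, outcome): the class `EventTriad` and the functions `probability` and `kind` are hypothetical names, the process is kept as a plain label, and every marble is assumed equally likely to be drawn.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Tuple


@dataclass(frozen=True)
class EventTriad:
    antecedent: Tuple[str, ...]      # i: contents of the urn
    process: str                     # p: label of the connecting process
    outcome: Callable[[str], bool]   # e: test satisfied by the result


def probability(event):
    """P(E) for a single equiprobable extraction from the antecedent."""
    favorable = sum(1 for marble in event.antecedent if event.outcome(marble))
    return Fraction(favorable, len(event.antecedent))


def kind(event):
    """Classify the event as certain, impossible or random."""
    p = probability(event)
    if p == 1:
        return "certain"
    if p == 0:
        return "impossible"
    return "random"


def is_red(marble):
    return marble == "red"


all_red = EventTriad(("red",) * 10, "extraction", is_red)
mixed = EventTriad(("red",) * 5 + ("black",) * 5, "extraction", is_red)

print(probability(all_red), kind(all_red))
print(probability(mixed), kind(mixed))
```

The outcome test `is_red` is the same in both triads; the classification changes only because the antecedent changes, which is the point the two urn examples are meant to convey.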
The graph associated with (13) can be used in a natural manner to visualize the dynamic essence of the event. Conventional theories focus on the outcome e, which is a subset; in turn, set theory underpins Venn diagrams and not graphs. Authors often adopt graphs on an intuitive basis for reasons of convenience. Here the graph of the event triad (Figure 2) perfectly fits the theory and does not show incongruities. As an example, take the case where two subsequent extractions of black and white balls occur from two urns (Figure 3).
Let us examine the triadic model in relation to the interpretations of probability mentioned in Section 1.
(A) ... where $nE$ is a set of physical and repeated events called the collective.
(B1) The basic fundamental triad offers concise insights into epistemic theories. It makes explicit the thought of Keynes, who holds that probability qualifies the strength of a logical relationship. The rational inference p links the hypothesis i to the conclusion (Figure 4), and this process makes the mental event happen. In principle, the stronger the rational inference, the higher $P(E_R)$.
(B2) Subjectivists and Bayesians hold that probability is the degree of personal belief in the content expressed by one or more propositions. The individual acquires a certain amount of information, and the "belief" is the mental process which places trust in the final sentence (Figure 6). The acceptance of the sentence may be more or less strong; correspondingly, the probability may be more or less high.
The triad $E_B$ formalizes the epistemic event calculated with the Bayesian methods.
Expressions (16)-(22) prove that the triadic model is able to cover probabilities exhibiting far different features; the structure E offers a unifying perspective and paves the way toward a comprehensive theory of probability. The model E enables a consistent view, whereas conventional constructions, grounded on the result e, set the various probability models in opposition, so the foundation of probability is still an unresolved issue.
Here we confine ourselves to presenting E as the thorough model of the object qualified by P(E). We highlight that expressions (16)-(22) do not exhaust the study of the probabilities, which have different properties and require further insights, especially the Bayesian probability. The target of the present paper is the analysis of the prerequisites, and further discussions about probability theory go beyond our objectives.

Conclusions
This research advocates a more explicit notation system for probability and points out how probability can be ambiguous without rigorous specification of the sample space and the experiment in general. In fact, the literature shows that theorists disagree about what probability qualifies, besides holding diverging interpretations of P. Eminent authors present nearly a dozen different ideas. If probabilists do not fix what they are calculating, the entire sector risks becoming a Tower of Babel; thus researchers have adopted the notion of event/result as a kind of compromise which the various schools share. We have shown that the pair event/result leads to non-trivial inaccuracies and raises significant objections and paradoxical conclusions. The prerequisites of conventional theories are wanting.
The present paper suggests the use of the fundamental triad as the mathematical model of the event in place of the result formalized by a subset. We have shown that E provides the formal illustration of the concepts presented by von Mises, Keynes, Carnap and others. The fundamental triad is open to the probabilities pertaining to the various schools (logical, subjective, frequentist, etc.); that is, it integrates different perspectives that are usually regarded as impossible to reconcile. This unifying result can help mathematicians go beyond the present theoretical deadlock. This paper focuses on the prerequisites of probability theory but does not go through the intrinsic properties of $P(E_B)$, $P(E_L)$, $P(nE)$, etc., which exceed the scope of the present work.

Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.