**Objective:** Epstein-Barr virus (EBV), a herpes virus which persists in memory B cells of the peripheral blood for the lifetime of a person, has been implicated in several malignancies. Hodgkin’s lymphoma (HL) has long been suspected to have an Epstein-Barr virus infection as a causal agent. Several recent studies identified an EBV latent infection in a high proportion of Hodgkin’s lymphoma cases. However, despite intensive study, the role of Epstein-Barr virus infection in Hodgkin’s lymphoma remains enigmatic.
**Methods:** To explore the cause-effect relationship between EBV and HL, and thus to understand the role of EBV in the etiology of HL more clearly, a systematic review and re-analysis of published studies was performed. The method of the conditio per quam relationship was used to test the hypothesis: if Epstein-Barr virus infection (EBV DNA) is present in human lymph nodes, then Hodgkin’s lymphoma is present too. The mathematical formula of the causal relationship k was used to test the hypothesis of whether there is a cause-effect relationship between an Epstein-Barr virus infection (EBV DNA) and Hodgkin’s lymphoma. Significance was indicated by a p-value of less than 0.05.
**Result:** The data analyzed support the null hypothesis that if Epstein-Barr virus infection (EBV DNA) is present in human lymph nodes, then Hodgkin’s lymphoma is present too. In the same respect, the studies analyzed provide highly significant evidence that Epstein-Barr virus is the cause of Hodgkin’s lymphoma.
**Conclusion:** The findings of this study suggest that Epstein-Barr virus is the cause of Hodgkin’s lymphoma, despite the complexity of Hodgkin’s disease.

In 1964, Epstein [

Hodgkin’s lymphoma (HL) is a deadly disease too. Identifying the cause of Hodgkin’s lymphoma has the potential to save many lives.

Dinand et al. [

Novel and modern laboratory techniques [

| | | Hodgkin’s lymphoma | | Total |
|---|---|---|---|---|
| | | yes | no | |
| EBV DNA (ISH) | yes | 126 | 0 | 126 |
| | no | 9 | 25 | 34 |
| Total | | 135 | 25 | 160 |

in situ hybridization (ISH), fluorescence in situ hybridization (FISH), RNA in situ hybridization (RNA ISH), polymerase chain reaction (PCR), nested PCR and quantitative polymerase chain reaction (qPCR) have changed our understanding of the pathogenesis of cancer development. Immunohistochemistry (IHC), introduced by Coons [

Veronique Dinand et al. [

All statistical analyses were performed with Microsoft Excel version 14.0.7166.5000 (32-Bit) software (Microsoft GmbH, Munich, Germany).

Among some discrete distributions like the hypergeometric distribution, the

| | | Hodgkin’s lymphoma | | Total |
|---|---|---|---|---|
| | | yes | no | |
| EBV DNA | yes | 19 | 0 | 19 |
| | no | 11 | 70 | 81 |
| Total | | 30 | 70 | 100 |

Poisson distribution et cetera, the binomial distribution is of special interest. Sometimes, the binomial distribution is called the Bernoulli distribution in honor of the Swiss mathematician Jakob Bernoulli (1654 - 1705), who derived it. Bernoulli trials are an essential part of the Bernoulli distribution. Thus far, let us assume two fair coins, named _{0}W_{t} and _{R}U_{t}. In our model, heads of such a coin is considered as success T (i.e. true) and labeled as +1, while tails may be considered as failure F (i.e. false) and is labeled as +0. Such a coin is called a Bernoulli-Boole coin. The probability of success of _{R}U_{t} at one single Bernoulli trial t is denoted as

p(_{R}U_{t} = +1) ≡ p(_{R}U_{t}) (1)

The probability of failure of _{R}U_{t} at one single Bernoulli trial t is denoted as

p(_{R}U_{t} = +0) ≡ 1 − p(_{R}U_{t}) (2)

Furthermore, no matter how many times an experiment is repeated, let the probability of a head or a tail remain the same. The trials are independent, which implies that no matter how many times an experiment is repeated, the probability of a single event at a single trial remains the same. Repeated independent trials which are determined by the characteristic that there are always only two possible outcomes, either +1 or +0, and that the probability of an event (outcome) remains the same at each single trial, are called Bernoulli trials. The definition of Bernoulli trials provides a theoretical model which is of further use. However, in many practical applications, we may be confronted with circumstances which may be considered as approximately satisfying Bernoulli trials. Thus far, let us perform an experiment of tossing two fair coins simultaneously. Suppose the two fair coins are tossed once. Then there are 2^{2} = 4 possible outcomes (the sample space), which may be shown as

([_{R}U_{t} = +1], [_{0}W_{t} = +1]), ([_{R}U_{t} = +1], [_{0}W_{t} = +0]), ([_{R}U_{t} = +0], [_{0}W_{t} = +1]), ([_{R}U_{t} = +0], [_{0}W_{t} = +0])

This may also be shown as a 2-dimensional sample space in the form of a contingency table (

In the following, the contingency table is defined more precisely (

In general it is (a + c) = _{0}W_{t}, (a + b) = _{R}U_{t}, (c + d) = N − _{R}U_{t} and (b + d) = N − _{0}W_{t}, with a + b + c + d = N = _{R}W_{t}. Equally, it is _{0}W_{t} + (N − _{0}W_{t}) = _{R}U_{t} + (N − _{R}U_{t}) = _{R}W_{t} = N. Thus far, if one fair coin is tossed n times, we have n repeated Bernoulli trials

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | ([_{R}U_{t} = +1], [_{0}W_{t} = +1]) | ([_{R}U_{t} = +1], [_{0}W_{t} = +0]) | _{R}U_{t} |
| | No = +0 | ([_{R}U_{t} = +0], [_{0}W_{t} = +1]) | ([_{R}U_{t} = +0], [_{0}W_{t} = +0]) | N − _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N = _{R}W_{t} |

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | a | b | _{R}U_{t} |
| | No = +0 | c | d | N − _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N = _{R}W_{t} |

and an n dimensional sample space with 2^{n} sample points is generated. In general, when given n Bernoulli trials with k successes, the probability to obtain exactly k successes in n Bernoulli trials is given by

p(k) = C(n, k) × p(_{R}U_{t} = +1)^{k} × (1 − p(_{R}U_{t} = +1))^{n−k} (3)

where C(n, k) = n!/(k! × (n − k)!) denotes the binomial coefficient.

The random variable k is sometimes called a binomial variable. The probability to obtain k events or more (at least k events) in n trials is calculated as

p(k ≥ X) = p(k = X) + p(k > X) = Σ_{k=X}^{k=n} (C(n, k) × p(_{R}U_{t} = +1)^{k} × (1 − p(_{R}U_{t} = +1))^{n−k}) (4)

The probability to obtain less than k events in n Bernoulli trials is calculated as

p(k < X) = 1 − p(k ≥ X) = 1 − Σ_{k=X}^{k=n} (C(n, k) × p(_{R}U_{t} = +1)^{k} × (1 − p(_{R}U_{t} = +1))^{n−k}) (5)
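Equations 3-5 translate directly into a few lines of Python; the following sketch (helper names are mine, not from the paper) can be used to check binomial probabilities numerically:

```python
from math import comb

def binom_pmf(k, n, p):
    # Equation 3: probability of exactly k successes in n Bernoulli trials.
    return comb(n, k) * (p ** k) * ((1 - p) ** (n - k))

def binom_at_least(x, n, p):
    # Equation 4: probability of at least x successes in n trials.
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def binom_less_than(x, n, p):
    # Equation 5: probability of fewer than x successes in n trials.
    return 1 - binom_at_least(x, n, p)
```

For two tosses of a fair coin (n = 2, p = 1/2) this reproduces the four-point sample space described above: p(k = 1) = 0.5 and p(k ≥ 1) = 0.75.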

The formula of the conditio per quam [

p(EBV DNA → Hodgkin’s lymphoma) ≡ (a + c + d)/N (6)

and used to test the hypothesis: if presence of an EBV infection (EBV DNA), then presence of Hodgkin’s lymphoma.

The formula of the conditio sine qua non [

p(EBV DNA ← Hodgkin’s lymphoma) ≡ (a + b + d)/N (7)

and used to test the hypothesis: without presence of an EBV infection (EBV DNA), no presence of Hodgkin’s lymphoma.

The necessary and sufficient condition relationship was defined [

p(EBV DNA ↔ Hodgkin’s lymphoma) ≡ (a + d)/N (8)
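Equations 6-8 are plain cell proportions of the 2 × 2 table; a minimal Python sketch (function names are mine) makes the arithmetic explicit:

```python
def conditio_per_quam(a, b, c, d):
    # Equation 6: p(X -> Y) = (a + c + d) / N; equals 1 exactly when b = 0.
    return (a + c + d) / (a + b + c + d)

def conditio_sine_qua_non(a, b, c, d):
    # Equation 7: p(X <- Y) = (a + b + d) / N; equals 1 exactly when c = 0.
    return (a + b + d) / (a + b + c + d)

def necessary_and_sufficient(a, b, c, d):
    # Equation 8: p(X <-> Y) = (a + d) / N; equals 1 when b = 0 and c = 0.
    return (a + d) / (a + b + c + d)
```

For the first 2 × 2 table above (a = 126, b = 0, c = 9, d = 25), the conditio per quam proportion is (126 + 9 + 25)/160 = 1.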

Scholium.

Historically, the notion of a sufficient condition has been known for thousands of years. Many authors attribute the original contributions to the notion of material implication solely to Diodorus Cronus. Still, Philo the Logician (~300 BC), a member of a group of early Hellenistic philosophers (the Dialectical school), is the main forerunner of the notion of material implication and has made some groundbreaking contributions [

In contrast to such a point of view, the opposite point of view is correct too. Thus far, there is a straightforward way to give a precise and comprehensive account of the meaning of the terms necessary and sufficient condition themselves. In other words, if fire is present then oxygen is present too.

| | | Fire | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Oxygen | Yes = +1 | a | b | _{R}U_{t} |
| | No = +0 | 0 | d | N − _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N = _{R}W_{t} |

| | | Oxygen | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Fire | Yes = +1 | a | 0 | _{R}U_{t} |
| | No = +0 | c | d | N − _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N = _{R}W_{t} |

Especially, necessary and sufficient conditions are converses of each other. Still, fire is not the cause of oxygen, and vice versa: oxygen is not the cause of fire. In the example before, oxygen is a necessary condition, a conditio sine qua non, of fire. A necessary condition is sometimes also called “an essential condition” or a conditio sine qua non. In propositional logic, a necessary condition, a conditio sine qua non, is generally symbolized as “p ← q”, or in spoken language “without p no q”. Both q and p are statements, with p the antecedent and q the consequent. To show that p is not a necessary condition for q, it is necessary to find an event or circumstances where q is present (i.e. an illness) but p (i.e. a risk factor) is not. On any view, (classical) logic has as one of its goals to characterize the most basic, the most simple and the most general laws of objective reality. Especially, in classical logic, the notions of necessary conditions, of sufficient conditions, of necessary and sufficient conditions et cetera are defined very precisely for a single event, for a single Bernoulli trial t. In point of fact, no matter how many times an experiment is repeated, the relationship of the conditio sine qua non or of the conditio per quam, which is defined for every single event, will remain the same. Under conditions of independent trials this implies that no matter how many times an experiment is repeated, the probability of the conditio sine qua non or of the conditio per quam of a single event at a single trial t remains the same, which transfers the relationship of the conditio sine qua non or of the conditio per quam et cetera into the sphere of (bio-)statistics. Consequently, (bio-)statistics generalizes the notions of a sufficient or of a necessary condition from one single Bernoulli trial to N Bernoulli trials.
However, in many practical applications, we may be confronted with circumstances which may be considered as approximately satisfying the notions of a sufficient or of a necessary condition. Thus far, under these circumstances, we will need to perform some tests to investigate whether we can rely on our conclusions.

Many times, for some reason or other it is not possible to study exhaustively a whole population. Still, sometimes it is possible to draw a sample from such a population which itself can be studied in detail and used to convince us about the properties of the population. Roughly speaking, statistical inference derived from a randomly selected subset of a population (a sample) can lead to erroneous results. The question raised is how to deal with the uncertainty inherent in such results? The concept of confidence intervals, closely related to statistical significance testing, was formulated to provide an answer to this problem.

Confidence intervals, introduced to statistics by Jerzy Neyman in a paper published in 1937 [

p_{Crit} = p_{Calc} ± (z_{α/2} × √((1/N) × p_{Calc} × (1 − p_{Calc}))) (9)

where p_{Calc} is the sample proportion of successes in a Bernoulli trial process with N trials yielding X successes and N − X failures, and z_{α/2} is the 1 − (α/2) quantile of a standard normal distribution corresponding to the significance level α. For example, for a 95% confidence level, α = 0.05 and z = 1.96. A very common technique for calculating binomial confidence intervals was published by Clopper and Pearson [
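Equation 9 is the usual normal-approximation interval; a short sketch (helper name mine), assuming the stated z = 1.96 for a 95% confidence level:

```python
from math import sqrt

def wald_interval(x, n, z=1.96):
    # Equation 9: p_Crit = p_Calc +/- z * sqrt(p_Calc * (1 - p_Calc) / n),
    # where p_Calc = x / n is the sample proportion of successes.
    p_calc = x / n
    half_width = z * sqrt(p_calc * (1 - p_calc) / n)
    return p_calc - half_width, p_calc + half_width
```

For x = 50 successes in n = 100 trials the interval is 0.5 ± 1.96 × 0.05, i.e. (0.402, 0.598).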

Furthermore, an approximate and conservative (one-sided) confidence interval was developed by Louis [. Let π_{U} denote the upper limit of the one-sided 100 × (1 − α)% confidence interval for the unknown proportion when in N independent trials no events occur [. π_{U} is the value such that

π_{U} = (−ln(α)/n) ≈ (3/n) (10)

assuming that α = 0.05. In other words, a one-sided approximate upper 95% confidence bound for the true binomial population proportion π, the rate of occurrences in the population, based on a sample of size n where no successes are observed (p = 0), is 3/n [

Under conditions where a certain event did not occur [, the rule of three thus provides an approximate 95% confidence interval for the binomial parameter, the rate of occurrences in the population.

Another special case of the binomial distribution is based on a sample of size n where only successes are observed (p = 1). Accordingly, the lower limit of a one-sided 100 × (1 − α)% confidence interval for a binomial probability π_{L}, the rate of occurrences in the population, based on a sample of size n where only successes are observed is given approximately by [(1− (−ln(α)/n)) < π < +1] or (assuming α = 0.05)

π_{L} = 1 − (−ln(α)/n) ≈ 1 − (3/n) (11)

To construct a two-sided 100 × (1 − (α))% interval according to the rule of three, it is necessary to take a one-sided 100 × (1 − (α/2))% confidence interval. In this study, we will use the rule of three [

[Figure: the one-sided upper bound π_{U} = −ln(α)/n for a sample where p = 0, and the one-sided lower bound π_{L} = 1 − (−ln(α)/n) for a sample where p = 1, each as a function of the sample size n.]
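The rule-of-three bounds of Equations 10 and 11 can be checked with a few lines of Python (assuming α = 0.05; function names are mine):

```python
from math import log

def upper_bound_no_successes(n, alpha=0.05):
    # Equation 10: one-sided upper bound pi_U when 0 of n successes are observed.
    return -log(alpha) / n

def lower_bound_only_successes(n, alpha=0.05):
    # Equation 11: one-sided lower bound pi_L when n of n successes are observed.
    return 1 - (-log(alpha) / n)
```

For n = 160, the lower bound is 1 − (−ln(0.05)/160) = 0.981276673, which is the critical value used in the conditio per quam proofs below.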

A test statistic of independent and more or less normally distributed data which follows a chi-squared distribution is valid, as with many statistical tests, due to the central limit theorem. Especially with large samples, a chi-squared distribution can be used. A sample is considered large when the sample size n is 30 or more. With a small sample (n < 30), the central limit theorem does not apply and erroneous results could potentially be obtained from the few observations. Thus far, when the number of observations obtained from a population is too small, a more appropriate test for the analysis of categorical data, i.e. contingency tables, is R. A. Fisher’s exact test [

The hypergeometric distribution, illustrated in a table (, describes the probability of obtaining exactly a successes in _{0}W_{t} draws, without replacement, from a finite population of the size N which contains exactly _{R}U_{t} objects with a certain feature, while each event is either a success or a failure. The formula for the hypergeometric distribution, a discrete probability distribution, is

p(a) = (C(_{R}U_{t}, a) × C(N − _{R}U_{t}, _{0}W_{t} − a)) / C(N, _{0}W_{t}) (12)

The hypergeometric distribution has a wide range of applications and can be approximated by a binomial distribution. The elements of the population being sampled are classified into one of two mutually exclusive categories: either a conditio sine qua non or no conditio sine qua non relationship. We are sampling without replacement from a finite population. How probable is it to draw a specific number of events/successes out of _{0}W_{t} total draws from an aforementioned population of the size N? The hypergeometric distribution, as shown in a table (, equally describes the probability of drawing c = (_{0}W_{t} − a) events out of N events.
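Equation 12 maps directly onto Python's `math.comb`; the sketch below (function and argument names are mine) computes the hypergeometric probability of the cell a:

```python
from math import comb

def hypergeom_pmf(a, n_population, r_marked, n_draws):
    # Equation 12: probability of exactly a marked objects in n_draws draws
    # without replacement from n_population objects, r_marked of them marked.
    return (comb(r_marked, a) * comb(n_population - r_marked, n_draws - a)
            / comb(n_population, n_draws))
```

Summing over all feasible a always yields 1, as for any probability distribution.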

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | a | b = (_{R}U_{t} − a) | _{R}U_{t} |
| | No = +0 | c = (_{0}W_{t} − a) | N − _{R}U_{t} − _{0}W_{t} + a | N − _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N |

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| No Condition | Yes = +1 | c = (_{0}W_{t} − a) | N − _{R}U_{t} − _{0}W_{t} + a | N − _{R}U_{t} |
| | No = +0 | a | b = (_{R}U_{t} − a) | _{R}U_{t} |
| Total | | _{0}W_{t} | N − _{0}W_{t} | N |

A statistical hypothesis test is a method to extract some inferences from data. A null hypothesis is compared against an alternative hypothesis: under which conditions do the outcomes of a study lead to a rejection of the null hypothesis for a pre-specified level of significance? According to the rules of a proof by contradiction, a null hypothesis (H_{0}) is a statement which one seeks to disprove. The related specific alternative hypothesis (H_{A}) is opposed to the null hypothesis such that if the null hypothesis (H_{0}) is true, the alternative hypothesis (H_{A}) is false and vice versa. If the alternative hypothesis (H_{A}) is true, then the null hypothesis (H_{0}) is false. In principle, a null hypothesis that is true can be rejected (type I error), which leads us to falsely infer the existence of something which is not given. The significance level, also denoted as α (alpha), is the probability of rejecting a null hypothesis when the same is true. A type II error is given if we falsely infer the absence of something which in reality is given. A null hypothesis can be false, but a statistical test may fail to reject such a false null hypothesis. The probability of accepting a null hypothesis when the same is false (type II error) is denoted by the Greek letter β (beta) and related to the power of a test (which equals 1 − β). The power of a test indicates the probability by which the test correctly rejects the null hypothesis (H_{0}) when a specific alternative hypothesis (H_{A}) is true. Most investigators assess the power of a test using 1 − β = 0.80 as a standard for adequacy. The tabularized relation between truth/falseness of the null hypothesis and the outcomes of the test is shown precisely within a table (

In general, it is 1 − α + α = 1 or (1 − α − β) + α = 1− β.

The mathematical formula of the causal relationship k [

k(_{R}U_{t}, _{0}W_{t}) ≡ ((N × a) − (_{R}U_{t} × _{0}W_{t})) / √((_{R}U_{t} × (N − _{R}U_{t})) × (_{0}W_{t} × (N − _{0}W_{t}))) (13)

and the chi-square distribution [

| | | Null Hypothesis (H_{0}) is | | Total |
|---|---|---|---|---|
| | | True | False | |
| Null Hypothesis (H_{0}) | Accepted | 1 − α | β | 1 − α + β |
| | Rejected | α | 1 − β | 1 + α − β |
| Total | | 1 | 1 | 2 |

ties a cause and its own effect together? Is there a necessary connection between a cause and its effect at all? Theoretically, it is neither justified nor necessary to reduce causation as such to an act of observation or measurement. Still, case-control studies, experiments, observations et cetera can help us to recognize cause-effect relationships. In this context it is necessary to stress that every single event (effect) has its own cause, which is the logical foundation of the mathematical formula of the causal relationship k. It is therefore entirely clear that this is the fundamental difference to Pearson’s methodological approach. Obviously, although under some certain specified circumstances Pearson’s product-moment correlation coefficient [
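Equation 13 reduces to simple arithmetic on the four cells of a 2 × 2 table. The sketch below (helper name mine) uses the marginal identities _{R}U_{t} = a + b and _{0}W_{t} = a + c stated earlier:

```python
from math import sqrt

def causal_k(a, b, c, d):
    # Equation 13: k = (N*a - RUt*0Wt) / sqrt(RUt*(N - RUt) * 0Wt*(N - 0Wt)).
    n = a + b + c + d
    r_ut = a + b   # row total of the condition
    w_0t = a + c   # column total of the conditioned
    return (n * a - r_ut * w_0t) / sqrt(r_ut * (n - r_ut) * w_0t * (n - w_0t))
```

Applied to the two 2 × 2 tables above, this reproduces the values k = +0.82841687 and k = +0.739814235 derived in the proofs below.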

The chi-squared distribution [

A chi-square goodness of fit test can be applied to determine whether sample data are consistent with a hypothesized distribution. The chi-square goodness of fit test is appropriate when certain conditions are met. A few of these conditions are simple random sampling, categorical variables and an expected value of the number of sample observations of at least 5. The null hypothesis (H_{0}) and its alternative hypothesis (H_{A}) are stated in such a way that they are mutually exclusive. In point of fact, if the null hypothesis (H_{0}) is true, the alternative hypothesis (H_{A}) must be false, and vice versa. For a chi-square goodness of fit test, the hypotheses can take the following form.

The chi square distribution:

| p-Value | One sided X^{2} | Two sided X^{2} |
|---|---|---|
| 0.1000000000 | 1.642374415 | 2.705543454 |
| 0.0500000000 | 2.705543454 | 3.841458821 |
| 0.0400000000 | 3.06490172 | 4.217884588 |
| 0.0300000000 | 3.537384596 | 4.709292247 |
| 0.0200000000 | 4.217884588 | 5.411894431 |
| 0.0100000000 | 5.411894431 | 6.634896601 |
| 0.0010000000 | 9.549535706 | 10.82756617 |
| 0.0001000000 | 13.83108362 | 15.13670523 |
| 0.0000100000 | 18.18929348 | 19.51142096 |
| 0.0000010000 | 22.59504266 | 23.92812698 |
| 0.0000001000 | 27.03311129 | 28.37398736 |
| 0.0000000100 | 31.49455797 | 32.84125335 |
| 0.0000000010 | 35.97368894 | 37.32489311 |
| 0.0000000001 | 40.46665791 | 41.82145620 |

H_{0}: The sample distribution agrees with the hypothetical (theoretical) distribution.

H_{A}: The sample distribution does not agree with the hypothetical (theoretical) distribution.

The X^{2} Goodness-of-Fit Test can be shown schematically as

χ^{2} ≡ Σ_{t=+1}^{t=+N} ((Observed_{t} − Expected_{t})^{2} / Expected_{t}) (14)

The degrees of freedom are calculated as N − 1. If there is no discrepancy between an observed and a theoretical distribution, then X^{2} = 0. As the discrepancy between an observed and a theoretical distribution becomes larger, the X^{2} becomes larger. These X^{2} values are evaluated against the known X^{2} distribution.

The original X^{2} values are calculated from a theoretical distribution, which is continuous, whereas the approximation by the X^{2} goodness of fit test we are using is discrete. Thus far, there is a tendency to underestimate the probability, which means that the number of rejections of the null hypothesis can increase too much and must be corrected downward. Such an adjustment (Yates’s correction for continuity) is used only when there is one degree of freedom. When there is more than one degree of freedom, the adjustment is not used. Applying this to the formula above, we find the X^{2} Goodness-of-Fit Test with continuity correction shown schematically as

χ^{2} ≡ Σ_{t=+1}^{t=+N} ((|Observed_{t} − Expected_{t}| − (1/2))^{2} / Expected_{t}) (15)

When the term (|Observed_{t} − Expected_{t}|) is less than 1/2, the continuity correction should be omitted.
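Equation 15, together with the rule that the correction is omitted when |Observed − Expected| < 1/2, can be sketched as follows (function name mine):

```python
def chi2_gof_corrected(observed, expected):
    # Equation 15: X^2 goodness-of-fit with Yates's continuity correction.
    total = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if diff < 0.5:
            continue  # correction term omitted when |O - E| < 1/2 (see text)
        total += (diff - 0.5) ** 2 / e
    return total
```

With observed cells (126, 25) against the expected necessary-condition cells (126, 34), this yields (9 − 0.5)^{2}/34 = 2.125, the value derived in the first proof below.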

The theoretical (hypothetical) distribution of a sufficient condition is shown schematically by the 2 × 2 table (

The theoretical distribution of a sufficient condition (conditio per quam) is determined by the fact that b = 0. The X^{2} Goodness-of-Fit Test with continuity correction of a sufficient condition (conditio per quam) is calculated as

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | a | b = 0 | (a + b) |
| | No = +0 | c | d | (c + d) |
| Total | | (a + c) | (b + d) | (a + b + c + d) |

χ^{2}(IMP) ≡ ((|a − (a + b)| − (1/2))^{2} / (a + b)) + ((|(c + d) − (c + d)| − (1/2))^{2} / (c + d)) = ((|a − (a + b)| − (1/2))^{2} / (a + b)) + 0 (16)

or more simplified as

χ^{2}(IMP) ≡ ((|−b| − (1/2))^{2} / (a + b)) + 0 (17)

Under these circumstances, the degree of freedom is d.f. = N − 1 = 2 − 1 = 1.

The theoretical (hypothetical) distribution of a necessary condition is shown schematically by the 2 × 2 table (

The theoretical distribution of a necessary condition (conditio sine qua non) is determined by the fact that c = 0. The X^{2} Goodness-of-Fit Test with continuity correction of a necessary condition (conditio sine qua non) is calculated as

χ^{2}(SINE) ≡ ((|(a + b) − (a + b)| − (1/2))^{2} / (a + b)) + ((|d − (c + d)| − (1/2))^{2} / (c + d)) = 0 + ((|d − (c + d)| − (1/2))^{2} / (c + d)) (18)

or more simplified as

χ^{2}(SINE) ≡ ((|−c| − (1/2))^{2} / (c + d)) + 0 (19)

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | a | b | (a + b) |
| | No = +0 | c = 0 | d | (c + d) |
| Total | | (a + c) | (b + d) | (a + b + c + d) |

Under these circumstances, the degree of freedom is d.f. = N − 1 = 2 − 1 = 1.

The theoretical (hypothetical) distribution of a necessary and sufficient condition is shown schematically by the 2 × 2 table (

The theoretical distribution of a necessary and sufficient condition is determined by the fact that b = 0 and that c = 0. The X^{2} Goodness-of-Fit Test with continuity correction of a necessary and sufficient condition is calculated as

χ^{2}(Necessary AND Sufficient) ≡ ((|a − (a + b)| − (1/2))^{2} / (a + b)) + ((|d − (c + d)| − (1/2))^{2} / (c + d)) (20)

or more simplified as

χ^{2}(Necessary AND Sufficient) ≡ ((|−b| − (1/2))^{2} / (a + b)) + ((|−c| − (1/2))^{2} / (c + d)) (21)

Under these circumstances, the degree of freedom is d.f. = N − 1 = 2 − 1 = 1.
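The simplified forms of Equations 17, 19 and 21 fit in a few lines of Python (helper names are mine); the omission rule for |O − E| < 1/2 keeps a perfect fit at exactly zero:

```python
def chi2_sufficient(a, b):
    # Equation 17: continuity-corrected statistic for a sufficient condition.
    return 0.0 if b < 0.5 else (abs(b) - 0.5) ** 2 / (a + b)

def chi2_necessary(c, d):
    # Equation 19: continuity-corrected statistic for a necessary condition.
    return 0.0 if c < 0.5 else (abs(c) - 0.5) ** 2 / (c + d)

def chi2_necessary_and_sufficient(a, b, c, d):
    # Equation 21: sum of the two one-sided terms.
    return chi2_sufficient(a, b) + chi2_necessary(c, d)
```

For the cells c = 9, d = 25 used below, chi2_necessary gives (9 − 0.5)^{2}/34 = 2.125.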

Claims.

Null hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is a conditio sine qua non of Hodgkin’s lymphoma.

Alternative hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is not a conditio sine qua non of Hodgkin’s lymphoma.

Significance level (Alpha) below which the null hypothesis will be rejected: 0.05.

Proof.

The data of an infection by Epstein-Barr virus and Hodgkin’s lymphoma are viewed in the 2 × 2 table (. The X^{2} Goodness-of-Fit Test with continuity

| | | Conditioned | | Total |
|---|---|---|---|---|
| | | Yes = +1 | No = +0 | |
| Condition | Yes = +1 | a | b = 0 | (a + b) |
| | No = +0 | c = 0 | d | (c + d) |
| Total | | (a + c) | (b + d) | (a + b + c + d) |

correction of a necessary condition (conditio sine qua non), known to be defined as p (Epstein-Barr virus DNA ← Hodgkin’s lymphoma), is calculated as

χ^{2}(SINE) ≡ ((|−c| − (1/2))^{2} / (c + d)) + 0 = ((|−9| − (1/2))^{2} / (9 + 25)) = 2.125

Under these circumstances, the degree of freedom is d.f. = N − 1 = 2 − 1 = 1. The critical X^{2} (significance level alpha = 0.05) is known to be 3.841458821 (. Our calculated X^{2} value = 2.125 is less than the critical X^{2} = 3.841458821. Hence, our calculated X^{2} value = 2.125 is not significant and we accept our null hypothesis. Due to this evidence, we do not reject the null hypothesis in favor of the alternative hypothesis. In other words, the sample distribution agrees with the hypothetical (theoretical) distribution. Our hypothetical distribution was the distribution of the necessary condition. Thus far, the data as published by Dinand et al. [

Q.e.d.

Claims.

Null hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is a conditio per quam of Hodgkin’s lymphoma.

(p_{0} > p_{Crit}).

Alternative hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is not a conditio per quam of Hodgkin’s lymphoma.

(p_{0} < p_{Crit}).

Significance level (Alpha) below which the null hypothesis will be rejected: 0.05.

Proof.

The data of an infection by Epstein-Barr virus and Hodgkin’s lymphoma are viewed in the 2 × 2 table (

p(EBV DNA → Hodgkin’s lymphoma) = (126 + 9 + 25)/160 = 160/160 = 1

The critical value p_{Crit} (significance level alpha = 0.05) is calculated [

p_{Crit} = 1 − (−ln(0.05)/160) = 0.981276673

The critical value is p_{Crit} = 0.981276673 and is less than the proportion of successes calculated as p(Epstein-Barr virus DNA → Hodgkin’s lymphoma) = 1. Due to this evidence, we do not reject the null hypothesis in favor of the alternative hypothesis. The data as published by Dinand et al. [

Q.e.d.

Claims.

Null hypothesis: (no causal relationship)

There is no causal relationship between an infection of human lymph nodes by Epstein-Barr virus and Hodgkin’s lymphoma.

Alternative hypothesis: (causal relationship)

There is a causal relationship between an infection of human lymph nodes by Epstein-Barr virus and Hodgkin’s lymphoma.

(k ≠ 0).

Conditions.

Alpha level = 5%.

The two tailed critical Chi square value (degrees of freedom = 1) for alpha level 5% is 3.841458821.

Proof.

The data for this hypothesis test are illustrated in the 2 × 2 table (

k(EBV DNA, Hodgkin’s lymphoma) = ((160 × 126) − (126 × 135)) / √((126 × 34) × (135 × 25)) = +0.82841687

The value of the test statistic k = +0.82841687 is equivalent to a calculated [

χ^{2}_{Calculated} = N × k × k = 160 × 0.82841687 × 0.82841687 = 109.8039216

The chi-square statistic, uncorrected for continuity, is calculated as X^{2} = 109.8039216 and is thus equivalent to a P value of 0.000000000000000000000000108179. The calculated chi-square statistic exceeds the critical chi-square value of 3.841458821 (
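As a numerical cross-check of this proof, the identity χ² = N × k × k can be evaluated directly (a sketch; the p-value uses the standard one-degree-of-freedom chi-square tail erfc(√(χ²/2)) in place of a table lookup):

```python
from math import erfc, sqrt

# Dinand et al. 2 x 2 table: a = 126, b = 0, c = 9, d = 25 (N = 160).
a, b, c, d = 126, 0, 9, 25
n = a + b + c + d
k = (n * a - (a + b) * (a + c)) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
chi2 = n * k * k                  # chi-square, uncorrected for continuity
p_value = erfc(sqrt(chi2 / 2))    # two-sided tail for 1 degree of freedom
```

This reproduces k = +0.82841687, χ² = 109.8039216 and a p-value of roughly 1.08 × 10⁻²⁵, in line with the figures above.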

Q.e.d.

Claims.

Null hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is a conditio sine qua non of Hodgkin’s lymphoma.

Alternative hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is not a conditio sine qua non of Hodgkin’s lymphoma.

Significance level (Alpha) below which the null hypothesis will be rejected: 0.05.

Proof.

The data of an infection by Epstein-Barr virus and Hodgkin’s lymphoma are viewed in the 2 × 2 table (. The X^{2} Goodness-of-Fit Test with continuity correction of a necessary condition (conditio sine qua non), known to be defined as p (Epstein-Barr virus DNA ← Hodgkin’s lymphoma) [

χ^{2}(SINE) ≡ ((|−c| − (1/2))^{2} / (c + d)) + 0 = ((|−11| − (1/2))^{2} / (11 + 70)) = 1.361111111 (22)

Under these circumstances, the degree of freedom is d.f. = N − 1 = 2 − 1 = 1. The critical X^{2} (significance level alpha = 0.05) is known to be 3.841458821 (. Our calculated X^{2} value of 1.361111111 is less than the critical X^{2} = 3.841458821. Hence, our calculated X^{2} value is not significant and we accept the null hypothesis. Due to this evidence, we do not reject the null hypothesis in favor of the alternative hypothesis. In other words, the sample distribution agrees with the hypothetical (theoretical) distribution. Our hypothetical distribution was the distribution of the necessary condition. Thus far, the data as published by Dinand et al. [

Q.e.d.

Claims.

Null hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is a conditio per quam of Hodgkin’s lymphoma.

(p_{0} > p_{Crit}).

Alternative hypothesis:

An infection of human lymph nodes by Epstein-Barr virus is not a conditio per quam of Hodgkin’s lymphoma.

(p_{0} < p_{Crit}).

Significance level (Alpha) below which the null hypothesis will be rejected: 0.05.

Proof.

The data of an infection by Epstein-Barr virus and Hodgkin’s lymphoma are viewed in the 2 × 2 table (

p(EBV DNA → Hodgkin’s lymphoma) = (19 + 11 + 70)/100 = 100/100 = 1

The critical value p_{Crit} (significance level alpha = 0.05) is calculated [

p_{Crit} = 1 − (3/100) = 0.97

The critical value is p_{Crit} = 0.97 and is less than the proportion of successes calculated as p(Epstein-Barr virus DNA → Hodgkin’s lymphoma) = 1. Due to this evidence, we do not reject the null hypothesis in favor of the alternative hypothesis. The data as published by Dinand et al. [

Q.e.d.

Claims.

Null hypothesis: (no causal relationship)

There is no causal relationship between an infection of human lymph nodes by Epstein-Barr virus and Hodgkin’s lymphoma.

Alternative hypothesis: (causal relationship)

There is a causal relationship between an infection of human lymph nodes by Epstein-Barr virus and Hodgkin’s lymphoma.

(k ≠ 0).

Conditions.

Alpha level = 5%.

The two tailed critical Chi square value (degrees of freedom = 1) for alpha level 5% is 3.841458821.

Proof.

The data for this hypothesis test are illustrated in the 2 × 2 table (

k(EBV DNA, Hodgkin’s lymphoma) = ((100 × 19) − (30 × 19)) / √((30 × 70) × (19 × 81)) = +0.739814235

The value of the test statistic k = +0.739814235 is equivalent to a calculated [

χ^{2}_{Calculated} = N × k × k = 100 × 0.739814235 × 0.739814235 = 54.7325102881

The chi-square statistic, uncorrected for continuity, is calculated as X^{2} = 54.7325102881 and is thus equivalent to a P value of 0.000000000000138. The calculated chi-square statistic exceeds the critical chi-square value of 3.841458821 (
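The same cross-check applies to this second data set (a sketch, again using the standard one-degree-of-freedom tail erfc(√(χ²/2)) for the p-value):

```python
from math import erfc, sqrt

# Second 2 x 2 table: a = 19, b = 0, c = 11, d = 70 (N = 100).
a, b, c, d = 19, 0, 11, 70
n = a + b + c + d
k = (n * a - (a + b) * (a + c)) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
chi2 = n * k * k                  # chi-square, uncorrected for continuity
p_value = erfc(sqrt(chi2 / 2))    # two-sided tail for 1 degree of freedom
```

This reproduces k = +0.739814235, χ² = 54.7325102881 and a p-value of roughly 1.38 × 10⁻¹³.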

Q.e.d.

A case-control study or a retrospective study is a type of observational study where investigators compare a set of people with a certain disease (the cases) and a set of people without this certain disease (the controls) with regard to a special condition, cause or factor. Case-control studies usually require smaller sample sizes than equivalent cohort studies and are cheap and quick. As a consequence, many factors, conditions or causes can be studied simultaneously. Still, etiological questions are ideally not studied through the case-control approach. A cohort study is a better type of observational study to investigate etiological hypotheses, especially when a study population which is free of a disease is used at the outset. By contrast to a case-control study, in a cohort study it is investigated whether a disease develops or not. In particular, a case-control study may provide data which are inaccurate under certain circumstances and is very likely to suffer from bias error. Among many sources of bias, the problems arise especially from the way controls are sampled, with the consequence that the data as collected in a case-control study may not be appropriate to perform some causal investigations of interest. To be persuasive, case-control studies need to be conducted very carefully. Further details about case-control studies are given by secondary literature [

Epstein-Barr virus is the cause of Hodgkin’s lymphoma (k = +0.82841687, p Value = 0.000000000000000000000000108179).

The public domain software GnuPlot was used to draw the figure.

Barukčić, I (2018) Epstein Bar Virus―The Cause of Hodgkin’s Lymphoma. Journal of Biosciences and Medicines, 6, 75-100. https://doi.org/10.4236/jbm.2018.61008