1. Introduction
There are commonly used, continuous probability distributions of one variable, such as the normal distribution and the exponential distribution. Likewise, there are discrete distributions that are well established, such as the binomial distribution and the Poisson distribution. Here, the much less routine situation in which there is a discrete component and a continuous component in a single probability distribution is addressed. These are called mixed-type probability distributions. Often, the components can be entirely separated from each other, but sometimes it might reduce the effectiveness of the probability model to do so.
An example of a mixed-type distribution is the lifetimes of electronic components. Some components may have zero lifetimes, because they are defective from the outset, giving a discrete probability component at zero ( [1], p. 72-73, 121), while a continuous component is used for the remaining lifetimes. Biological lifetimes can have this feature ( [2], p. 34).
Another example is the elapsed time at a stop sign at an intersection on a street. Some drivers will spend no time at the sign after stopping, because there is no cross traffic, giving the discrete component at elapsed time zero. Other drivers will linger until traffic clears, supplying the continuous component ( [3], p. 63) ( [4], p. 98-99).
A third example of a mixed-type distribution is lifetimes in an experiment that is terminated at a predetermined time td. The complete lifetimes of those subjects still living or objects that have not yet failed cannot be known. That group produces a discrete component at time td ( [2], p. 52-63) ( [5], p. 97-98).
On occasion, there might be times during an experiment, or in the course of events, at which interventions can introduce discrete components. For instance, a planned medical procedure, mass vaccination, or military campaign could be such an intervention.
There is more than one way to proceed. One choice is to standardize the continuous portion, so that it has a probability density. This is usually expressed with conditional distributions ( [2], p. 52-63) ( [5], p. 97-98). The discrete portion might be standardized separately or ignored. Another choice, which is the focus here, is to work with a single mixed-type probability distribution for the whole experiment.
For a discrete distribution, sums are used to compute probabilities and expected values. For continuous distributions that have a probability density function, Riemann integration is used. All discrete, continuous, and mixed-type distributions, including continuous distributions without a probability density function, which are discussed in Section 3, are covered simultaneously with Riemann-Stieltjes integration ( [1], p. 118-126), ( [2], p. 11-14, 34), ( [6], p. 281-284).
2. An Example of a Mixed-Type Distribution
Consider the following example.
Example 1 (mixed-type distribution). In a populous state in the USA, surveys have determined that there are three distinct types of voters for an upcoming ballot initiative. One type of voter definitely opposes the initiative and is believed to be 7/20 of the voters. These individuals are coded x = 0. Another type is definitely in favor of the initiative and is 2/5 of the voters. Those individuals are coded x = 1. The third type is not polarized on the initiative, and their degree of support x is between zero and one. They are the remaining 1/4 of the voters. For these voters, the proportion of the population is modeled with the linear function $\frac{1}{5} + \frac{1}{10}x$ for $0 < x < 1$. The random variable X is an individual voter’s degree of positive support for the ballot initiative.
The linear function $\frac{1}{5} + \frac{1}{10}x$ was obtained by first fitting, via a smoothed histogram from the sample’s non-polarized voters, a curve that represents the number of votes as a function of degree of support. The fit was a linear function such that the intercept term is twice the slope coefficient. Then, the values 1/5 and 1/10 were determined from the requirement that the integral of the linear function from x = 0 to x = 1 must equal the remaining fraction, 1/4, of the voters.
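The two coefficients can be recovered from the stated constraints with exact rational arithmetic. The following is a minimal sketch; the variable names are illustrative and are not taken from the source.

```python
from fractions import Fraction

# Constraints stated in the text: the density on (0, 1) is intercept + slope * x,
# the intercept is twice the slope, and the integral over (0, 1) equals 1/4.
# Integral of (2b + b*x) over (0, 1) is 2b + b/2 = (5/2) * b.
target_mass = Fraction(1, 4)
slope = target_mass / Fraction(5, 2)
intercept = 2 * slope

print(slope, intercept)  # 1/10 1/5
```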
This is a mixed-type distribution with a discrete part for X = 0 and X = 1 and a continuous part for $0 < X < 1$. For every random variable X, the cumulative distribution function (cdf) is defined on $\mathbb{R}$ by $F(x) = \Pr(X \le x)$. A cdf F(x) has a jump discontinuity at x = a when $\lim_{x \to a^-} F(x) < F(a)$. Any discrete cdf has at most a countable number of jump discontinuities ( [1], p. 74) ( [7], p. 71). The continuous part of Example 1 is absolutely continuous. A cdf F(x) is defined to be absolutely continuous if there exists a nonnegative probability density function (pdf) f(x) that has $\mathbb{R}$ as its domain and $F(x) = \int_{-\infty}^{x} f(t)\,dt$, where the integral is a Riemann integral ( [7], p. 127) ( [8], p. 139-140). The derivative of an absolutely continuous cdf is the pdf wherever the pdf is continuous.
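The cumulative distribution function for Example 1 is
$$F(x) = \begin{cases} 0, & x < 0, \\ \dfrac{7}{20} + \dfrac{x}{5} + \dfrac{x^2}{20}, & 0 \le x < 1, \\ 1, & x \ge 1, \end{cases} \tag{1}$$
which is displayed in Figure 1.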
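The cdf of Example 1 can be written as a short function to confirm the sizes of the two jumps. This is a sketch that assumes the atoms 7/20 and 2/5 and the linear density 1/5 + x/10 described above; the function and variable names are illustrative.

```python
from fractions import Fraction

def cdf(x):
    """cdf of Example 1: atom 7/20 at 0, density 1/5 + x/10 on (0, 1), atom 2/5 at 1."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(7, 20) + x / 5 + x * x / 20
    return Fraction(1)

# Left limit F(1-) comes from the polynomial branch evaluated at x = 1.
left_limit_at_1 = Fraction(7, 20) + Fraction(1, 5) + Fraction(1, 20)

jump_at_0 = cdf(0) - Fraction(0)        # F(0) - F(0-), with F(0-) = 0
jump_at_1 = cdf(1) - left_limit_at_1    # F(1) - F(1-)

print(jump_at_0, jump_at_1)  # 7/20 2/5
```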
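Any cdf can be decomposed uniquely into a convex sum of a discrete cdf and a continuous cdf, giving the Jordan decomposition
$$F(x) = c_1 F^{d}(x) + c_2 F^{c}(x), \tag{2}$$
where $c_1 \ge 0$, $c_2 \ge 0$, $c_1 + c_2 = 1$, $F^{d}(x)$ is a discrete cdf, and $F^{c}(x)$ is a continuous cdf ( [1], p. 121) ( [5], p. 88-90) ( [8], p. 138). For Example 1,
$$F^{d}(x) = \begin{cases} 0, & x < 0, \\ \dfrac{7}{15}, & 0 \le x < 1, \\ 1, & x \ge 1, \end{cases} \quad \text{and} \quad F^{c}(x) = \begin{cases} 0, & x < 0, \\ \dfrac{4x + x^2}{5}, & 0 \le x < 1, \\ 1, & x \ge 1, \end{cases} \tag{3}$$
which yield the probability mass function (pmf) and probability density function (pdf)
$$p^{d}(x) = \begin{cases} \dfrac{7}{15}, & x = 0, \\ \dfrac{8}{15}, & x = 1, \end{cases} \quad \text{and} \quad f^{c}(x) = \begin{cases} \dfrac{4 + 2x}{5}, & 0 < x < 1, \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$
respectively, with $c_1 = 3/4$ and $c_2 = 1/4$. The weight $c_1$ is most easily computed from the jumps in the original cdf F(x): Pr(X = 0) + Pr(X = 1) = 7/20 + 2/5 = 3/4. Dividing each of the jumps 7/20 and 2/5 from (1) by $c_1 = 3/4$ gives the discrete parts of (3) and (4), and $c_1$ is the multiplier in (2). For the continuous part, the normalizing divisor, and multiplier as well, is $c_2 = 1 - c_1 = 1/4$ or, equivalently, $c_2 = \int_0^1 \left(\frac{1}{5} + \frac{1}{10}x\right) dx = \frac{1}{4}$. Alternatively, obtain $F^{c}$ from $F^{c}(x) = [F(x) - c_1 F^{d}(x)]/c_2$.
Figure 1. The cumulative distribution function y = F(x) for X in Example 1.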
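The decomposition of Example 1 can be checked pointwise with exact fractions. This sketch assumes the weights 3/4 and 1/4, the discrete cdf with values 7/15 and 1, and the continuous cdf (4x + x²)/5 on [0, 1); the function names are illustrative.

```python
from fractions import Fraction

c1, c2 = Fraction(3, 4), Fraction(1, 4)

def F(x):
    # Mixed-type cdf of Example 1
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(7, 20) + x / 5 + x * x / 20
    return Fraction(1)

def F_d(x):
    # Discrete component: jumps 7/15 at 0 and 8/15 at 1
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(7, 15)
    return Fraction(1)

def F_c(x):
    # Continuous component: integral of (4 + 2t)/5 from 0 to x
    if x < 0:
        return Fraction(0)
    if x < 1:
        return (4 * x + x * x) / 5
    return Fraction(1)

points = [Fraction(k, 8) for k in range(-4, 13)]
decomposition_holds = all(F(x) == c1 * F_d(x) + c2 * F_c(x) for x in points)
print(decomposition_holds)  # True
```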
Use the property that the expectation of the function g(X) is
$$E[g(X)] = c_1 E^{d}[g(X)] + c_2 E^{c}[g(X)],$$
where the expectations on the right-hand side are with respect to the similarly superscripted pmf and pdf ( [1], p. 121) ( [3], p. 69). The left-hand side would be computed as a Riemann-Stieltjes integral, but, in Example 1, the right-hand side contains a summation and a Riemann integral. Thus, direct consideration of a Riemann-Stieltjes integration is sidestepped in this example. This formulation has the advantage of exhibiting the way that the expected value is a weighted average of the expectations with respect to the discrete and the continuous components. For Y = g(X) = X, the expected value is
$$E(X) = \frac{3}{4}\left(0 \cdot \frac{7}{15} + 1 \cdot \frac{8}{15}\right) + \frac{1}{4}\int_0^1 x\,\frac{4 + 2x}{5}\,dx = \frac{3}{4} \cdot \frac{8}{15} + \frac{1}{4} \cdot \frac{8}{15} = \frac{8}{15}.$$
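The weighted-average form of the expectation can be evaluated exactly for Example 1. The sketch below assumes the pmf values 7/15 and 8/15, the pdf (4 + 2x)/5 on (0, 1), and the weights 3/4 and 1/4; the helper names are illustrative.

```python
from fractions import Fraction

c1, c2 = Fraction(3, 4), Fraction(1, 4)
pmf = {0: Fraction(7, 15), 1: Fraction(8, 15)}
# Continuous pdf on (0, 1) as polynomial coefficients: 4/5 + (2/5) x
pdf_coeffs = [Fraction(4, 5), Fraction(2, 5)]

def moment_discrete(k):
    """k-th moment with respect to the discrete pmf."""
    return sum(p * Fraction(x) ** k for x, p in pmf.items())

def moment_continuous(k):
    """k-th moment with respect to the pdf, integrating x**(k+j) term by term on (0, 1)."""
    return sum(coef / (k + j + 1) for j, coef in enumerate(pdf_coeffs))

def moment(k):
    """k-th moment of the mixed-type distribution as a weighted average."""
    return c1 * moment_discrete(k) + c2 * moment_continuous(k)

mean = moment(1)
print(mean)  # 8/15
```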
3. Singular Continuous CDFs
Any cdf F(x) can be decomposed uniquely into a convex sum of a discrete cdf and a continuous cdf, as in (2). Further, the continuous component can be uniquely decomposed into an absolutely continuous component $F^{ac}(x)$ and a singular continuous component $F^{sc}(x)$, giving the Lebesgue decomposition
$$F(x) = c_1 F^{d}(x) + c_2 F^{ac}(x) + c_3 F^{sc}(x),$$
where c1, c2 , c3 ≥ 0, c1 + c2 + c3 = 1 ( [7], p. 131) ( [8], p. 142-143) ( [9], p. 10-12). A function is singular continuous if it is a continuous function that is not identically zero and whose first derivative exists and equals zero almost everywhere ( [7], p. 131, 146-149) ( [8], p. 141) ( [9], p. 11). The main example is the Cantor distribution, but others, such as Minkowski’s singular continuous distribution, are well-known [10] [11]. The phrase “continuous random variable” refers to a random variable that has a cdf that is everywhere continuous.
The Cantor set is created by an infinite process. Beginning with the closed interval [0,1], during the nth step of the process, remove the $2^{n-1}$ middle-third open intervals, each of which has length $1/3^n$. After doing that step, there remain $2^n$ disjoint, closed intervals. The intersection, over all n, of the closed sets that remain is the Cantor set. The sum of the lengths of the deleted intervals is one, so the Cantor set has Lebesgue measure zero. The Cantor set is the support of the Cantor distribution, whose cdf is called the devil’s staircase. This cdf fails to be differentiable at every point of the Cantor set, but its derivative is zero on the set’s complement. Thus, probabilities cannot be recovered by integrating the derivative of the cdf. The devil’s staircase has no jumps, and so it is continuous at every real number. It is singular continuous, because it assigns probability one to the Cantor set, which has Lebesgue measure, i.e. length, zero. The devil’s staircase has no discrete component and no absolutely continuous component. The Cantor distribution and the devil’s staircase appear in the probability and statistics literature ( [7], p. 146-149) ( [8], p. 35-36, 141, 146, 593), ( [9], p. 13-15, 129, 174) [12] and the mathematical modeling and real analysis literature ( [6], p. 80-84, 90) [10] [11] ( [13], p. 249). The Cantor distribution is the basis of Example 5 in Section 4.4.
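The first seven omitted intervals, where the devil’s staircase has slope zero, and the accompanying values of the cdf are
$$F(x) = \begin{cases} 1/8, & x \in (1/27,\, 2/27), \\ 1/4, & x \in (1/9,\, 2/9), \\ 3/8, & x \in (7/27,\, 8/27), \\ 1/2, & x \in (1/3,\, 2/3), \\ 5/8, & x \in (19/27,\, 20/27), \\ 3/4, & x \in (7/9,\, 8/9), \\ 7/8, & x \in (25/27,\, 26/27), \end{cases}$$
and are graphed in Figure 2.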
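The values of the devil’s staircase on the removed intervals can be computed by reading ternary digits. This is a minimal sketch of one standard way to evaluate the Cantor function, not a method taken from the source; the function name and the `depth` cutoff are illustrative assumptions.

```python
from fractions import Fraction

def cantor_cdf(x, depth=40):
    """Evaluate the Cantor cdf at x by following the ternary digits of x.

    A digit 1 means x lies in a removed middle third, where the cdf is constant.
    Digits 0 and 2 become binary digits 0 and 1 of the cdf value.
    The result is exact when a digit 1 is found or the expansion terminates
    within `depth` digits; otherwise it is a truncation.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)      # floor, because x is nonnegative here
        x -= digit
        if digit == 1:
            return value + weight
        if digit == 2:
            value += weight
        weight /= 2
    return value

# Midpoints of the seven intervals removed in the first three steps
midpoints = [Fraction(m, 18) for m in (1, 3, 5, 9, 13, 15, 17)]
values = [cantor_cdf(m) for m in midpoints]
print(values)
```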
4. Medians
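A median of the random variable X, and therefore of F, is any real number m such that $\Pr(X \le m) \ge 1/2$ and $\Pr(X \ge m) \ge 1/2$, or, equivalently,
$$\Pr(X < m) \le \frac{1}{2} \quad \text{and} \quad F(m) \ge \frac{1}{2}. \tag{5}$$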
For Example 1, the median of X is
$$m = \sqrt{7} - 2 \approx 0.646, \tag{6}$$
which is obtained from (1) by solving
$$\frac{7}{20} + \frac{m}{5} + \frac{m^2}{20} = \frac{1}{2}.$$
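The median of Example 1 can be cross-checked numerically. The sketch below assumes the cdf 7/20 + x/5 + x²/20 on [0, 1) and compares the root of the quadratic m² + 4m − 3 = 0 with a bisection search; the names are illustrative.

```python
import math

def cdf(x):
    """Floating-point cdf of Example 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.35 + x / 5 + x * x / 20
    return 1.0

# F(0) = 0.35 < 0.5 < 0.6 = F(1-), so the median solves F(m) = 1/2 on (0, 1),
# that is, m**2 + 4*m - 3 = 0, whose positive root is sqrt(7) - 2.
median = math.sqrt(7) - 2

# Bisection on [0, 1] as an independent check
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid

print(round(median, 4), round(hi, 4))
```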
Figure 2. Portions of the devil’s staircase, which is the cdf of the Cantor distribution.
The main purpose of this section is to show that, if m is a median of the univariate random variable X and $a \in \mathbb{R}$, then $E|X - m| \le E|X - a|$, and this inequality is strict if a is not a median of X. To avoid trivialities, assume that these expectations exist as finite real numbers, which is assured by presupposing that $\int_{-\infty}^{\infty} |x|\,dF(x) < \infty$. Mood, Graybill, and Boes ( [3], p. 83), Hogg, McKean, and Craig ( [14], p. 58), and Parzen ( [15], p. 213) consider this inequality for absolutely continuous cdfs. Rohatgi ( [5], p. 170-171) and Dwass ( [16], p. 341-342) consider it separately for discrete and for absolutely continuous cdfs. The advantage of using Riemann-Stieltjes integration is that a single argument, which is presented in Theorem 1, covers the inequality for all discrete, continuous, and mixed-type cdfs without exception.
4.1. Preliminaries
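The expectation $E|X - x|$ is a convex function of x. Indeed, for $x_1, x_2 \in \mathbb{R}$ and $0 \le \lambda \le 1$, using the triangle inequality and the linearity of expectation,
$$E\left|X - [\lambda x_1 + (1 - \lambda) x_2]\right| = E\left|\lambda (X - x_1) + (1 - \lambda)(X - x_2)\right| \le \lambda E|X - x_1| + (1 - \lambda) E|X - x_2|.$$
All convex functions on $\mathbb{R}$ are continuous ( [6], p. 199) ( [17], p. 149-152).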
For $x \in \mathbb{R}$, define
$$F(x-) = \lim_{t \to x^-} F(t).$$
Because
,
( [14], p. 38). Since F is right continuous ( [1], p. 71) ( [7], p. 70-71),
.
Thus, (5) can be expressed
$$F(m-) \le \frac{1}{2} \quad \text{and} \quad F(m) \ge \frac{1}{2}. \tag{7}$$
Assuming that g is a continuous positive function on the interval (c, d) and $F(c) < F(d-)$,
$$\int_{(c,\,d)} g(x)\,dF(x) > 0 \tag{8}$$
( [1], p. 118-119) ( [6], p. 281-284). For
,
.
4.2. Lemma 1
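Lemma 1. Let X be a random variable with cumulative distribution function F. Suppose that $\int_{-\infty}^{\infty} |x|\,dF(x) < \infty$. Then, for any a and $b \in \mathbb{R}$,
$$E|X - a| - E|X - b| = 2\int_{(a,\,b)} (x - a)\,dF(x) + (b - a)\,[1 - 2F(b-)] \quad \text{if } a < b, \tag{9}$$
$$E|X - a| - E|X - b| = 2\int_{(b,\,a)} (a - x)\,dF(x) + (a - b)\,[2F(b) - 1] \quad \text{if } b < a. \tag{10}$$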
Proof. Observe that the assumption that
is equivalent to E(|X|) < ∞. If
, then
and
.
Consider two cases.
Case 1 (a < b). Expanding gives
and
Substituting gives
which yields (9).
Case 2 (b < a). Interchanging the roles of a and b in (9) gives
Multiplying by –1 yields (10).
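The two identities of Lemma 1 can be sanity-checked on a small discrete distribution, for which the Riemann-Stieltjes integrals reduce to finite sums. The sketch below assumes that (9) reads E|X − a| − E|X − b| = 2∫_(a,b) (x − a) dF(x) + (b − a)(1 − 2F(b−)) for a < b and that (10) reads E|X − a| − E|X − b| = 2∫_(b,a) (a − x) dF(x) + (a − b)(2F(b) − 1) for b < a, which is the form consistent with the terms used in the proof of Theorem 1. The distribution is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical discrete distribution (illustration only)
support = [Fraction(0), Fraction(1), Fraction(2), Fraction(5)]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(1, 4), Fraction(1, 4)]
atoms = list(zip(support, probs))

def mean_abs_dev(a):
    return sum(p * abs(x - a) for x, p in atoms)

def F(t):
    return sum(p for x, p in atoms if x <= t)

def F_left(t):
    # Left limit F(t-) = Pr(X < t)
    return sum(p for x, p in atoms if x < t)

def rhs_9(a, b):
    # Assumed form of (9), valid for a < b; the integral is over the open interval (a, b)
    integral = sum(p * (x - a) for x, p in atoms if a < x < b)
    return 2 * integral + (b - a) * (1 - 2 * F_left(b))

def rhs_10(a, b):
    # Assumed form of (10), valid for b < a; the integral is over the open interval (b, a)
    integral = sum(p * (a - x) for x, p in atoms if b < x < a)
    return 2 * integral + (a - b) * (2 * F(b) - 1)

grid = [Fraction(k, 2) for k in range(-2, 13)]
identity_holds = all(
    mean_abs_dev(a) - mean_abs_dev(b) == (rhs_9(a, b) if a < b else rhs_10(a, b))
    for a in grid for b in grid if a != b
)
print(identity_holds)  # True
```

Because exact fractions are used, the comparison is an equality test rather than a tolerance check.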
4.3. Theorem 1
Theorem 1. Let X be a random variable. Suppose that $E|X| < \infty$ and $m$ is a median of X. Then, for any $a \in \mathbb{R}$,
$$E|X - m| \le E|X - a|. \tag{11}$$
The inequality is strict if a is not a median of X.
Proof. From the proof of Lemma 1,
. Set b = m in Lemma 1. The integrals in (9) and (10) are nonnegative. Inequality (11) follows, because, using (7), the second terms on the right-hand sides of (9) and (10) are also nonnegative. To show that the inequality (11) is strict when a is not a median of X, consider the two cases in Lemma 1.
Case 1 (a < m). From (7), either F(m–) < 1/2 or F(m–) = 1/2. If F(m–) < 1/2, then (m – a)(1 – 2F(m–)) > 0, so that the right-hand side of (9) is positive. If F(m–) = 1/2, then (m – a)(1 – 2F(m–)) = 0. Also, F(a) ≤ F(m–) = 1/2. Because a is not a median, F(a) ≠ 1/2 and, thus, F(a) < 1/2, F(a) < F(m–), and the integral in (9) is positive from (8).
Case 2 (m < a). From (5), either 1/2 < F(m) or 1/2 = F(m). If 1/2 < F(m), then (a – m)(2F(m) – 1) > 0, so that the right-hand side of (10) is positive. If F(m) = 1/2, then (a – m)(2F(m) – 1) = 0. Also, 1/2 = F(m) ≤ F(a–). Because a is not a median, F(a–) ≠ 1/2 and, thus, F(a–) > 1/2, F(m) < F(a–), and the integral in (10) is positive from (8).
4.4. Representative Examples
The following examples display the function $E|X - x|$ and the locations of the medians for various distributions. The graphs illustrate that $E|X - x|$ is a convex and continuous function and that medians occur as single points or as all of the values in an interval.
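Example 1 revisited (mixed-type distribution). For the voter preference example,
$$E|X - x| = \begin{cases} \dfrac{8}{15} - x, & x < 0, \\ \dfrac{x^3}{30} + \dfrac{x^2}{5} - \dfrac{3x}{10} + \dfrac{8}{15}, & 0 \le x \le 1, \\ x - \dfrac{8}{15}, & x > 1, \end{cases}$$
which is displayed in Figure 3. The minimum occurs at $x = \sqrt{7} - 2 \approx 0.646$, as computed in (6).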
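The location of the minimum for Example 1 can be checked numerically. This sketch assumes the cubic x³/30 + x²/5 − 3x/10 + 8/15 for E|X − x| on [0, 1], which follows from direct integration against the atoms 7/20 and 2/5 and the density 1/5 + t/10, and compares it with a midpoint-rule evaluation; all names are illustrative.

```python
import math

def mean_abs_dev(x):
    """Closed form of E|X - x| for Example 1, obtained by direct integration."""
    if x < 0:
        return 8 / 15 - x
    if x > 1:
        return x - 8 / 15
    return x**3 / 30 + x**2 / 5 - 3 * x / 10 + 8 / 15

def mean_abs_dev_numeric(x, n=20000):
    """Atoms at 0 and 1 plus a midpoint rule for the continuous part on (0, 1)."""
    total = 0.35 * abs(0 - x) + 0.4 * abs(1 - x)
    h = 1.0 / n
    for i in range(n):
        t = (i + 0.5) * h
        total += abs(t - x) * (0.2 + t / 10) * h
    return total

median = math.sqrt(7) - 2
grid = [k / 1000 for k in range(-500, 1501)]
best = min(grid, key=mean_abs_dev)
print(best, round(median, 4))
```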
Example 2 (absolutely continuous cdf). For the exponential random variable with pdf $f(x) = e^{-x/3}/3$ for x > 0 and zero otherwise, Figure 4 displays $E|X - x|$, which equals $3 - x$ for $x < 0$ and $x - 3 + 6e^{-x/3}$ for $x \ge 0$.
Figure 3. $E|X - x|$ for the mixed-type distribution in Example 1.
Figure 4. $E|X - x|$ for the exponential distribution with mean μ = 3 in Example 2.
The expected value is μ = 3. The minimum is at the unique median m = 3ln2 ≈ 2.08.
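For the exponential case, the minimizer can be confirmed from a closed form. The sketch assumes E|X − x| = x − μ + 2μe^(−x/μ) for x ≥ 0 and μ − x for x < 0, which follows from integrating against the pdf above with μ = 3; the names are illustrative.

```python
import math

MEAN = 3.0

def mean_abs_dev(x):
    """E|X - x| for an exponential random variable with mean MEAN."""
    if x < 0:
        return MEAN - x
    return x - MEAN + 2 * MEAN * math.exp(-x / MEAN)

# Setting the derivative 1 - 2*exp(-x/MEAN) to zero gives x = MEAN * ln 2.
median = MEAN * math.log(2)
print(round(median, 2), round(mean_abs_dev(median), 2))  # 2.08 2.08
```

In this example the minimum value of E|X − x| happens to equal the median itself, 3 ln 2.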
Example 3 (discrete cdf and a single median). For the binomial distribution with n = 3 and p = 0.7, the expected value is μ = np = 2.1. Figure 5 displays
$$E|X - x| = \sum_{k=0}^{3} \binom{3}{k} (0.7)^k (0.3)^{3-k}\, |k - x|.$$
The minimum is at the unique median m = 2.
Figure 5. $E|X - x|$ for the binomial distribution with n = 3 and p = 0.7 in Example 3.
Example 4 (discrete cdf and an interval of medians). For the binomial distribution with n = 5 and p = 0.5, the expected value is μ = np = 2.5. Figure 6 displays
$$E|X - x| = \sum_{k=0}^{5} \binom{5}{k} (0.5)^5\, |k - x|.$$
Note that every number in the interval [2, 3] is a median.
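The two binomial examples can be compared side by side with exact arithmetic. The sketch below tests the median definition on a grid of quarter steps; the grid spacing and the function names are illustrative choices, not part of the source.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_abs_dev(x, pmf):
    return sum(q * abs(k - x) for k, q in enumerate(pmf))

def is_median(m, pmf):
    # Definition of a median: Pr(X <= m) >= 1/2 and Pr(X >= m) >= 1/2
    at_most = sum(q for k, q in enumerate(pmf) if k <= m)
    at_least = sum(q for k, q in enumerate(pmf) if k >= m)
    return at_most >= Fraction(1, 2) and at_least >= Fraction(1, 2)

def medians_on_grid(n, p, step=Fraction(1, 4)):
    pmf = binomial_pmf(n, p)
    grid = [k * step for k in range(int(n / step) + 1)]
    return [x for x in grid if is_median(x, pmf)]

medians_3 = medians_on_grid(3, Fraction(7, 10))   # Example 3
medians_5 = medians_on_grid(5, Fraction(1, 2))    # Example 4
pmf_5 = binomial_pmf(5, Fraction(1, 2))
flat_values = {mean_abs_dev(x, pmf_5) for x in medians_5}
print(medians_3, medians_5, flat_values)
```

On the interval of medians in Example 4, the function E|X − x| is constant, which is why the set `flat_values` contains a single number.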
Example 5 (singular continuous distribution). The Cantor distribution has mean 0.5 and median any $m \in [1/3, 2/3]$. Because the derivation of an expression for $E|X - x|$ is more complicated than in the previous examples, details are presented.
During the nth step of the process that leads to the Cantor set, remove the $2^{n-1}$ middle-third open intervals, each of which has length $1/3^n$. After doing such a step, there remain $2^n$ disjoint, closed intervals, which are denoted by
for
, where
The following lemma provides a recursive formula that is used for computing
.
Figure 6. $E|X - x|$ for the binomial distribution with n = 5 and p = 0.5 in Example 4.
Lemma 2. Define
and
for n ≥ 1. Then
and
.
Proof. Because
and
for
( [9], p. 15),
and
.
Hence,
Thus,
.
Consider n ≥ 1. For $1 \le k \le 2^{n-1}$,
and
Let $2^{n-1} + 1 \le k \le 2^n$. From
and
it follows that
The numerical values of the definite integrals
for n = 1, 2, and 3 are in Table 1.
Theorem 2. Let
be the disjoint open intervals that are removed during the nth step of the construction of the Cantor set, where the midpoints of these intervals are strictly increasing. For
,
(12)
Proof. For
, it follows from Lemma 2 that
Since the complement of the Cantor set in [0, 1] is dense in that interval, and since the value of a continuous function at any number in [0, 1] is determined by its values on a dense subset ( [6], p. 121), Theorem 2 determines the value of
for
. Additionally, y(x) = 1/2 – x for every x < 0 and y(x) = x – 1/2 for x > 1.
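Values of E|X − x| for the Cantor distribution can also be computed from the self-similarity of the distribution. The sketch below does not use the recursion of Lemma 2. It uses a different, standard fact: a Cantor random variable X has the same distribution as X/3 with probability 1/2 and as 2/3 + X/3 with probability 1/2. With y(x) taken to mean E|X − x|, this gives y(x) = [y(3x) + y(3x − 2)]/6. The function name and the `depth` cutoff are illustrative assumptions.

```python
from fractions import Fraction

def cantor_mean_abs_dev(x, depth=40):
    """E|X - x| for the Cantor distribution via y(x) = (y(3x) + y(3x - 2)) / 6.

    Exact whenever the recursion reaches a removed interval or leaves [0, 1]
    within `depth` steps; otherwise the error is at most (1/12) * 6**(-depth).
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(1, 2) - x          # mean is 1/2 and X >= x almost surely
    if x >= 1:
        return x - Fraction(1, 2)          # X <= x almost surely
    if Fraction(1, 3) <= x <= Fraction(2, 3):
        return Fraction(1, 3)              # constant on the interval of medians
    if depth == 0:
        return Fraction(5, 12)             # midpoint of the possible range [1/3, 1/2]
    return (cantor_mean_abs_dev(3 * x, depth - 1)
            + cantor_mean_abs_dev(3 * x - 2, depth - 1)) / 6

print(cantor_mean_abs_dev(Fraction(1, 2)), cantor_mean_abs_dev(Fraction(1, 6)))  # 1/3 7/18
```

On the removed interval (1/9, 2/9) the computed values change by −1/18 over a length of 1/9, a slope of −1/2, which agrees with 2F(x) − 1 for the cdf value F(x) = 1/4 on that interval.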
For
, the graph of y on
is a line segment with slope
.
Equation (13) and Figure 7 display the values and give the graph of y = y(x) for the Cantor distribution for
with n = 1, 2, and 3.
Table 1. Numerical values of
for n = 1, 2, 3 and
.
(13)
Figure 7. A portion of the graph of $E|X - x|$ for the Cantor distribution in Example 5.
For a sample calculation, using (12) and Table 1, take n = 2 and k = 1 in (13),
5. Concluding Comments
It has been established, under completely general conditions on the distribution of X, that x minimizes $E|X - x|$ if and only if x is a median of the distribution of X. This has been illustrated for various pure and mixed cumulative distribution functions, including the devil’s staircase of the Cantor distribution. We have evaluated the function $E|X - x|$ when X has the Cantor distribution, and we have also demonstrated the way in which the nature of this function depends heavily on the distribution of X in other cases. By way of contrast, the minimum value of the quadratic function $E[(X - x)^2]$ is Var(X), and the minimizing value of x is E(X). Its graph depends only on this mean and variance, and any two distributions for X yield graphs that are translates of each other.
Acknowledgements
The authors want to thank the anonymous referees for many insightful comments.