Distribution of the Maximum and Minimum of a Random Number of Bounded Random Variables

We study a new family of random variables, each of which arises as the maximum or minimum of a random number $N$ of i.i.d. random variables $X_1, X_2, \ldots, X_N$, each distributed as a variable $X$ with support on $[0,1]$. The general scheme is first outlined, and several special cases are then studied in detail. Wherever appropriate, we find estimates of the parameter $\theta$ in the one-parameter family in question.


Introduction and General Scheme
What can be said about the distributions of the maximum
$$Y = \max_{1 \le i \le N} X_i \qquad (1)$$
and of the minimum
$$Z = \min_{1 \le i \le N} X_i \qquad (2)$$
of a random number $N$ of i.i.d. random variables? Now the sum $S$ of a random number of i.i.d. variables, defined as
$$S = \sum_{i=1}^{N} X_i,$$
satisfies, according to Wald's Lemma [3], the equation
$$E(S) = E(N)\,E(X),$$
provided that $N$ is independent of the sequence $\{X_i\}$ and assuming that the means of $X$ and $N$ exist. The purpose of this paper is to show that the distributions in (1) and (2) can be studied in many canonical cases, even if $N$ and $\{X_i\}_{i=1}^{\infty}$ are correlated. The main deviation from the papers [9], [8] and [10], where similar questions are studied, is that the variable $X$ is concentrated on the interval $[0, 1]$, unlike in those references, where $X$ has lifetime-like distributions on $[0, \infty)$. Even then, we find that many new and interesting distributions arise, none of them to be found, e.g., in [5] or [6]. In another deviation from the theory of extremes of random sequences (see, e.g., [7]), we find that the tail behavior of the extreme distributions is not relevant, because the distributions have compact support.
We next cite three examples where our methods might be useful. First, we might be interested in the strongest earthquake in a given region in a given year. The number of earthquakes in a year, $N$, is usually modeled using a Poisson distribution, and, ignoring aftershocks and similarly correlated events, the intensities of the earthquakes can be considered to be i.i.d. random variables in $[a, b]$ whose distribution can be modeled using, e.g., the data set maintained by Caltech at [4].
Second, many "small world" phenomena have recently been modeled by power law distributions, also sometimes termed discrete Pareto or Zipf distributions. See, for example, the body of work by Chung and her co-authors [2], [1], and the references therein, where vertex degrees $d(v)$ in "internet-like graphs" $G$ (e.g., the vertices of $G$ are individual webpages, and there is an edge between $v_1$ and $v_2$ if one of the webpages has a link to the other) are shown to be modeled by
$$P(d(v) = n) = \frac{1}{\zeta(\beta)\, n^{\beta}}, \quad n = 1, 2, \ldots,$$
for some $\beta > 1$, where $\zeta$ denotes the Riemann zeta function. Thus if the vertices $v$ in a large internet graph have some bounded i.i.d. property $X_i$, then the maximum and minimum values of $X_i$ for the neighbors of a randomly chosen vertex can be modeled using the methods of this paper.
Third, we note that $N$ and the $X_i$ may be correlated, as in the CSUG example (studied systematically in Section 3), where $X_i \sim U[0, 1]$ and $N = \inf\{n \ge 2 : X_n > 1 - \theta\}$ follows the geometric distribution $\mathrm{Geo}(\theta)$. This is an example of a situation where we might be modeling the maximum load that a device carried before it broke down due to an excessive weight or current. It is also feasible in this case that the parameter $\theta$ might be unknown.
Here is our general set-up: Suppose $X_1, X_2, \ldots$ are i.i.d. random variables following a continuous distribution on $[0, 1]$ with probability density and distribution functions given by $f(x)$ and $F(x)$ respectively. The variable $N$ follows a discrete distribution on $\{1, 2, \ldots\}$ with probability mass function given by $P(N = n) = p(n)$, $n = 1, 2, \ldots$. Let $Y$ and $Z$ be given by (1) and (2) respectively. Then the p.d.f.'s $g$ of $Y$ and $Z$ are derived as follows: Since
$$P(Y \le y \mid N = n) = [F(y)]^{n},$$
we see that $g(y \mid N = n) = n[F(y)]^{n-1} f(y)$, and consequently, the marginal p.d.f. of $Y$ is
$$g(y) = \sum_{n=1}^{\infty} n [F(y)]^{n-1} f(y)\, p(n). \qquad (3)$$
In a similar fashion, the p.d.f. of $Z$ can be shown to be
$$g(z) = \sum_{n=1}^{\infty} n [1 - F(z)]^{n-1} f(z)\, p(n). \qquad (4)$$
What is remarkable is that the sums in (3) and (4) will be shown to assume simple tractable forms in a variety of cases.
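As a numerical illustration of the sums (3) and (4), the following Python sketch evaluates truncated versions of both. The helper names (`max_density`, `min_density`), the truncation level `n_max`, and the inputs $X \sim U[0,1]$, $N \sim \mathrm{Geo}(\theta)$ with $p(n) = \theta(1-\theta)^{n-1}$ are assumptions made for the example, and the sums presuppose that $N$ is independent of the $X_i$. For these inputs the sum in (3) is a differentiated geometric series, which gives the comparison value $\theta/(1-(1-\theta)y)^2$.

```python
def max_density(y, f, F, p, n_max=2000):
    """Truncated sum (3): sum over n of n * F(y)**(n-1) * f(y) * p(n)."""
    return sum(n * F(y) ** (n - 1) * f(y) * p(n) for n in range(1, n_max + 1))


def min_density(z, f, F, p, n_max=2000):
    """Truncated sum (4): sum over n of n * (1 - F(z))**(n-1) * f(z) * p(n)."""
    return sum(n * (1.0 - F(z)) ** (n - 1) * f(z) * p(n) for n in range(1, n_max + 1))


# Illustrative inputs (assumptions): X ~ U[0, 1], N ~ Geo(theta) on {1, 2, ...}.
theta = 0.3
f = lambda x: 1.0                               # uniform density on [0, 1]
F = lambda x: x                                 # uniform distribution function
p = lambda n: theta * (1.0 - theta) ** (n - 1)  # geometric p.m.f.


def geometric_series_form(y, theta):
    # sum_n n * r**(n-1) = 1 / (1 - r)**2 with r = (1 - theta) * y
    return theta / (1.0 - (1.0 - theta) * y) ** 2


for y in (0.1, 0.5, 0.9):
    print(y, max_density(y, f, F, p), geometric_series_form(y, theta))
```

Any other bounded input distribution can be tried by swapping in a different `f`, `F` and `p`; the truncation level only needs to be large enough for the tail of `p` to be negligible.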
Our paper is organized as follows. In Section 2, we study the case of $X \sim U[0, 1]$ and $N \sim \mathrm{Geo}(\theta)$. We call this the Standard Uniform Geometric (SUG) model. The CSUG (Correlated Standard Uniform Geometric) model is studied in Section 3. Section 4 is devoted to a summary of a variety of other models.

The Standard Uniform Geometric (SUG) Model
Proposition 2.2. The random variable $Y$ has mean and variance given, respectively, by
Proof. Using Proposition 2.1, we can directly compute the mean and variance by setting $k = 1, 2$, and using the fact that
Proof.
as asserted.
Proposition 2.4. The random variable $Z$ has mean and variance given, respectively, by
Proof. Using Proposition 2.3, it is easy to compute the mean and variance by setting $k = 1$ and $k = 2$.
The m.g.f.'s of $Y$ and $Z$ are easy to calculate too. Notice that the logarithmic terms above arise due to the contributions of the $j = 1$ and $j = k - 1$ terms, and it is precisely these logarithmic terms that make, e.g., method of moments estimates for $\theta$ intractable in closed (i.e., non-numerical) form. Similar difficulties arise when analyzing the likelihood function and likelihood ratios.
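The role of the logarithmic terms can be seen in a small simulation. The sketch below assumes $X \sim U[0,1]$ and an independent $N$ with $p(n) = \theta(1-\theta)^{n-1}$, for which the sum (3) collapses to the density $\theta/(1-(1-\theta)y)^2$ on $[0,1]$. The closed forms in `sug_mean_max`, `sug_mean_min` and `sug_variance` were obtained for this illustration by integrating that density directly; they are not quoted from the propositions, and all function names are hypothetical.

```python
import math
import random


def sample_sug(theta, rng):
    """Draw (Y, Z) = (max, min) of N uniforms, N ~ Geo(theta) on {1, 2, ...}."""
    n = 1
    while rng.random() >= theta:  # each trial stops with probability theta
        n += 1
    xs = [rng.random() for _ in range(n)]
    return max(xs), min(xs)


def sug_mean_max(theta):
    # Integral of y * theta / (1 - (1 - theta) * y)**2 over [0, 1].
    return (1.0 - theta + theta * math.log(theta)) / (1.0 - theta) ** 2


def sug_mean_min(theta):
    # By the symmetry x -> 1 - x of the uniform law, E(Z) = 1 - E(Y).
    return 1.0 - sug_mean_max(theta)


def sug_variance(theta):
    # Common variance of Y and Z (again by symmetry); note the log term.
    num = theta * (1.0 - theta) ** 2 - (theta * math.log(theta)) ** 2
    return num / (1.0 - theta) ** 4


rng = random.Random(0)
theta = 0.3
draws = [sample_sug(theta, rng) for _ in range(20000)]
mean_y = sum(y for y, _ in draws) / len(draws)
mean_z = sum(z for _, z in draws) / len(draws)
print("E(Y): simulated", mean_y, "derived", sug_mean_max(theta))
print("E(Z): simulated", mean_z, "derived", sug_mean_min(theta))
```

Because $\theta$ enters these expressions through both polynomial and $\ln\theta$ terms, equating a sample mean to `sug_mean_max(theta)` has to be solved numerically, in line with the remark above.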

The Correlated Standard Uniform Geometric (CSUG) Model
The Correlated Standard Uniform Geometric (CSUG) model is related to the SUG model, as the name suggests, but $X$ and $N$ are correlated as indicated in Section 1. The CSUG problems arise in two cases. In the first, we conduct standard uniform trials until a variable $X_i$ exceeds $1 - \theta$, where $\theta$ is the parameter of the correlated geometric variable, and we seek the maximum of $X_1, X_2, \ldots, X_{i-1}$. This maximum lies between $0$ and $1 - \theta$.
In the second, standard uniform trials are conducted until $X_i$ is less than $\theta$, and we seek the minimum of $X_1, X_2, \ldots, X_{i-1}$. This minimum lies between $\theta$ and $1$.
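Specifically, let $X_1, X_2, \ldots$ be a sequence of standard uniform variables and define
$$N = \inf\{n \ge 2 : X_n > 1 - \theta\} \quad \text{or} \quad N = \inf\{n \ge 2 : X_n < \theta\}.$$
In either case $N$ has probability mass function given by
$$P(N = n) = \theta (1 - \theta)^{n-2}, \quad n = 2, 3, \ldots;$$
note that this is simply a geometric random variable conditional on the success having occurred at trial 2 or later. Clearly $N$ is dependent on the $X$ sequence.
Proposition 3.1. The p.d.f. of $Y$ under the CSUG model, where $Y$ is the maximum in (1) taken over $X_1, \ldots, X_{N-1}$, is given by
$$g(y) = \frac{\theta}{(1-\theta)(1-y)^2}, \quad 0 \le y \le 1 - \theta.$$
Proof. The conditional c.d.f. of $Y$ given that $N = n$ is given by
$$P(Y \le y \mid N = n) = \left(\frac{y}{1-\theta}\right)^{n-1}, \quad 0 \le y \le 1 - \theta.$$
Taking the derivative, we see that the conditional density function is given by
$$g(y \mid N = n) = \frac{(n-1)\, y^{n-2}}{(1-\theta)^{n-1}},$$
so that
$$g(y) = \sum_{n=2}^{\infty} \frac{(n-1)\, y^{n-2}}{(1-\theta)^{n-1}}\, \theta (1-\theta)^{n-2} = \frac{\theta}{1-\theta} \sum_{n=2}^{\infty} (n-1)\, y^{n-2} = \frac{\theta}{(1-\theta)(1-y)^2}.$$
This completes the proof.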
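The CSUG maximum can also be simulated directly from its verbal description. The sketch below is one reading of the model, stated as an assumption: uniforms are drawn until one exceeds $1-\theta$, runs whose very first draw already exceeds $1-\theta$ are discarded (so that $N \ge 2$), and $Y$ is the largest value seen before stopping. The comparison function `csug_max_cdf` integrates the density $\theta/((1-\theta)(1-y)^2)$ on $[0, 1-\theta]$ that this reading implies; both function names are hypothetical.

```python
import random


def sample_csug_max(theta, rng):
    """Largest uniform seen strictly before the first draw exceeding 1 - theta."""
    while True:
        seen = []
        x = rng.random()
        while x <= 1.0 - theta:
            seen.append(x)
            x = rng.random()
        if seen:  # discard runs that stop at the first trial, so N >= 2
            return max(seen)


def csug_max_cdf(y, theta):
    # Integral of theta / ((1 - theta) * (1 - t)**2) over [0, y], 0 <= y <= 1 - theta.
    return theta * y / ((1.0 - theta) * (1.0 - y))


rng = random.Random(1)
theta = 0.3
ys = [sample_csug_max(theta, rng) for _ in range(20000)]
for y in (0.2, 0.4, 0.6):
    empirical = sum(v <= y for v in ys) / len(ys)
    print(y, empirical, csug_max_cdf(y, theta))
```

The minimum case is the mirror image under $x \mapsto 1 - x$, with stopping rule $X_n < \theta$ and support $[\theta, 1]$.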

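Proposition 3.2. The p.d.f. of $Z$ under the CSUG model is given by
$$g(z) = \frac{\theta}{(1-\theta)\, z^2}, \quad \theta \le z \le 1.$$
Proof. The conditional cumulative distribution function of $Z$ given that $N = n$ is given by
$$P(Z \le z \mid N = n) = 1 - \left(\frac{1-z}{1-\theta}\right)^{n-1}, \quad \theta \le z \le 1.$$
Thus, the conditional density function is given by
$$g(z \mid N = n) = \frac{(n-1)(1-z)^{n-2}}{(1-\theta)^{n-1}},$$
which yields the p.d.f. of $Z$ under the CSUG model as
$$g(z) = \sum_{n=2}^{\infty} \frac{(n-1)(1-z)^{n-2}}{(1-\theta)^{n-1}}\, \theta (1-\theta)^{n-2} = \frac{\theta}{(1-\theta)\, z^2},$$
which finishes the proof.
Proof.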
Proposition 3.4. The random variable $Y$ has mean and variance given, respectively, by
Proof. Using Proposition 3.3, we can directly compute the mean and variance by setting $k = 1, 2$. Notice that the variance of $Y$ under the CSUG model is smaller than that of $Y$ under the SUG model, with an identical numerator term. Also, the expected value is smaller under the CSUG model than in the SUG case.
Proposition 3.5. If the random variable $Z$ has the "CSUG Minimum distribution" and $k \in \mathbb{N}$, then
Proof. Routine, as before.
Proposition 3.6. The random variable $Z$ has mean and variance given, respectively, by
Proof. A special case of Proposition 3.5; note that as in the SUG model,

Parameter Estimation
The intermingling of polynomial and logarithmic terms makes method of moments estimation difficult in closed form, as in the SUG case. However, if $\theta$ is unknown, the maximum likelihood estimate of $\theta$ can be found in a satisfying form, both in the CSUG maximum and CSUG minimum cases.
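To make the likelihood argument concrete, here is a hedged sketch of what such estimates could look like. It assumes the densities $\theta/((1-\theta)(1-y)^2)$ on $[0, 1-\theta]$ for the CSUG maximum and $\theta/((1-\theta)z^2)$ on $[\theta, 1]$ for the CSUG minimum. Under that assumption the likelihood of a sample is proportional to $(\theta/(1-\theta))^m$, which increases in $\theta$, so it is maximized at the largest $\theta$ compatible with the support constraint. The estimators and samplers below are derived for this illustration and their names are hypothetical; they are not the paper's stated formulas.

```python
import random


def sample_csug_max(theta, rng):
    """Largest uniform seen before the first draw exceeding 1 - theta (N >= 2)."""
    while True:
        seen = []
        x = rng.random()
        while x <= 1.0 - theta:
            seen.append(x)
            x = rng.random()
        if seen:
            return max(seen)


def sample_csug_min(theta, rng):
    """Smallest uniform seen before the first draw falling below theta (N >= 2)."""
    while True:
        seen = []
        x = rng.random()
        while x >= theta:
            seen.append(x)
            x = rng.random()
        if seen:
            return min(seen)


def mle_from_maxima(ys):
    # Support constraint: every y_i <= 1 - theta, i.e. theta <= 1 - max(ys).
    return 1.0 - max(ys)


def mle_from_minima(zs):
    # Support constraint: every z_i >= theta, i.e. theta <= min(zs).
    return min(zs)


rng = random.Random(7)
theta = 0.3
ys = [sample_csug_max(theta, rng) for _ in range(2000)]
zs = [sample_csug_min(theta, rng) for _ in range(2000)]
print("estimate from maxima:", mle_from_maxima(ys))
print("estimate from minima:", mle_from_minima(zs))
```

Both estimates are functions of a single order statistic and can never fall below the true $\theta$, so under these assumptions they are biased upward in finite samples.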

A Summary of Some Other Models
The general scheme given by (3) and (4) is quite powerful. As another example, suppose (using the example from Section 1) that $p(n) = \frac{6}{\pi^2} \cdot \frac{1}{n^2}$ and $X \sim U[0, 1]$. Then it is easy to show that
$$g(y) = -\frac{6}{\pi^2} \cdot \frac{\ln(1-y)}{y}, \quad 0 < y < 1,$$
and that $E(Y) = \frac{6}{\pi^2}$. (The expected value of $Y$ can also be calculated by using the identity $E(Y) = E(E(Y \mid N))$.) In this section, we collect some more results of this type, without proof:
UNIFORM-POISSON MODEL
Here we let $X \sim U[0, 1]$ and $p(n) = \frac{e^{-\lambda} \lambda^n}{(1 - e^{-\lambda})\, n!}$, $n = 1, 2, \ldots$, so that $N$ follows a left-truncated Poisson distribution.
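Both examples can be checked numerically with truncated sums. In the sketch below, the mean uses the tower property with $E(Y \mid N = n) = n/(n+1)$ for uniform inputs, and the density is the sum (3) with $f = 1$ and $F(y) = y$. The function names and truncation levels are assumptions of the example. The comparison values $-\frac{6}{\pi^2}\ln(1-y)/y$ and $\lambda e^{-\lambda(1-y)}/(1-e^{-\lambda})$ come from summing the logarithmic and exponential series respectively; the latter was worked out for this illustration.

```python
import math


def zeta2_pmf(n):
    # p(n) = 6 / (pi^2 * n^2), n = 1, 2, ...
    return 6.0 / (math.pi ** 2 * n ** 2)


def truncated_poisson_pmf(n, lam):
    # exp(-lam) * lam**n / ((1 - exp(-lam)) * n!), computed on the log scale.
    log_p = (-lam + n * math.log(lam) - math.lgamma(n + 1)
             - math.log1p(-math.exp(-lam)))
    return math.exp(log_p)


def mean_of_max_uniform(pmf, n_max):
    # E(Y) = sum_n E(Y | N = n) p(n), with E(Y | N = n) = n / (n + 1).
    return sum(n / (n + 1) * pmf(n) for n in range(1, n_max + 1))


def density_of_max_uniform(y, pmf, n_max):
    # Sum (3) with f = 1 and F(y) = y, truncated at n_max terms.
    return sum(n * y ** (n - 1) * pmf(n) for n in range(1, n_max + 1))


lam = 2.0
poisson = lambda n: truncated_poisson_pmf(n, lam)

print("zeta model, E(Y):", mean_of_max_uniform(zeta2_pmf, 100000), 6.0 / math.pi ** 2)
print("zeta model, g(0.5):", density_of_max_uniform(0.5, zeta2_pmf, 200),
      -6.0 / math.pi ** 2 * math.log(0.5) / 0.5)
print("Poisson model, g(0.4):", density_of_max_uniform(0.4, poisson, 200),
      lam * math.exp(-lam * 0.6) / (1.0 - math.exp(-lam)))
```

The zeta-model mean converges slowly, since the truncated sum telescopes to $\frac{6}{\pi^2}\left(1 - \frac{1}{n_{\max}+1}\right)$, whereas the Poisson sums converge after a few dozen terms.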
In some sense, the primary motivation of this paper was to produce extreme value distributions that do not fall into the Beta family (which includes, for example, the density $f(t) = n t^{n-1}$ of the maximum of $n$ i.i.d. $U[0, 1]$ variables). A wide variety of non-Beta-based distributions may be found in [6]. Can we add extreme value distributions to that collection? In what follows, we use the Beta distributions $B(2, 2)$ and $B(1/2, 1/2)$ (the latter being the arcsine distribution), together with a "Beyond Beta" distribution, the Topp-Leone distribution, as "input variables" to make further progress in this direction.

Acknowledgments
The research of AG was supported by NSF Grants 1004624 and 1040928.