Analysis of Japan and World Records in the 100 m Dash Using Extreme Value Theory

Extreme value theory provides methods to analyze the most extreme parts of data. We predicted the ultimate 100 m dash records for men and women for specific periods using the generalized extreme value (GEV) distribution. Diagnostic plots indicated that the GEV model fitted the 100 m records in the world and Japan well, supporting the model. The men’s world record had a shape parameter of −0.250 with a 95% confidence interval of [−0.391, −0.109]. The 100 m record therefore had a finite limit, calculated as 9.46 s. The return level estimates for the men’s world record were 9.74, 9.62, and 9.58 s with 95% confidence intervals of [9.69, 9.79], [9.54, 9.69], and [9.48, 9.67] for 10-, 100-, and 350-year return periods, respectively. The probability of occurrence in one year for the men’s world record of 9.58 s (Usain Bolt) was 1/350, while that for the women’s world record of 10.49 s (Florence Griffith-Joyner) was about 1/100, indicating that it was more difficult for men to break records than for women.


Introduction
Extreme value theory (EVT) has emerged as an important statistical discipline in applied science. EVT deals with statistical problems concerning the far tail of the probability distribution and is unique as a statistical tool since it develops models and techniques to describe the unusual event rather than the usual. Using EVT, the theoretical distribution that the maximum value follows, together with its population parameters, is estimated from long-term observation data. Additionally, the maximum value or a large value that occurs once every century can be predicted. Extreme value techniques are also becoming widely used for portfolio adjustment in the insurance industry, risk assessment on financial markets, and traffic prediction in telecommunications [1]. Statistical approaches focused on extreme values have shown promising results in forecasting unusual events in earth sciences, genetics, and finance. For instance, EVT was developed in the 1920s and has been used to predict the occurrence of events such as droughts and flooding [2] or financial risk [3] [4]. Extreme value modeling has also been applied in the fields of ocean wave modeling [5], biomedical data processing [6], thermodynamics of earthquakes [7] [8], climatology [9], food science [10], and public health [11].
In this paper, we focus on one of the most popular events in athletics: the 100 m dash for both men and women. The 100 m world record has evolved over time.
This study predicts 100 m records in the world and Japan using extreme value theory.

Data
We used 100 m records for men and women in the world and Japan [14].

Method
EVT is concerned with extreme phenomena in data. We used the method of block maxima, a method for modeling the extremes of a stationary time series in which consecutive observations are grouped into non-overlapping blocks of length n, generating a series of m block maxima, M_{n,1}, …, M_{n,m}, to which the generalized extreme value (GEV) distribution can be fitted for some large value of n. The usual approach is to consider blocks of a given time length, thus yielding maxima at regular intervals [1]. Here each block was one year, i.e., annual maxima were used. Although the method of block maxima is suitable for the analysis of maximum value data, it has the disadvantages that the estimates are easily affected by a single observation and that the estimator has a large variance.
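The grouping step described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the paper: the function name `block_maxima` and the sample values are hypothetical, and the input is assumed to be a list of (block label, value) pairs, with the year as the block label.

```python
def block_maxima(observations):
    """Return {block: largest value in that block} for (block, value) pairs.

    Each block label (here, a year) defines one non-overlapping block.
    """
    best = {}
    for block, value in observations:
        if block not in best or value > best[block]:
            best[block] = value
    # Sort by block label so the maxima form a regular yearly series.
    return dict(sorted(best.items()))


# Hypothetical observations, NOT data from the paper.
sample = [(2001, 3.1), (2001, 4.7), (2002, 2.9), (2002, 2.2), (2003, 5.0)]
annual_max = block_maxima(sample)
print(annual_max)
```

Each year contributes exactly one value to the series, which is why a single unusual observation can have a strong influence on the fitted parameters.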
When data are taken to be the maxima (or minima) over certain blocks of time (such as annual maximum precipitation), it is appropriate to use the GEV distribution:

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\},$$

F. Maruyama Journal of Applied Mathematics and Physics
where z denotes the extreme value from each block, μ is a location parameter, σ is a scale parameter, and ξ is a shape parameter. G(z) is defined for all z such that 1 + ξ(z − μ)/σ > 0 for ξ ≠ 0, and for all z for ξ = 0. Three families of GEV distributions are defined depending on the value of ξ. For ξ > 0, we get the Fréchet distribution with a heavy tail; for ξ = 0, the Gumbel distribution with a lighter tail; and for ξ < 0, the Weibull distribution with a finite tail.
Because the fastest time is the smallest value, the quantity of interest is a minimum rather than a maximum. The data were therefore multiplied by −1 to place them in the framework of extreme value statistics, which considers maxima.
If a GEV distribution is fitted to observations, it becomes possible to estimate the probability of an event that has not yet been observed. Estimates of extreme quantiles of the annual maximum distribution are obtained by inverting the GEV distribution function:

$$z_p = \mu - \frac{\sigma}{\xi}\left[1 - \{-\log(1 - p)\}^{-\xi}\right] \quad (\xi \neq 0), \qquad z_p = \mu - \sigma \log\{-\log(1 - p)\} \quad (\xi = 0),$$

where G(z_p) = 1 − p. Here z_p is the return level associated with the return period 1/p, since to a reasonable degree of accuracy, z_p is expected to be exceeded on average once every 1/p years. More accurately, z_p is exceeded by the annual maximum in any particular year with probability p [1].
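To make the role of the support condition concrete, the sketch below evaluates the standard GEV distribution function G(z) = exp{−[1 + ξ(z − μ)/σ]^(−1/ξ)}, with the Gumbel form exp{−exp[−(z − μ)/σ]} at ξ = 0. The function name `gev_cdf` and the parameter values in the usage lines are illustrative assumptions, not estimates from the paper; only the shape −0.25 echoes the reported men's world value.

```python
import math


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma > 0, shape xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s = (z - mu) / sigma
    if xi == 0.0:
        # Gumbel case: defined for all z.
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # at or above the finite upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical parameters: with xi = -0.25 the upper endpoint is mu - sigma/xi = 4.
print(gev_cdf(3.0, 0.0, 1.0, -0.25))   # inside the support, strictly below 1
print(gev_cdf(10.0, 0.0, 1.0, -0.25))  # beyond the endpoint, returns 1.0
```

For ξ < 0 the function reaches 1 at a finite value of z, which is the sense in which the Weibull case has a finite tail.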
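As a rough illustration of the return level calculation, the sketch below solves G(z_p) = 1 − p for z_p using the standard closed form z_p = μ − (σ/ξ)[1 − {−log(1 − p)}^(−ξ)] for ξ ≠ 0 and z_p = μ − σ log{−log(1 − p)} for ξ = 0. The location and scale values are hypothetical placeholders, since the fitted values appear only in the paper's tables; the shape −0.25 matches the reported men's world estimate. Because the data were negated before fitting, the result is negated again to obtain a time in seconds.

```python
import math


def return_level(mu, sigma, xi, p):
    """Return level z_p with G(z_p) = 1 - p, i.e. return period 1/p blocks."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(1.0 - p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical location and scale on the negated (times -1) scale.
mu, sigma, xi = -10.0, 0.1, -0.25

for years in (10, 100):
    z = return_level(mu, sigma, xi, 1.0 / years)
    # Undo the sign change to express the return level as a time in seconds.
    print(years, "year return level (s):", round(-z, 3))
```

Longer return periods give larger values on the negated scale, and therefore faster times after the sign is restored.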
Modeling was performed using the evd package in R for the GEV calculations.
We also tried a non-stationary model in the GEV, but it did not work.

Results
The 100 m records for men in the world and Japan for 1970-2009 are shown in Figure 1. Table 1 shows the GEV parameter estimates obtained by fitting the GEV model to the 100 m records for men in the world using the block maxima method. The GEV parameters were estimated using maximum likelihood estimation (MLE).

Men
The model has three parameters: the location parameter, μ; the scale parameter, σ; and the shape parameter, ξ. Because ξ was negative, the 100 m records in the world had a finite upper limit. Table 2 shows the predicted return levels for return periods of 10, 20, 50, 100, and 350 years along with their respective 95% confidence intervals. For the 10-year return period, we estimated the return level to be 9.74 s, with a 95% confidence interval of [9.69, 9.79]. For the 100-year return period, we estimated the return level to be 9.62 s, with a 95% confidence interval of [9.54, 9.69]. Equivalently, in any given year there is approximately a 1% chance (1/100) that the best 100 m time will be 9.62 s or faster, and approximately a 10% chance (1/10) that it will be 9.74 s or faster.
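The per-year probabilities above can be extended to a span of several years. The sketch below assumes that years are independent and that the fitted distribution does not change over time; both are assumptions of this illustration rather than results of the paper, and the function name `prob_within_years` is hypothetical.

```python
def prob_within_years(p, n_years):
    """Probability that a level with annual exceedance probability p
    is reached at least once in n_years independent years."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return 1.0 - (1.0 - p) ** n_years


# Annual probabilities quoted in the text for the 100-year and 10-year levels.
for p in (1 / 100, 1 / 10):
    print(p, prob_within_years(p, 10))
```

Under these assumptions the chance over a decade is somewhat less than ten times the annual chance, because years in which the level is reached more than once are not double counted.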
Various diagnostic plots for assessing the accuracy of the GEV model fitted to the 100 m records of men in the world are shown in Figure 2. The straight lines and curves are the estimated functions, each plotted point is an observed value, and the lines on both sides represent the 95% confidence interval. Neither the probability plot nor the quantile plot gives cause to doubt the validity of the fitted model: each set of plotted points is near-linear. The corresponding density estimate seems consistent with the data. Consistent with the negative value of ξ, the tail is finite and the return level curve is nonlinear. Overall, the diagnostic plots gave little reason to doubt the validity of the GEV model.
Table 3 shows the GEV parameter estimates obtained by fitting the GEV model to the 100 m records of men in Japan using the block maxima method. Because ξ was negative, the 100 m records in Japan had a finite upper limit. Table 4 shows the predicted return levels. Diagnostic plots for assessing the accuracy of the GEV model fitted to the 100 m records for men in Japan are shown in Figure 3. The diagnostic plots supported the validity of the GEV model.

Women
Table 5 shows the GEV parameter estimates obtained by fitting the GEV model to the 100 m records of women in the world using the block maxima method. The value of ξ was close to zero (−0.0497), and its confidence interval included zero. Therefore, ξ can be regarded as zero. When ξ is zero, there is no upper limit, but the probability of taking a large value is small. Thus the women's world 100 m records did not have a finite limit, although the probability of a very fast time was small. Table 6 shows the predicted return levels for return periods of 10, 20, 50, 100, and 500 years along with their respective 95% confidence intervals. Diagnostic plots for assessing the accuracy of the GEV model fitted to the 100 m records for women in the world are shown in Figure 4.
Part of the fluctuation in the upper data was not captured. In the return level plot, the estimated curve was close to linear because ξ was close to zero. The diagnostic plots supported the validity of the GEV model.
Table 7 shows the GEV parameter estimates obtained by fitting the GEV model to the 100 m records of women in Japan using the block maxima method. Because ξ was negative, the 100 m records in Japan had a finite upper limit. Table 8 shows the predicted return levels. Diagnostic plots for assessing the accuracy of the GEV model fitted to the 100 m records for women in Japan are shown in Figure 5. The diagnostic plots supported the validity of the GEV model.

Discussion
The return level plot for the 100 m records for men is shown in Figure 6. In the ξ < 0 case, for men in the world and Japan, the plots deviated from the straight line and were convex downward. The calculated upper limits were 9.46 s and 9.91 s for the world and Japan, respectively. For the 350-year return period, the return level for men in the world was 9.58 s, with a 95% confidence interval of [9.48, 9.67]. Hence, the probability of occurrence in one year for Usain Bolt's world record of 9.58 s was 1/350. Einmahl estimated the ultimate world record for men to be 9.51 s [13].
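The finite limit in the ξ < 0 case follows directly from the support condition of the GEV distribution. The short derivation below is a sketch of the standard result; it assumes, without confirmation from the text, that the reported limits were obtained in this way from the fitted parameters on the negated scale. The symbols z_sup and t_min are introduced here for illustration only.

```latex
% Support condition of the GEV distribution for \xi < 0:
1 + \xi\,\frac{z - \mu}{\sigma} > 0
\quad\Longleftrightarrow\quad
z < \mu - \frac{\sigma}{\xi} \qquad (\xi < 0).

% Upper endpoint on the negated scale, also the limit of the return level:
z_{\sup} = \mu - \frac{\sigma}{\xi}
         = \lim_{p \to 0}\left\{ \mu - \frac{\sigma}{\xi}
           \left[ 1 - \{-\log(1 - p)\}^{-\xi} \right] \right\}.

% Undoing the multiplication by -1 gives the limiting time:
t_{\min} = -z_{\sup} = -\mu + \frac{\sigma}{\xi}.
```

The limit holds because −log(1 − p) tends to 0 as p tends to 0 and the exponent −ξ is positive when ξ < 0, so the bracketed term tends to 1.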

The return level plot for the 100 m records for women in the world and Japan is shown in Figure 6. The slope of the approximately straight line for women in the world was the largest; that is, the year-to-year fluctuation range was large. In the ξ = 0 case, the distribution has an unbounded tail, which is heavier than the tail in the ξ < 0 case. In the ξ < 0 case, the plots for the Japanese women's records deviated from the straight line with a downward convex shape.
To break a record, a runner should accelerate quickly and maintain the maximum speed. Mechanically, the acceleration a is determined by the angle θ at which the body is tilted with respect to the ground and by the gravitational acceleration g. To increase the acceleration, the body tilt should be increased, that is, θ should be reduced.
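One common idealization that is consistent with the statement above is sketched here. It is offered as an assumption and may differ from the expression the author intended: the runner is treated as a rigid body of mass m, and the resultant ground reaction force is assumed to act along the body line, which is tilted at angle θ from the ground.

```latex
% Vertical component of the ground reaction force F balances the weight:
F \sin\theta = m g
% Horizontal component produces the forward acceleration a:
F \cos\theta = m a
% Dividing the two relations eliminates F and m:
\tan\theta = \frac{g}{a}
\quad\Longrightarrow\quad
a = \frac{g}{\tan\theta}.
```

Under this idealization a grows as θ decreases, which agrees with the remark that a greater forward lean gives greater acceleration.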

Conclusions
Extreme value theory provides methods to analyze the most extreme parts of data. Here, we used the GEV distribution to predict the ultimate 100 m dash records for men and women for specific periods. The results are summarized as follows:
1) Diagnostic plots indicated that the GEV model fitted the 100 m records in the world and Japan well, supporting the model.
2) The men's world record had a shape parameter of −0.250 with a 95% confidence interval of [−0.391, −0.109]. The 100 m record therefore had a finite limit, calculated as 9.46 s. The calculated upper limit for the men's gold medalist was 9.58 s, which is equal to the record of Usain Bolt.
3) The return level estimates for men in the world were 9.74, 9.62, and 9.58 s, with 95% confidence intervals of [9.69, 9.79], [9.54, 9.69], and [9.48, 9.67] for 10-, 100-, and 350-year return periods, respectively. The probability of occurrence in one year for the men's world record of 9.58 s (Usain Bolt) was 1/350, while that for the women's world record of 10.49 s (Florence Griffith-Joyner) was about 1/100, indicating that it was more difficult for men to break records than for women.