TITLE:
Crowdsourced Sampling of a Composite Random Variable: Analysis, Simulation, and Experimental Test
AUTHORS:
M. P. Silverman
KEYWORDS:
Crowdsourcing, Computer Modeling of Crowds, Monte Carlo Simulation, Large-Scale Sampling, Log-Normal Random Variable, Log-Normal Distribution
JOURNAL NAME:
Open Journal of Statistics, Vol.9 No.4, August 15, 2019
ABSTRACT:
A composite random variable is a product (or sum of products) of statistically
distributed quantities. Such a variable can represent the solution to a multi-factor
quantitative problem submitted to a large, diverse, independent, anonymous
group of non-expert respondents (the “crowd”). The objective of this research
is to examine the statistical distribution of solutions from a large crowd to a
quantitative problem involving image analysis and object counting. Theoretical
analysis by the author, covering a range of conditions and types of factor
variables, predicts that composite random variables are distributed
log-normally to an excellent approximation. If the factors in a problem are
themselves distributed log-normally, then their product is rigorously
log-normal. A crowdsourcing experiment devised by the author and implemented
with the assistance of a BBC (British Broadcasting Corporation) television
show yielded a sample of approximately 2000 responses consistent with a
log-normal distribution. The sample mean was within ~12% of the true count.
However, a Monte Carlo simulation (MCS) of the experiment, employing either
normal or log-normal random variables as factors to model the processes by
which a crowd of 1 million might arrive at their estimates, resulted in a
visually perfect log-normal distribution with a mean response within ~5% of the
true count. The results of this research suggest that a well-modeled MCS, by
simulating a sample of responses from a large, rational, and incentivized
crowd, can provide a more accurate solution to a quantitative problem than
might be attainable by direct sampling of a smaller crowd or an uninformed
crowd, irrespective of size, that guesses randomly.
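
The kind of simulation described in the abstract can be illustrated with a minimal sketch. It is not the author's model: the factor means, the 20% relative spread, the crowd size, and the names `simulate_crowd` and `skewness` are all hypothetical choices made for illustration. Each simulated respondent estimates a count as a product of three independent factors, drawn either log-normally or normally. For log-normal factors the logarithm of the product is a sum of independent normal variables, so the product is exactly log-normal; for normal factors the product is only approximately so.

```python
import numpy as np

def simulate_crowd(n_respondents, factor_means, rel_sd, kind="lognormal", seed=0):
    """Return one composite estimate (product of factors) per simulated respondent.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(factor_means, dtype=float)
    sd = rel_sd * means
    size = (n_respondents, means.size)
    if kind == "lognormal":
        # Choose log-space parameters so each factor has the requested mean and sd.
        sigma2 = np.log1p((sd / means) ** 2)
        mu = np.log(means) - 0.5 * sigma2
        factors = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=size)
    elif kind == "normal":
        factors = rng.normal(loc=means, scale=sd, size=size)
        # Assumption: discard the rare respondents with a non-positive factor,
        # since a count estimate must be positive.
        factors = factors[(factors > 0).all(axis=1)]
    else:
        raise ValueError("kind must be 'lognormal' or 'normal'")
    return factors.prod(axis=1)

def skewness(x):
    """Sample skewness; close to 0 for a symmetric (e.g. normal) sample."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

# Hypothetical problem: true count = 10 * 20 * 5 = 1000.
FACTOR_MEANS = [10.0, 20.0, 5.0]
TRUE_COUNT = float(np.prod(FACTOR_MEANS))

if __name__ == "__main__":
    for kind in ("lognormal", "normal"):
        est = simulate_crowd(200_000, FACTOR_MEANS, rel_sd=0.2, kind=kind)
        rel_err = abs(est.mean() - TRUE_COUNT) / TRUE_COUNT
        print(kind, "relative error of mean:", rel_err,
              "skewness of log(estimate):", skewness(np.log(est)))
```

Because the factors are independent and unbiased in this sketch, the mean of the products converges to the product of the means, which is why the simulated crowd mean approaches the true count as the crowd grows. A skewness of the logged estimates near zero is one simple indication that the estimates are close to log-normal; a real analysis would use a proper goodness-of-fit test.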