Bayesian Posterior Predictive Probability Happiness

We propose to determine the underlying causal structure of the elements of happiness from a set of empirically obtained data using Bayesian methods. We study happiness as a multidimensional construct in which four dimensions converge, using two different Bayesian techniques. In the first, we use the Bonferroni correction for multiple comparisons of means; on this basis we apply a t-test and a z-test. The results do not differ between the two tests, so only those of the t-test are presented. In the second, Bayesian multiple linear regression, we show that happiness can be explained through three dimensions. The numerical technique used is MCMC, applied to four samples. The results show that the sample does not exhibit atypical behavior and that suitable modifications can be described through a test. Another interesting result is that the predictive probability for the sense positive of life and personal fulfillment dimensions exhibits non-uniform variation.


Introduction
Bayesian networks emerged about three decades ago as alternatives to conventional systems oriented to decision-making and forecasting under uncertainty in probabilistic terms [1]. A Bayesian network is a statistical tool that represents a set of associated uncertainties given the conditional independence relationships established between them [2] [3].
Bayes' rule is a rigorous method for interpreting evidence in the context of previous experience or knowledge. It has recently emerged as a powerful tool with a wide range of applications, including genetics, image processing, ecology, physics, and engineering. The essential characteristic of Bayesian methods is their explicit use of probability for quantifying uncertainty in inferences based on statistical data analysis. A Bayesian probability interval for an unknown quantity of interest can be directly regarded as having a high probability of containing the unknown quantity, in contrast to a frequentist confidence interval, which may strictly be interpreted only in relation to a sequence of similar inferences that might be made in repeated practice.
The use of Bayesian networks as a tool for data analysis is not widespread in psychology. However, the benefits of this tool for all areas of psychology can be identified in several ways. On the economic front, Bayesian networks could be used to develop systems that make appropriate probability judgments to improve psychological diagnosis and treatment. On the scientific level, Bayesian networks cannot be overlooked if psychology strives to clarify the mechanisms by which people evaluate, decide, and make inferences, as they may serve as an analytical and theoretical reference in the development of models of reasoning, learning, and perception of uncertainty [3].
In psychology, the most prolific area of work using Bayesian networks has been causal learning [4]-[7], which tests the hypothesis that people represent causal knowledge in a way similar to a Bayesian network. Other theoretical developments have made it possible to extend the use of Bayesian networks outside the domain of causal knowledge, for instance, to inductive generalization of concepts and word learning [8]. Bayesian networks have also proved useful in market research [9]. Others found that a Bayesian network classifies the customers of a company according to their long-term purchasing prospects [10]. Another study analyzed the nature of human emotions and the negative behavior resulting from overcrowding during mass events; using a Bayesian network, its model shows the dependence structure between different emotions and negative behaviors of pilgrims in the crowd [11].
Finally, because of their modeling power, Bayesian networks could be used to generate more and better models of the organizations, groups, or social aggregates that psychology studies. Thus the aim of this study was to determine the underlying causal structure of the elements of happiness from a set of empirically obtained data. In Bayesian statistics a parameter is treated as an uncertain quantity to be inferred; in this study, knowledge about happiness is imprecise and subject to uncertainty, and therefore happiness can be described by a probability distribution. The Metropolis algorithm used here is a specific type of Monte Carlo process: it generates a random walk in which each step depends only on the current position and not on the steps taken before it, so the draws form a Markov chain. At each step the process has no memory of earlier states. This is known as Markov Chain Monte Carlo (MCMC).
Happiness has been defined as an entity that can be described by a specific set of measures [12], [13], and as a mental state that people can gain control over in a cognitive way, so as to perceive and conceive both themselves and their world as an experience of joy, satisfaction, or positive welfare [14]. Unfortunately, terms like happiness have been used frequently in daily discourse and may now have vague and somewhat different meanings.
The difficulty of defining happiness has led pioneer psychologists in the study of happiness to propose the term subjective well-being (SWB). SWB refers to people's evaluations of their own lives and encompasses both cognitive judgments of satisfaction and affective appraisals of moods and emotions. This conceptualization emphasizes the subjective nature of happiness and holds individual human beings to be the best judges of their own happiness [15]-[17].
There is empirical evidence indicating that well-being is a much broader construct than stability of emotions and subjective judgment about life satisfaction. For example, situational models consider that the sum of happy moments in life results in people's satisfaction [18]; that is, a person exposed to a greater number of happy events will be more satisfied with his or her life. People briefly react to good and bad events, but in a short time they return to neutrality. Thus, happiness and unhappiness are merely short-lived reactions to changes in people [19], and they depend strongly on intentional activity [20], [21]. Temperament has also been suggested to influence happiness [22], [23], as have other personality traits such as optimism and self-esteem [24]-[26] and self-determination [27].
Scholars have noticed that happiness is not a single thing but can be broken down into its constituent elements. Considering this, Alarcón [28] proposes to study happiness as a multidimensional construct in which the following converge: satisfaction with what has been achieved, positive attitudes toward life (experiences that reflect positive feelings concerning one's self and life), personal fulfillment, and joy of living.
The Happiness Scale of Lima (HSL) [28] consists of 27 items; the reported item-scale correlations were highly significant, and internal consistency was high (α = 0.92). Principal components factor analysis with varimax rotation revealed that happiness is a multidimensional behavior consisting of four dimensions: $X_1$, sense positive of life; $X_2$, satisfaction with life; $X_3$, personal fulfillment; and $X_4$, joy of living.

Definition 1
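A σ-algebra is used to define measurability. A probability measure $P$ is a mapping that assigns a probability to every event, with the properties $P(\Omega) = 1$ and $P\left(\bigcup_i A_i\right) = \sum_i P(A_i)$ for pairwise disjoint events $A_i$. A random variable is given by a measurable mapping $X : \Omega \to \mathbb{R}$.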

Definition 3
The random samples, say $X_1, X_2, X_3, X_4$, have the same underlying distribution.
We assume that we have four data samples $\{X_1, X_2, X_3, X_4\}$. If the data have the same distributions, then . Uncertainty about the true value of the parameter is described by a measure $\Pi$ conditional on the observed datasets. The posterior predictive distribution is defined as $p(\tilde{x} \mid X_1, \ldots, X_4) = \int_{\Theta} p(\tilde{x} \mid \theta)\, d\Pi(\theta \mid X_1, \ldots, X_4)$. Let be a probability measure on ( ) , , , : The probability for 1 is equal to the joint probability ( ) , the probability is bounded by { } ( )
Consider the problem of selecting independent samples from several populations for the purpose of between-group comparisons, either through hypothesis testing or estimation of mean differences. A companion problem is the estimation of within-group mean levels. Together, these problems form the foundation for the very common analysis of variance framework, but they also describe essential aspects of stratified sampling, cluster analysis, empirical Bayes, and other settings. Procedures for making between-group comparisons are known as multiple comparisons methods. The goal of determining which groups have equal means requires testing a collection of related hypotheses [29].
Consider independent samples from $l$ normally distributed populations with equal variances (3a and 3b) ( ) , a b then there is ( ) σ , with the individual means restricted by the model-defined clustering of equalities. That is, each model determines a corresponding parameter space where particular means are taken as equal. Let , where $\theta$ takes values in a $d$-dimensional parameter space $\Theta$, be The likelihood of $\theta$ can be written as The posterior probability, from (4) and (5), can be written as We use a uniform prior for 1 Using Equations (5) and (6) and the previous development, we write the following relationship ( ) ( )

Definition 4
The Taylor series of a real or complex-valued function $f(x)$ that is infinitely differentiable at a real or complex number $a$ is the power series $\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n$. Then, applying Definition 4 to (7), we obtain The Fisher information [30] is a way of measuring the amount of information that an observable random variable $X$ carries about an unknown parameter $\theta$.
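For reference, the Fisher information just described is usually written as follows. This is the standard textbook form, given under assumed notation: $f(X; \theta)$ denotes the density of $X$ for a scalar parameter $\theta$, a symbol not fixed by the text above.

```latex
I(\theta) = \mathbb{E}\left[ \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \right)^{2} \,\middle|\, \theta \right]
```

Under the usual regularity conditions this equals $-\mathbb{E}\left[ \partial^2 \log f(X; \theta) / \partial \theta^2 \mid \theta \right]$, the form that arises naturally from a second-order Taylor expansion of the log-likelihood.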

Applying (9) to (8), we obtain With respect to the candidate model class $\{M_1, \ldots, M_l\}$, we obtain the posterior model probabilities ( )
Many clever methods have been devised for constructing and sampling from arbitrary posterior distributions. Markov chain simulation (also called Markov chain Monte Carlo, or MCMC) is a general method based on drawing values of $\theta$ from approximate distributions and then correcting those draws to better approximate the target posterior distribution, $p(\theta \mid y)$. The sampling is done sequentially, with the distribution of the sampled draws depending on the last value drawn; hence, the draws form a Markov chain. (As defined in probability theory, a Markov chain is a sequence of random variables $\theta^1, \theta^2, \ldots$ for which, for any $t$, the distribution of $\theta^t$ given all previous $\theta$'s depends only on the most recent value, $\theta^{t-1}$.) The key to the method's success, however, is not the Markov property but rather that the approximate distributions are improved at each step in the simulation, in the sense of converging to the target distribution [35].
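The Markov chain simulation described above can be illustrated with a minimal random-walk Metropolis sampler. This is a sketch under stated assumptions, not the sampler used in this study: the target is a hypothetical one-dimensional standard normal density, the proposal is Gaussian, and the names `metropolis`, `log_standard_normal`, `step`, and the burn-in length are illustrative choices.

```python
import math
import random


def metropolis(log_target, start, n_draws, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_target returns the log of the (possibly unnormalized) target density.
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    draws = []
    for _ in range(n_draws):
        # The proposal depends only on the current position (Markov property).
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        log_ratio = proposal_lp - current_lp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        # A rejected proposal repeats the current value in the chain.
        draws.append(current)
    return draws


def log_standard_normal(theta):
    # Hypothetical target (assumption): standard normal, up to a constant.
    return -0.5 * theta * theta


draws = metropolis(log_standard_normal, start=0.0, n_draws=20000)
kept = draws[2000:]  # discard early draws as burn-in
posterior_mean = sum(kept) / len(kept)
```

Working with log densities avoids numerical underflow, and only the ratio of target densities is needed, so the normalizing constant of the posterior never has to be computed.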
A z-statistic should be calculated when the standard deviation of the population(s) is known. If the standard deviation is not known, then the standard error must be estimated using the standard deviation of the sample(s). Due to this estimation, we must use the t-distribution, which is thicker in the tails to account for estimating the standard error with the sample standard deviation [34].
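The distinction between the two statistics can be made concrete with a small sketch. The one-sample setting, the function names, and the data values below are assumptions made up for illustration; they are not taken from the study.

```python
import math
import statistics


def z_statistic(sample, mu0, sigma):
    """One-sample z-statistic; sigma is the known population standard deviation."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))


def t_statistic(sample, mu0):
    """One-sample t-statistic; the standard error uses the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # divides by n - 1
    return (statistics.fmean(sample) - mu0) / (s / math.sqrt(n))


scores = [3.1, 2.8, 3.6, 3.3, 2.9, 3.4, 3.0, 3.5]  # made-up illustrative data
z = z_statistic(scores, mu0=3.0, sigma=0.3)  # sigma assumed known
t = t_statistic(scores, mu0=3.0)             # sigma estimated from the sample
```

The two functions differ only in where the standard deviation comes from; the t-statistic is then referred to a t-distribution with n - 1 degrees of freedom rather than to the standard normal.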
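Until now, we have built the theory for the case of four data samples. We now turn to a particular case study and see that the Lemma and Definitions 1, 2 and 3 apply naturally, as does Equation (14).
Example (particular case). Suppose $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Select a random sample of size $n_i$ from the $i$th group ($i = 1, 2, 3, 4$), so the sample sizes are $(n_1, n_2, n_3, n_4)$. $\bar{x}$ is the sample mean for the observations in all groups combined,

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{4} \sum_{j=1}^{n_i} x_{ij}, \qquad n = n_1 + n_2 + n_3 + n_4.$$

The variability in the data, measured by the deviation of an individual observation from the overall mean, decomposes as

$$\sum_{i=1}^{4} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = \sum_{i=1}^{4} \sum_{j=1}^{n_i} (\bar{x}_i - \bar{x})^2 + \sum_{i=1}^{4} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2,$$

where $\bar{x}_i$ is the mean of the $i$th group.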
To test the null hypothesis that the population means are all the same, use the test statistic $H$; this statistic has an F distribution with $k - 1$ and $n - k$ degrees of freedom. Now, for three groups, suppose $H_0: \mu_1 = \mu_2 = \mu_3$. Select a random sample of size $n_i$ from the $i$th group ($i = 1, 2, 3$), so the sample sizes are $(n_1, n_2, n_3)$. The principle of the Bayes model is to compute posteriors based on specified priors and the likelihood function of the data, here the four groups of size 1110.
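The test of equal population means described above can be sketched numerically. The sketch assumes the statistic is the usual one-way analysis of variance ratio of between-group to within-group mean squares, with $k - 1$ and $n - k$ degrees of freedom; the function name `one_way_f` and the toy data are hypothetical and unrelated to the study's samples.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group variability: group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group variability: observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data for four groups (made up for illustration).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_f(groups)
```

A large ratio indicates that the group means differ by more than the within-group scatter would explain.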
We began with a descriptive model of data from four groups, wherein the parameters were meaningful measures of central tendency, variance, and normality. Bayesian inference reallocates credibility to parameter values that are consistent with the observed data. The posterior distribution across the parameter values gives complete information about which combinations of parameter values are credible. In particular, from the posterior distribution we can assess the credibility of specific values of interest, such as zero difference between means, or zero difference between standard deviations. We can also decide whether credible values of the difference of means are practically equivalent to zero, so that we accept the null value for practical purposes.
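The idea of treating a difference of means as practically equivalent to zero can be sketched with posterior draws. The interval limits of ±0.1, the name "region of practical equivalence" (ROPE), the function `rope_summary`, and the simulated draws are all assumptions for illustration; none of them comes from this study.

```python
import random


def rope_summary(diff_draws, rope=(-0.1, 0.1)):
    """Fractions of posterior draws below, inside, and above the ROPE."""
    low, high = rope
    n = len(diff_draws)
    below = sum(d < low for d in diff_draws) / n
    inside = sum(low <= d <= high for d in diff_draws) / n
    above = sum(d > high for d in diff_draws) / n
    return below, inside, above


# Stand-in posterior draws of a difference of means (hypothetical values).
rng = random.Random(1)
diff_draws = [rng.gauss(0.3, 0.05) for _ in range(5000)]
below, inside, above = rope_summary(diff_draws)
```

If almost all of the posterior mass lies inside the region, the null value can be accepted for practical purposes; if almost all of it lies outside, the null value can be rejected.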
The Bayesian posterior distribution can also be used as a complete hypothesis for assessing power, that is, the probabilities of achieving research goals such as rejecting a null value, accepting a null value, or reaching a desired precision of estimation. The power estimation incorporates all the information in the posterior distribution by integrating across the credible parameter values, using each parameter-value combination to the extent it is credible. Figure 1 shows histograms of data that are labeled with $X_i$ on their abscissas, and these data are fixed at their empirically observed values.

Bayesian Multiple Linear Regression
Now we need to find a fitting function for the above data. A linear regression model in which more than one variable is involved is called a multiple regression model [36]-[38], $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$, where we have $k$ regressors and the parameters $\beta$ are the regression coefficients. The Bayesian approach to multiple linear regression is then [36] [37] ( ) ( ) The prior distribution of ( ) ( ) ( ) ( ) The conjugate prior distribution will be given by We can see that both matrices have two terms, one that depends only on the prior and another that depends only on the data. This is very useful because in each iteration we have to update only the last term. The question to be dealt with is the choice of the hyperparameters $C_2$ and $\theta_2$ ( ) Then, we can write the equation as follows [38] (
Figures 2-5 show the HDI, which stands for Highest Density Interval. Values inside the HDI have a greater probability density (credibility) than values outside it. Therefore, the 95% HDI includes the most credible parameter values. There is a way that the posterior 95% HDI could exclude zero even when the data have a frequency of zero: it can happen if the prior already excludes zero. This interval is useful as a summary of the distribution and as a decision tool. The decision rule is simple: any value outside of the 95% HDI is rejected [38].
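The HDI decision rule described above can be sketched from MCMC output. The sketch assumes the posterior is available as a list of draws and is unimodal, so the HDI is the narrowest interval containing the requested mass; the function name `hdi` and the simulated draws are hypothetical stand-ins, not the posterior from this study.

```python
import math
import random


def hdi(draws, mass=0.95):
    """Narrowest interval containing `mass` of the sampled values."""
    ordered = sorted(draws)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of draws the interval must contain
    best_low, best_high = ordered[0], ordered[window - 1]
    # Slide a window of fixed count over the sorted draws, keep the narrowest.
    for i in range(1, n - window + 1):
        low, high = ordered[i], ordered[i + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Stand-in posterior draws for a difference of coefficients (made up).
rng = random.Random(42)
beta_diff = [rng.gauss(0.3, 0.045) for _ in range(10000)]
low, high = hdi(beta_diff)
rejects_zero = not (low <= 0.0 <= high)  # zero lies outside the 95% HDI
```

For a symmetric posterior the HDI is close to the equal-tailed interval; for a skewed posterior the two differ, which is why the narrowest-interval search is used here.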
In all four cases the focus is on assessing whether the predictors were differentially predictive, for which we examine the posterior distribution of the differences between standardized regression coefficients, given that the comparison is based on normalization using the single sample.
As shown in Figure 2, for the data fitted to $f(X_2, X_3, X_4)$, none of the coefficients are within the supposed interval.
We observed that for $X_1$ (sense positive of life) the differences in $\beta$ are outside the interval considered credible, indicating that $X_1 \sim f(X_2, X_3, X_4)$ cannot be a linear combination of the other dimensions.
Figure 4 shows the outcomes for the dataset $X_2 \sim f(X_1, X_3, X_4)$, where $X_2$ is satisfaction with life: $\beta_2 - \beta_3 = 0.301$, and the HDI goes from 0.209 to 0.391. Therefore, if $X_1$ and $X_3$ are equally costly to measure and we want to avoid the double cost of measuring both, it is probably more effective to assess $X_2$ than $X_3$.
Figure 5 shows the outcomes for the dataset $X_3 \sim f(X_1, X_2, X_4)$, where $X_3$ is personal fulfillment: $\beta_2 - \beta_3 = 0.238$, and the HDI goes from 0.149 to 0.325. Therefore, if $X_1$ and $X_2$ are equally costly to measure and we want to avoid the double cost of measuring both, it is probably more effective to assess $X_1$ than $X_4$.
Figure 5 shows the outcomes for the dataset $X_4 \sim f(X_1, X_2, X_3)$, where $X_4$ is joy of living: $\beta_1 - \beta_2 = 0.106$, and the HDI goes from 0.0205 to 0.191. Therefore, if $X_1$ and $X_2$ are equally costly to measure and we want to avoid the double cost of measuring both, it is probably more effective to assess $X_1$ than $X_2$.

Conclusions
Bayesian methods have been developed as a tool for reasoning quantitatively in situations where arguments cannot be made with certainty. Recent developments have focused on Markov chain Monte Carlo algorithms, which in many situations are the only way to integrate over the parameter space. The use of posterior predictive distributions makes the method robust to the choice of priors on the model parameters and enables the use of improper priors even when only very few observations are available. To measure the agreement between posterior predictive distributions, we derive a measure which has an intuitive probabilistic interpretation.
In discussing the results, we found that the sample does not exhibit atypical behavior and that suitable modifications can be described through a test. Another interesting result is that the predictive probability for the $X_1$ (sense positive of life) and $X_3$ (personal fulfillment) dimensions exhibits non-uniform variation, while the other factors are uniformly distributed.
The working hypothesis was that sample analysis would allow us to infer that happiness is affected by only three dimensions: $X_1$ (sense positive of life), $X_3$ (personal fulfillment), and $X_2$ (satisfaction with life). In this context, we note that the hypothesis was tested: the marked tendency in the distributions of the latter factors was sufficient to support this theory, and the Bayesian multiple linear regression also tested this hypothesis.
Due to recent revolutionary advances in Bayesian posterior computation via computer-intensive MCMC simulation techniques, difficulties with posterior computations can be overcome.A Bayesian state-space model is readily implemented using standard Bayesian software such as JAGS, BUGS, NIMBLE and STAN.One can therefore avoid writing one-off programs in a low-level language.Any modifications, such as different prior distributions, applications to different data sets, or the use of different sampling distributions, require the change of just a single line in the code.

[30]-[33] functions associated with the samples, and $\pi(k)$ denotes a prior density on $\Theta$ over the models, on $\theta_k$ given the model $M_k$. The posterior for $M_k$ is NIG (Normal-Inverse-Gamma) and is given by ( ) b B n S , which are hyperparameters.
Let $Y_n$ denote the observed data. Assume that $Y_n$ is to be described using a model $M_k$ selected from a set of candidate models $\{M_1, \ldots, M_l\}$. Assume that each $M_k$ is uniquely parameterized by $\theta_k$, an element of the parameter space $\Theta(k)$. In the multiple comparisons problem, the class of candidate models consists of all possible mean level clusterings. Each candidate model is parameterized by its mean vector.