Does Education Affect Individual Well-Being? Some Italian Empirical Evidences

Using data from the latest European Union Statistics on Income and Living Conditions (EU-SILC) survey, this paper focuses on the measurement of well-being and on its association with education. The EU-SILC survey provides information on several aspects of people's daily life (housing, labour, health, education, finance, material deprivation and possession of durables), allowing a multidimensional approach to the study of well-being, poverty and social exclusion. For our purposes we have considered only the survey data collected in Italy. Given the multidimensionality of the well-being concept, we have selected variables related principally to four main dimensions of well-being: financial endowment, housing conditions and goods possession, health status, and environment. A first exploratory analysis via a multivariate regression model has highlighted the effect of education on the dimensions considered. Finally, a latent class regression analysis has been used to cluster individuals into mutually exclusive latent classes which identify different intensities of well-being (the latent trait), taking into account the effect of education on the membership probability of each latent class.


Introduction
Education, training and skills affect well-being and open up opportunities that would otherwise be precluded. Education not only has value in itself, but also affects people's well-being in direct and indirect ways. Generally, more educated people earn more, have a higher standard of living, live longer thanks to better lifestyles, have more opportunities to find work in less risky environments, enjoy greater access to and possession of goods and services and, in general, lead more active lives. In order to assess the relationship between education and living conditions, it is first necessary to define and measure the concept of well-being. It is a complex and multidimensional concept involving different aspects or dimensions of individuals' lives. This complexity has led empirical studies to shift attention from an overall measure of well-being to specific aspects of it (health, financial problems, etc.), considering jointly the main determinants of the well-being dimensions (age, education, and so on). In this paper four dimensions of well-being have been selected and synthesized via item response theory models. Afterwards, the main determinants of well-being have been analyzed using a multivariate regression analysis, in order to identify the individual characteristics which mainly influence the well-being domains and to assess differences in the way they affect the four domains. The first findings show that more educated people have a significantly and substantially better standard of well-being in at least three of the four domains. This suggests further investigating the relationship between overall well-being and education using latent class regression analysis.
This methodology allows us to identify homogeneous clusters of individuals who share a common level of well-being and to assess how educational level affects the probability of belonging to segments of the population characterized by different intensities of the underlying trait.

The Well-Being Concept: Quality of Life, Objective and Subjective Well-Being
For a long time economists analysed and measured well-being by means of income. Later it was argued that income is only a partial and incomplete measure of a concept as wide as well-being; attempts were therefore made to define alternative or complementary measures. The limits of income as an indicator of well-being have been recognized in the economic literature as well as by policy makers [1]- [3], and consequently several approaches to measuring well-being have been advanced, recognizing its multidimensionality and trying to identify its dimensions [4] [5].
In the last two decades the concept of well-being and its measurement have become a relevant and topical issue in the social, economic and psychological literature, leading to a proliferation of theoretical and empirical articles. The domains-of-life literature has introduced a multidimensional approach to well-being, accounting for different aspects of individuals' lives such as material and socio-environmental conditions, personal perceptions, aspirations and feelings [6] [7]. In this stream of literature, the concept of well-being has been developed with specific attention to social and non-economic aspects. The analysis of well-being needs to distinguish between the individual level and the population level [8], and to assess it from objective and/or subjective perspectives [9]- [11]. Objective well-being can be assessed through economic and social indicators such as health or safety, while subjective measures require an individual judgement on the different domains of daily life [12] [13].
The OECD encourages reflection on well-being through various projects and initiatives aimed at reaching a better measure of well-being and at providing stronger evidence for policy makers in order to obtain "better policies for better life". To this end, the OECD has proposed a selection of suitable indicators for comparing some dimensions of well-being in developed and selected emerging economies, but it does not attempt the final step of aggregating the data into a composite measure of well-being or of classifying individuals into well-being classes. In Italy, the National Institute of Statistics (ISTAT) and the National Council for Economy and Labour (CNEL) promote a research programme aimed at the construction of appropriate indicators of well-being.

Education and Well-Being
The empirical literature shows that well-being is largely related to geographical characteristics and socio-demographic factors. Among socio-demographic factors, empirical studies include variables such as age, occupation, level of education, marital status and the size and composition of the family [14] [15]. It is common in socio-economic studies to explore the determinants of well-being by including in the analysis factors such as age, gender, socio-economic status, geographic location and education [16] [17]. The effects of higher levels of education on different facets of an individual's life have long been discussed in the literature [18]. Educational achievement up to secondary level has a positive impact on well-being [19]- [21]. Tertiary-educated people have better health, better jobs, most often higher earnings, greater wealth and wider social networks than their counterparts with lower levels of education. Therefore the tertiary-educated should achieve higher utility, or overall well-being. Education is traditionally positively correlated with features of objective well-being such as employment opportunities, income, wealth, and health [22]- [26]. Studies focusing on educational achievement and well-being underline the lack of coherent theories linking the two concepts, as well as the need for further empirical and theoretical investigation of the possible interconnections between them [27]. Empirical evidence has shown that individuals with a higher level of education have a higher level of well-being and a better chance of having a job [28] [29]; they also live longer, have better lifestyles and participate more in social and cultural activities [30] [31].
In the following we first describe the methodologies used to: 1) operationalize the domains of well-being, 2) measure the way individuals' characteristics influence well-being and assess differences in their effects across domains, and 3) cluster individuals into mutually exclusive latent classes which identify different segments of the latent trait, conditional upon educational level. The proposed methods are applied to the EU-SILC data for Italy in order to shed light on the links across well-being domains and to investigate the strength of the overall relationship with educational level.

Methodology
In the following subsections we briefly introduce the methodologies used to pursue these aims.

IRT and Individual Domains of Life
The Item Response Theory (IRT) approach [32]- [34] has been adopted here to select, from the EU-SILC database, the variables most useful for operationalizing individuals' positions on the four dimensions of well-being listed above. Items with a negative directionality with respect to the latent trait have been reversed, so that higher response categories denote higher levels of the underlying latent trait. Scales composed only of dichotomous items have been scaled using the two-parameter logistic model, which specifies the logit of the probability that individual i scores positively on item j as a function of a subjective "person parameter" ($\eta_i$) and two objective item parameters, namely an "item difficulty parameter" ($b_j$) and an "item discrimination parameter" ($a_j$):
$$\operatorname{logit} P(Y_{ij} = 1 \mid \eta_i) = a_j (\eta_i - b_j).$$
The difficulty parameter $b_j$ specifies the point along the latent trait continuum at which the probability of scoring a certain category (i.e., "yes") is equal to 0.5, and the discrimination parameter $a_j$ provides information on the steepness of the logistic function. The person parameter is a random term with a normal distribution. Individuals' scores on the latent trait, expressed by the person parameters, are measured on the same metric as the item parameters. All aspects measured by scales composed of dichotomous and/or ordinal polytomous items have been scaled using the graded response model (GRM):
$$\operatorname{logit} P(Y_{ij} \ge k \mid \eta_i) = a_j (\eta_i - b_{jk}),$$
where $b_{jk}$, for $k = 1, \dots, K$, are item-category parameters (or threshold parameters) which signal the minimum level of well-being required in order to endorse category k of item j. The GRM can handle scales in which items have different numbers of categories. The ltm package for the R language has been used to fit a unidimensional IRT model for each dimension of well-being [33]. Person parameters on the four latent traits have been estimated using Empirical Bayes estimates.
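As a rough illustration of the two IRT models described above, the following Python sketch evaluates the response probabilities they imply. The names `a` (discrimination), `b` (difficulty or threshold) and `eta` (person parameter), the function names and the example values are assumptions made here for illustration only; the estimation reported in this paper was carried out with the ltm package in R, not with this code.

```python
import math


def two_pl_probability(eta, a, b):
    """Probability of a positive response under the two-parameter logistic
    model: the logit of the probability equals a * (eta - b)."""
    return 1.0 / (1.0 + math.exp(-a * (eta - b)))


def grm_category_probabilities(eta, a, thresholds):
    """Category probabilities under the graded response model.

    `thresholds` must be an increasing list of item-category parameters;
    a list of m thresholds yields m + 1 ordered response categories.
    """
    # Cumulative probabilities P(Y >= k), bracketed by 1 (lowest category
    # or above) and 0 (above the highest category).
    cumulative = (
        [1.0]
        + [two_pl_probability(eta, a, b) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Illustrative values only (not estimates from the paper).
    print(two_pl_probability(eta=0.0, a=1.5, b=0.0))  # person located at the difficulty
    print(grm_category_probabilities(eta=0.3, a=1.2, thresholds=[-1.0, 0.0, 1.5]))
```

When the person parameter equals the item difficulty, the first function returns 0.5, which is the interpretation of the difficulty parameter given in the text.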

Multivariate Regression Analysis
The relationships between the estimated person parameters on the four domains and the covariates have been investigated in a first exploratory phase via Multivariate Regression Analysis (MRA). The modelling approach has been specified to jointly assess the effect of individual and family covariates on differences in the four dimensions of well-being considered here. Denoting by $y_i^{(d)}$, $d = 1, \dots, 4$, the score of individual i on each dimension of well-being (summarized through IRT scoring), the model has been specified as follows:
$$y_i^{(d)} = \beta_0^{(d)} + \mathbf{x}_i' \boldsymbol{\beta}^{(d)} + \mathbf{z}_i' \boldsymbol{\gamma}^{(d)} + \varepsilon_i^{(d)},$$
where d indicates the dimension to which the indicator of well-being refers, $\varepsilon_i = (\varepsilon_i^{(1)}, \dots, \varepsilon_i^{(4)})'$ is a multivariate normal vector of random terms which takes into account the within-individual variability across the four dimensions, and $\mathbf{x}_i$ and $\mathbf{z}_i$ are two vectors of individual (x) and family (z) covariates. The model has been estimated in Stata using the runmlwin routine implemented by Leckie and Charlton [35], specifying Iterative Generalized Least Squares as the estimation method. The multivariate approach allows us to assess the degree of correlation between individual well-being on the four dimensions and the relative effect of the covariates. Moreover, the main advantage of using a multivariate model rather than a separate model for each dimension is that all the information available on individuals is explicitly considered in the estimation of the regression parameters, and more accurate estimates are obtained by relying on the maximization of the joint likelihood function.

Latent Class Regression Analysis
In a second step, on the basis of the main results of the exploratory analysis, the relationship between overall well-being and education has been investigated using latent class regression analysis (LCRA). LCRA [36] [37] is a finite mixture model designed to identify a number of categorical classes of a latent variable, starting from individual responses to a set of indicator variables and considering, jointly, the effect of covariates on the probability of belonging to a certain latent class. The latent variable is assumed to be categorical, and the original data are segmented into R mutually exclusive and exhaustive subsets: the latent classes. Observations are grouped into classes, and regressions are estimated to assess the effect of covariates on the probability of belonging to a specific latent class (latent class membership).
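Consider J polytomous or dichotomous manifest variables observed over i = 1, ..., n individuals. Each manifest variable j has $K_j$ possible outcomes. $Y_{ijk}$ denotes a dummy variable which takes the value 1 if, for individual i, the k-th outcome is observed on manifest variable j, and 0 otherwise. The probability density function of the responses of individual i is
$$f(Y_i \mid x_i) = \sum_{r=1}^{R} p_r(x_i) \prod_{j=1}^{J} \prod_{k=1}^{K_j} \pi_{rjk}^{Y_{ijk}},$$
where $\pi_{rjk}$ is the probability that, for an individual in class r, we observe the k-th outcome on manifest variable j (i.e., the item-response probability conditional upon latent class membership), and $p_r(x_i)$ is the probability of being classified in latent class r (latent class membership probability). $p_r(x_i)$ is a function of the vector $x_i$ of covariates through a multinomial logit link [38] [39]:
$$p_r(x_i) = \frac{\exp(x_i' \beta_r)}{\sum_{q=1}^{R} \exp(x_i' \beta_q)},$$
with the coefficient vector of the baseline class set to zero. For each latent class, $\beta_r$ stands for the effect of the covariates on the logit of belonging to class r rather than to the baseline class. The probability that an individual with a particular pattern of manifest variables and with a specific set of covariate values ($x_i$) belongs to a specific class is given by Bayes' theorem:
$$P(r \mid x_i, Y_i) = \frac{p_r(x_i) \prod_{j=1}^{J} \prod_{k=1}^{K_j} \pi_{rjk}^{Y_{ijk}}}{\sum_{q=1}^{R} p_q(x_i) \prod_{j=1}^{J} \prod_{k=1}^{K_j} \pi_{qjk}^{Y_{ijk}}}.$$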
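The two ingredients of the model described above, the multinomial logit for class membership and the Bayes step for the posterior class probability, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the nested-list layout of the item-response probabilities and the toy numbers are assumptions made here, and the code performs no estimation (the parameters are taken as given).

```python
import math


def class_membership_probs(x, betas):
    """Prior class probabilities p_r(x) from a multinomial logit link.

    `betas[r]` is the coefficient vector of class r; the baseline class is
    represented by a vector of zeros. `x` is the covariate vector
    (including a leading 1.0 if an intercept is wanted).
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_class_probs(responses, x, betas, pi):
    """Posterior class probabilities via Bayes' theorem.

    `responses[j]` is the index of the outcome observed on item j and
    `pi[r][j][k]` is the probability of outcome k on item j in class r.
    """
    prior = class_membership_probs(x, betas)
    joint = []
    for r, p_r in enumerate(prior):
        likelihood = 1.0
        for j, k in enumerate(responses):
            likelihood *= pi[r][j][k]  # local independence within a class
        joint.append(p_r * likelihood)
    total = sum(joint)
    return [v / total for v in joint]


if __name__ == "__main__":
    # Toy example: two classes, one dichotomous item, no covariate effect.
    pi = [[[0.8, 0.2]], [[0.4, 0.6]]]
    betas = [[0.0, 0.0], [0.0, 0.0]]
    print(posterior_class_probs([0], [1.0, 3.0], betas, pi))
```

In the toy example both classes are equally likely a priori, so the posterior is driven only by the item-response probabilities (0.8 against 0.4 for the observed outcome).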

Data
The European Union Statistics on Income and Living Conditions (EU-SILC) survey, launched in 2003, aims at gathering comparable cross-sectional and longitudinal individual-level data on income, poverty, social exclusion and living conditions in 27 participating countries (as of 2011). Information on housing is gathered mainly at household level, while education, labour, income and health information is collected only for individuals aged 16 and over. The data considered here refer to the Italian sample of the EU-SILC survey. The total sample size is over 40 thousand observations clustered in nearly 20 thousand families. For our research some eligibility criteria have been set: only individuals aged 25 and over have been retained for the analysis, and only one person per family has been randomly selected. In this way 18,845 records have been considered. Further records have been deleted because of missing observations on most of the variables analysed. The final sample size is therefore 17,783 observations.
The following socio-demographic covariates have been considered: Sex, Region (residence), Age, Marital status (married, never married, separated, widowed), Family type (couple with no kids, couple with kids, single parent, etc.), Income (total family income in thousands of euros), Employment status (unemployed, retired, employed or self-employed, not in working status), Job (classification of occupations according to ISCO 2008), and Education (pre-primary, primary, lower secondary, upper secondary, tertiary). Moreover, 22 manifest variables (gathered as questionnaire responses) have been analyzed. For ease of reading, we report here the broad categories into which the variables have been classified for the subsequent analysis (with the variables considered in brackets). Variables are either dichotomous (yes/no) or ordinal polytomous.
The four dimensions are: Finance (5 variables: arrears on utility bills, capacity to afford a one-week annual holiday, capacity to afford a meal with meat/fish every second day, ability to make ends meet, financial burden of the total housing cost), Housing and goods possession (10 variables: problems with dwelling maintenance, ability to keep home warm, total housing cost, and possession of a telephone, computer, dishwasher, DVD player, satellite TV, Internet connection and camera), Health conditions (3 variables: general health, limitation in activities because of health problems, chronic illnesses) and Environment (4 variables: noise from neighbours/street, pollution, crime or violence in the area, dwelling too dark).

The Indicators of Well-Being
In a first step, all the items in the survey questionnaire related to one of the four dimensions of well-being have been considered in the analysis, with categories as originally defined in the questionnaire. Relevant items and categories have then been selected on the basis of the IRT parameters and the item (and test) information functions. Moreover, the continuous variable "total housing cost" has been categorized into three intervals. The final set of items used to measure the Finance dimension is composed of two dichotomous and three polytomous items. The Item Characteristic Curves (ICC) describe how the probability of answering in each category varies as the value of the latent trait increases (Figure 1). The most informative item is the capability to make ends meet (ENDMEETS), which has six response categories that clearly identify six intervals related to segments of the population with different levels of the latent trait. The high discrimination parameter of this item signals that the probability of scoring one category rather than another changes markedly as the level of financial well-being increases. The low value of the threshold parameter of the "capacity to afford a meal with meat/fish every second day" (FOOD) helps to differentiate between different levels of negative financial well-being, whereas the value close to 0 of the threshold parameter related to "HOLIDAY" helps to differentiate between individuals in the negative and in the positive segments of the underlying latent trait. Further information is added to the scale by the two items arrears on utility bills (ARRBILL) and financial burden of the total housing cost (HOUSEEXP), which have categories with extreme threshold parameters. These two items effectively allow differentiating between people located in the positive and negative tails of the distribution: e.g. the lowest item-category parameter of "ENDMEETS" is greater than the lowest item-category parameter of "ARRBILL".
The items of the Finance domain satisfactorily cover the whole range of the latent trait: the total test information in the negative segment of the latent trait [between −4 and 0] is almost equal to that contained in the positive segment [between 0 and +4], and the total information in the two segments is about 98%. The same calibration procedure has been applied to select indicators and categories for scaling the sub-components "Housing and goods possession", "Health" and "Environment". Environment is the dimension which shows the greatest weakness in terms of the capacity of its items to provide information for differentiating between individuals located at higher levels of environment-related well-being. For all four dichotomous items of the environmental well-being domain it is relatively easy to answer in the positive category, so almost all the item parameters are located on the negative side of the latent trait. As a result, for almost all individuals on the positive segment of environmental well-being the indicator takes values close to each other and close to 0. This consideration motivates the choice to categorize the latent traits built with IRT by means of a quartile-based transformation in the analysis aimed at clustering individuals into mutually exclusive latent classes characterized by different intensities of well-being.
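A quartile-based transformation of a continuous score can be sketched as follows in Python. The function name and the toy scores are hypothetical, and the cut points come from the default method of the standard library's `statistics.quantiles`, which is an assumption: the paper does not state which quartile definition was used.

```python
import statistics


def quartile_categories(scores):
    """Map each continuous score to a quartile category from 1 to 4.

    Cut points are the three sample quartiles of `scores`; a score equal
    to a cut point is assigned to the lower category.
    """
    q1, q2, q3 = statistics.quantiles(scores, n=4)

    def categorize(score):
        if score <= q1:
            return 1
        if score <= q2:
            return 2
        if score <= q3:
            return 3
        return 4

    return [categorize(s) for s in scores]


if __name__ == "__main__":
    # Toy scores only (not person parameters from the paper).
    toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(quartile_categories(toy_scores))
```

When many individuals share nearly identical scores, as described above for the positive segment of the Environment indicator, several cut points can coincide and some categories may end up empty or very small.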

Well-Being and Individual Characteristics
Multivariate regression analysis makes it possible to highlight the individual characteristics affecting the level of well-being, as well as to measure how their effects vary across the well-being domains, taking as response variables the four latent traits measured via the IRT models. Following the main empirical findings of the literature, we specify a multivariate regression model with age, marital status, education, family type and the occupational status of the respondent as covariates, and the four indicators obtained from the IRT models as dependent variables. The main results are reported in Table 1.
The covariates have a strong effect on each dimension, with some peculiarities. The level of EDUCATION has a positive effect on all dimensions except ENVIRONMENT; AREA is significant only for the indicators of environment and health. MARITAL STATUS shows that married couples have a better level of well-being in terms of Finance and Housing. Regarding FAMILY TYPE, families with children have a lower level of well-being in terms of their financial endowment. There is a gender effect, as males reach a better level of well-being in all four dimensions. Finally, regarding JOB STATUS, the employed or self-employed have a better level of well-being, while retired persons have lower levels of well-being in terms of Housing and Health.
It is interesting to highlight that, even when controlling for other key covariates (such as FAMILY TYPE and JOB STATUS), the confidence intervals of the parameters related to increasing levels of education do not overlap for two (Finance and Housing) out of the four domains of well-being considered, with an average advantage of about 1.2 for people with the highest level of education over those with the lowest. With respect to the health domain, too, the results reveal a dichotomy between people with primary or lower levels of education and those who have at least a lower secondary level. These results suggest further investigating the relationships by categorizing the metric indicators built with IRT and adopting LCRA to assess the overall relationship between well-being and education.

A LCRA Analysis
Multivariate regression has pointed out that education affects the dimensions of well-being. In order to identify groups of individuals that share a homogeneous level of well-being, controlling for the level of education, we fit a LCRA. The manifest variables include three indicators built with IRT. To fit the model (as the adopted routine requires categorical inputs), the indicators related to Finance, Housing and goods possession, and Health have been categorized using a quartile-based transformation (the new variables so obtained have the capital letter "K" as a prefix in their names: KFIN, KHOUSE and KHEALTH); for the Environment dimension (due to the weakness of the IRT indicator) we include the four dichotomous variables concerning problems with darkness in the dwelling, crime, pollution and noise. In selecting the appropriate number of latent classes we have considered the Bayesian Information Criterion (BIC), which declines drastically from one to three classes and then begins to level off. Consequently, a three-class model has been selected. For each indicator the p-value is less than 0.05, so the null hypothesis that all of the effects associated with that indicator are zero is rejected. Thus, for each indicator, knowledge of the response contributes significantly to the ability to discriminate between the clusters. The estimated item-response probabilities conditional upon latent class membership for the three-class model are depicted in Figure 2. Overall, in this three-class solution, the largest class represents "good well-being" individuals and comprises 54% of the sample, cluster 2 contains 32% and the remaining 14% are in cluster 3. The conditional probabilities show the differences in response patterns that distinguish the clusters. Class 1 (LC1) clusters "good well-being" individuals, Class 2 (LC2) "poor well-being" individuals and Class 3 (LC3) "moderate well-being" individuals.
Individuals in both LC1 and LC2 have a good perception of their environment, but whereas individuals in LC1 have a higher probability of being in the third or fourth quartiles of the distributions of the well-being domains, people in LC2 have a higher probability of being in the first or second quartiles. These classes are opposite in terms of well-being. LC3 identifies people in an intermediate position who differ in their feeling about the environment, i.e. they have a bad perception of it. Figure 3 shows the probabilities of latent class membership with respect to the level of education. The probability of membership of LC1 increases as the educational level of individuals increases. The opposite occurs for LC2. Finally, for LC3 the probability of membership increases up to the lower secondary educational level and decreases thereafter. This empirical evidence confirms the main findings of previous research: more educated people have a better chance of reaching higher levels of well-being, i.e. the probability of membership of LC1 increases and, conversely, the probability of membership of LC2 decreases.
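The model selection step described above compares BIC values across models with different numbers of classes. A minimal Python sketch of the quantities involved is given below, using the usual definition BIC = -2 log L + p ln n and the standard count of free parameters in a latent class regression model. The function names are hypothetical, and the number of covariate coefficients per class in the example (5, e.g. an intercept plus four education dummies) is an assumption; no log-likelihood values from the paper are reproduced.

```python
import math


def lcra_free_parameters(outcomes_per_item, n_classes, n_covariate_coefs):
    """Number of free parameters in a latent class regression model.

    Each class has (K_j - 1) free item-response probabilities per item j,
    and each non-baseline class has its own covariate coefficient vector.
    """
    item_params = n_classes * sum(k - 1 for k in outcomes_per_item)
    membership_params = (n_classes - 1) * n_covariate_coefs
    return item_params + membership_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better
    trade-off between fit and complexity."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


if __name__ == "__main__":
    # Three quartile-coded indicators (4 outcomes each) and four
    # dichotomous environment items, as described in the text.
    items = [4, 4, 4, 2, 2, 2, 2]
    for n_classes in (1, 2, 3, 4):
        print(n_classes, lcra_free_parameters(items, n_classes, n_covariate_coefs=5))
```

The parameter count grows linearly with the number of classes, so the BIC penalty eventually outweighs small gains in log-likelihood, which is why the criterion levels off once additional classes stop improving the fit appreciably.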

Conclusions
Well-being is a multidimensional concept that requires considering different aspects of life. The analysis of well-being requires identifying its dimensions and verifying whether it is affected by individual socio-demographic characteristics. The literature has shown that, among other factors such as age and sex, education is a very important factor for the improvement of well-being. In this paper, using the latest EU-SILC data, the well-being of the Italian sample has been analysed and measured through four main indicators relating principally to Finance, Housing and goods possession, Health and Environment. These dimensions are not exhaustive, and other dimensions and variables may undoubtedly better define and complete the measurement of well-being.
However, this analysis is a starting point that synthesizes some relevant facets of well-being, and at the same time it confirms for Italian people the relation between well-being and education. Indeed, the LCRA has made it possible to define three latent classes (good, moderate and poor well-being) and to assess the effect of education on the probability of membership of each latent class. According to the analysis, as the level of education increases, the probability of membership of the "good well-being" latent class also increases, while the opposite holds for the "poor well-being" class. Higher levels of education thus improve individual well-being, as the probability of membership increases only for the "good well-being" latent class (in line with the main evidence in the literature). The three empirical analyses have made it possible to examine the research problem from different points of view. Firstly, using IRT models, four main dimensions of well-being have been operationalized by summarizing individuals' responses to selected items of the EU-SILC questionnaire. Afterwards, a multivariate regression analysis has been carried out to highlight the factors which have relevant effects on the well-being dimensions, as well as to assess differences in the size of these effects across the domains. Finally, the results of the two previous analyses, which show a persistent effect of educational level on the well-being domains even when controlling for other relevant characteristics (such as FAMILY TYPE and JOB STATUS), together with the discrete distribution of at least two of the four indicators of the well-being domains, motivate the use of a multivariate modelling approach for categorical indicators in order to better summarize the relationship between overall well-being and educational level.
The LCRA has made it possible to identify three main latent classes of well-being and to assess the effect of education on each latent class membership probability, showing that the higher the level of education, the greater the probability of belonging to classes characterized by higher levels of well-being. This empirical evidence has particular relevance in Italy where, despite the improvement in the educational level of the population over the last decade, the educational system is not yet able to offer all young people the possibility of an adequate education. The lag with respect to the European average and the strong regional divide are the main weaknesses of the Italian educational system. The level of education that Italian people are able to reach is related to social origin, the socio-economic context and the territory. Improving well-being in Italy requires that the educational system also improves, removing its weaknesses and especially reducing the regional gap.