TITLE:
The Mean Treatment Effect Was Estimated Using a Machine-Learning Model: Evidence from the ECLS-K Dataset
AUTHORS:
Shenshuo Zhang
KEYWORDS:
Bayesian Additive Regression Trees (BARTs), Causal Inference, Early Childhood Education, Causal Machine Learning, Nonparametric Estimation
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol. 13 No. 3, August 28, 2025
ABSTRACT: This study investigates the persistent academic impacts of Head Start, a federally funded early childhood intervention, using data from the Early Childhood Longitudinal Study-Kindergarten Cohort (ECLS-K). Bayesian Additive Regression Trees (BART) serve as the primary method for estimating average, conditional, and individual-level treatment effects on children’s mathematics achievement. BART estimates a negative Average Treatment Effect (ATE) of −1.5421, with progressively larger adverse effects for children of higher Socioeconomic Status (SES), suggesting diminishing marginal returns. This finding demonstrates BART's ability to detect nonlinear moderation patterns that elude conventional models. It also implies that Head Start and other preschool interventions will yield greater policy returns when targeted at low-SES children, enabling a more efficient and fair distribution of public funds. For comparison, Causal Forest estimates an ATE of larger magnitude (−2.4340) and identifies SES as the dominant moderator, while Propensity Score Matching yields a more conservative estimate (−1.2606) but does not account for effect heterogeneity. These findings underscore the utility of BART for estimating subtle, SES-varying effects of Head Start and suggest the potential value of more targeted intervention strategies guided by adaptive causal inference.
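
The abstract reports average, conditional, and individual-level treatment effects. As a rough illustration of how these three summaries relate to one another, the sketch below assumes that some fitted outcome model (BART or otherwise) has already produced a predicted outcome for each child under treatment and under control. The function name `summarize_effects`, the array names, the tercile split, and all numbers are hypothetical and chosen only for illustration; this is not the author's code and the values are not ECLS-K data.

```python
import numpy as np

def summarize_effects(mu1, mu0, moderator, n_groups=3):
    """Return (ite, ate, cate_by_group) from predicted potential outcomes.

    mu1, mu0  : predicted outcomes under treatment / control, one per child
    moderator : a covariate such as an SES index (hypothetical here)
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    moderator = np.asarray(moderator, dtype=float)

    ite = mu1 - mu0      # individual-level effect estimates
    ate = ite.mean()     # average treatment effect

    # Split the moderator into quantile groups (terciles by default)
    # and average the individual effects within each group.
    edges = np.quantile(moderator, np.linspace(0, 1, n_groups + 1)[1:-1])
    group = np.searchsorted(edges, moderator, side="right")
    cate = np.array([ite[group == g].mean() for g in range(n_groups)])
    return ite, ate, cate


# Synthetic illustration only: these numbers are invented, not ECLS-K data.
ses = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0])
mu0 = np.array([40.0, 42.0, 44.0, 46.0, 48.0, 50.0])
mu1 = mu0 - (1.0 + 0.5 * (ses + 1.5))  # effect grows more negative with SES

ite, ate, cate = summarize_effects(mu1, mu0, ses)
print("ATE:", ate)
print("CATE by SES tercile (low to high):", cate)
```

In this toy setup the group averages become more negative from the lowest to the highest SES tercile, which mirrors the kind of SES-varying pattern the abstract describes. With a Bayesian model such as BART, the same arithmetic would typically be applied to each posterior draw to obtain uncertainty intervals.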