TITLE:
A Hybrid CNN-LSTM Variational Autoencoder for Treatment Response Prediction in Synthetic Psychiatric Data
AUTHORS:
Rocco de Filippis, Abdullah Al Foysal
KEYWORDS:
Deep Learning, CNN-LSTM, Variational Autoencoder, Treatment Response Prediction, Precision Psychiatry
JOURNAL NAME:
Open Access Library Journal, Vol.13 No.1, January 15, 2026
ABSTRACT: Treatment response prediction remains one of the most pressing challenges in precision psychiatry, where patient heterogeneity and complex biomarker interactions limit the reliability of conventional clinical and statistical models. To address this gap, we present a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and a Variational Autoencoder (VAE) bottleneck to classify responders and non-responders from synthetic biomarker-inspired and clinical data. The architecture uses CNN layers to capture localized temporal features, LSTM layers to model sequential dependencies, and the VAE to enforce probabilistic latent representations that improve robustness and generalization. On this synthetic dataset, the proposed model achieved perfect performance across all evaluation metrics: classification accuracy reached 100%, and both the area under the Receiver Operating Characteristic curve (AUC) and the average precision (AP) score were 1.0, indicating flawless discrimination between the two groups. Probability estimates were ideally calibrated, with a Brier score of 0.000, and threshold-dependent analyses showed stable precision, recall, and F1 scores across a wide range of cutoffs. These results indicate that hybrid deep learning architectures can not only distinguish treatment response groups with high accuracy but also provide interpretable probability outputs that could support clinical decision-making. Although the findings are based on synthetic data, this study offers a strong proof-of-concept for multimodal predictive modelling in psychiatry. Future work should focus on validating the framework with real-world multimodal datasets, incorporating attention-based interpretability, and adapting the model across diverse patient cohorts to move closer to clinically actionable treatment response prediction.
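For readers unfamiliar with the metrics reported in the abstract, the following is a minimal sketch of how accuracy, AUC, and the Brier score can be computed for a binary responder/non-responder classifier. It is not the authors' evaluation code: the function names and the toy labels and probabilities are hypothetical and chosen only for illustration. The AUC here uses the pairwise (Mann-Whitney) definition, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Share of samples whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Hypothetical toy data (NOT from the study): 1 = responder, 0 = non-responder.
labels = [1, 0, 1, 0]
perfect = [1.0, 0.0, 1.0, 0.0]      # fully confident and correct
separable = [0.9, 0.2, 0.7, 0.4]    # correctly ranked, but less confident

for name, probs in [("perfect", perfect), ("separable", separable)]:
    print(name, accuracy(labels, probs), roc_auc(labels, probs), brier_score(labels, probs))
```

In this toy example both probability vectors give 100% accuracy and an AUC of 1.0, but only the fully confident one gives a Brier score of exactly zero. A Brier score of 0.000 therefore implies that the predicted probabilities sit at, or extremely close to, 0 and 1 for every sample.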