Predicting remission after internet-delivered psychotherapy in patients with depression using machine learning and multi-modal data


BACKGROUND: Whether a patient benefits from psychotherapy is arguably the result of a complex process, and heterogeneous information extracted from process, genetic, demographic, and clinical data could contribute to predicting remission status after psychotherapy. This study applied supervised machine learning to such multi-modal baseline data to predict remission in patients with major depressive disorder (MDD) after completed psychotherapy.

METHODS: Eight hundred ninety-four genotyped adult patients (65.5% women, age range 18-75 years) diagnosed with MDD and treated with guided Internet-based Cognitive Behaviour Therapy (ICBT) at the Internet Psychiatry Clinic in Stockholm between 2008 and 2016 were included. Predictor variables were available from four domains: demographic, clinical, process (e.g. time to complete online questionnaires), and genetic (polygenic risk scores for depression, education, and more). The outcome was remission status post-ICBT (cut-off ≤10 on MADRS-S). Data were split into training (60%) and validation (40%) sets based on treatment start date. Predictor selection relied on human domain knowledge followed by Recursive Feature Elimination. Model derivation was internally validated through repeated cross-validation resampling. The final random forest model was externally validated against (i) a null, (ii) a logit, (iii) an XGBoost, and (iv) a blended meta-ensemble model on the hold-out validation set. Model transparency was explored through partial dependence and Local Interpretable Model-agnostic Explanations (LIME) analysis.

RESULTS: Feature selection retained 45 predictors representing all four predictor types. On unseen validation data, the final random forest model proved reasonably accurate at classifying post-ICBT remission (Accuracy 0.656 [0.604, 0.705], P vs null model = 0.004; AUC 0.687 [0.631, 0.743]). It performed slightly better than the logit model (bootstrap D = 1.730, P = 0.084) but not better than XGBoost (D = 0.463, P = 0.643). Transparency analysis showed that the model used all predictor types at both the group and the individual patient level.

CONCLUSION: A new multi-modal classifier for predicting MDD remission status after ICBT treatment in routine psychiatric care was derived and empirically validated. The multi-modal approach to predicting remission may inform tailored treatment and deserves further investigation.
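
The outcome definition and the date-based split described in the methods can be illustrated with a minimal Python sketch. Only the MADRS-S cut-off (≤10) and the 60/40 proportions come from the abstract; the `Patient` record, the function names, the rounding rule, and the toy data are hypothetical and are not the authors' code.

```python
from dataclasses import dataclass
from datetime import date

# From the abstract: remission is a post-treatment MADRS-S score <= 10.
REMISSION_CUTOFF = 10


@dataclass
class Patient:
    patient_id: str    # hypothetical identifier
    start_date: date   # treatment start date
    madrs_s_post: int  # post-treatment MADRS-S total score


def is_remitted(patient):
    """Binary outcome label: True if the score is at or below the cut-off."""
    return patient.madrs_s_post <= REMISSION_CUTOFF


def temporal_split(patients, train_fraction=0.6):
    """Earliest-starting patients go to training, later ones to validation.

    Assumption: the split point is the rounded fraction of the sample size;
    the abstract does not say how ties or rounding were handled.
    """
    ordered = sorted(patients, key=lambda p: p.start_date)
    n_train = round(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


# Made-up toy records, for illustration only.
patients = [
    Patient("p1", date(2008, 3, 1), 8),
    Patient("p2", date(2015, 6, 12), 14),
    Patient("p3", date(2010, 1, 20), 10),
    Patient("p4", date(2012, 9, 5), 22),
    Patient("p5", date(2016, 2, 28), 5),
]

train, validation = temporal_split(patients)
train_labels = [is_remitted(p) for p in train]
validation_labels = [is_remitted(p) for p in validation]
```

Splitting on start date rather than at random means the validation patients all began treatment after the training patients, which mimics applying a model to future patients.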

In medRxiv.