Deans' stroke musings: Multiple imputation integrated to machine learning: predicting post-stroke recovery of ambulation after intensive inpatient rehabilitation

Changing stroke rehab and research worldwide now. Time is Brain! Trillions and trillions of neurons DIE each day because there are NO effective hyperacute therapies besides tPA (only 12% effective). I have 523 posts on hyperacute therapy, enough for researchers to spend decades proving them out. These are my personal ideas and blog on stroke rehabilitation and stroke research. Do not attempt any of these without checking with your medical provider. Unless you join me in agitating, when you need these therapies they won't be there.

What this blog is for:

My blog is not to help survivors recover; it is to have the 10 million yearly stroke survivors light fires underneath their doctors, stroke hospitals and stroke researchers to get stroke solved. 100% recovery. The stroke medical world is completely failing at that goal; they don't even have it as a goal. Shortly after getting out of the hospital and getting NO information on the process or protocols of stroke rehabilitation and recovery, I started searching on the internet and found that no other survivor received useful information. This is an attempt to cover all stroke rehabilitation information that should be readily available to survivors so they can talk with informed knowledge to their medical staff. It lays out what needs to be done to get stroke survivors closer to 100% recovery. It's quite disgusting that this information is not available from every stroke association and doctors' group.

Thursday, October 24, 2024

Multiple imputation integrated to machine learning: predicting post-stroke recovery of ambulation after intensive inpatient rehabilitation

Predicting failure to recover is ABSOLUTELY USELESS TO SURVIVORS! Have you never talked to a survivor? When the fuck will you do what survivors want? Maybe when you are the 1 in 4 per WHO that has a stroke?

(I'd have everyone fired who works on predicting recovery rather than delivering recovery!)

Scientific Reports volume 14, Article number: 25188 (2024)

Abstract

Good data quality is vital for personalising plans in rehabilitation. Machine learning (ML) improves prognostics, but integrating it with Multiple Imputation (MImp) for dealing with missingness is an unexplored field. This work aims to provide post-stroke ambulation prognosis, integrating MImp with ML, and to identify the influential prognostic factors. Stroke survivors in intensive rehabilitation were enrolled. Data on demographics, events, clinical, physiotherapy, and psycho-social assessment were collected. Independent ambulation at discharge, assessed using the Functional Ambulation Category scale, was the outcome. After handling missingness using MImp, ML models were optimised, cross-validated, and tested. Interpretability techniques analysed predictor contributions. Pre-MImp, the dataset included 54.1% women, 79.2% ischaemic patients, median age 80.0 (interquartile range: 15.0). Post-MImp, 368 non-ambulatory patients on 10 imputed datasets were used for training, 80 for testing. The random forest (the best-performing algorithm in validation) obtained 75.5% aggregated balanced accuracy on the test set. The main predictors included modified Barthel index, Fugl-Meyer assessment/motricity index, short physical performance battery, age, Charlson comorbidity index/cumulative illness rating scale, and trunk control test. This is among the first studies applying ML, together with MImp, to predict ambulation recovery in post-stroke rehabilitation. This pipeline reliably exploits the potential of incomplete datasets for healthcare prognosis, identifying relevant predictors.
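
For readers unfamiliar with the metric quoted in the abstract, here is a minimal sketch of binary balanced accuracy, which is the average of sensitivity and specificity. The function name and the toy labels are made up for illustration, and how the paper aggregates the metric across its 10 imputed datasets is not stated in this excerpt, so that step is not shown.

```python
def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)   # share of true positives that were found
    specificity = tn / (tn + fp)   # share of true negatives that were found
    return (sensitivity + specificity) / 2


# Hypothetical toy labels: 2 patients recover walking (1), 8 do not (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A lazy model that always predicts "no recovery".
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                      # 0.8
print(balanced_accuracy(y_true, y_pred))   # 0.5
```

The lazy model looks good on plain accuracy only because one class dominates. Balanced accuracy exposes it as no better than a coin flip, which is why the metric suits outcomes with unequal class sizes.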

Introduction

Personalised medicine represents a new frontier in healthcare. Data-driven approaches are crucial in optimising individualised rehabilitation pathways by providing reliable, interpretable, and patient-centric predictions¹. Moreover, there is a pressing demand for trustworthy prognostic solutions, enabling users to understand and interpret automatic decisions². However, while tools for personalised treatment decisions are becoming more prevalent in healthcare, their clinical validation and impact on treatment improvement remain largely underexplored³.

Treatment personalisation is particularly relevant in rehabilitative medicine⁴, where the goal is to adapt the rehabilitation plan to the unique needs of each patient, given the evidence of its positive effects on recovery⁵. According to Kokkotis et al.⁶, machine learning (ML) tools can be applied to predict long-term recovery rates from the earliest hours of hospitalisation after a stroke. This suggests ML can assist medical practitioners in deploying novel, individualised rehabilitation approaches, to enhance the quality of life for survivors and the overall quality of care. However, examples of technological tools that support personalised post-stroke rehabilitation treatments are still scarce⁷.

The use of ML technologies in healthcare presents pitfalls such as prediction inaccuracy, privacy vulnerabilities, and data scarcity that can hinder the attainment of real-life comparable results⁸. A critical challenge is collecting high-dimension and high-quality data for reliable and reproducible predictions, due to limitations in sample size and data quality in real-world scenarios⁸. In this context, the presence of missing data may represent a significant technical problem⁹, because it can result in a loss of information, reduced sample size, bias in the results, and underestimated uncertainty⁹,¹⁰,¹¹. When it is not possible to avoid missing values by optimising data collection, Multiple Imputation (MImp) is a suitable method for obtaining unbiased results while appropriately considering variability¹¹,¹². While in single imputation only one value is imputed for each missing entry, causing statistical analyses to overlook the uncertainty around the values which are not observed¹⁰, MImp is a statistical technique involving the generation of multiple plausible estimates for missing values, allowing a correct quantification of the uncertainty associated with missing observations in the data¹³.
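
The contrast between single and multiple imputation can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the paper's method: the scores are made up, all function names are hypothetical, and drawing from a normal distribution fitted to the observed values is a deliberately crude stand-in for a real imputation model (it ignores the other variables and the uncertainty in the fitted parameters). The pooling step uses Rubin's rules, the standard way to combine estimates across imputed datasets.

```python
import random
import statistics


def single_mean_impute(values):
    """Single imputation: every missing entry (None) gets the observed mean."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def stochastic_impute(values, rng):
    """One plausible completion: draw each missing entry from a normal
    distribution fitted to the observed values (simplifying assumption)."""
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sigma = statistics.stdev(observed)
    return [rng.gauss(mu, sigma) if v is None else v for v in values]


def mean_and_variance_of_mean(values):
    """Sample mean and the estimated variance of that mean."""
    return statistics.fmean(values), statistics.variance(values) / len(values)


def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate plus total variance
    (within-imputation + (1 + 1/m) * between-imputation)."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    return pooled, within + (1 + 1 / m) * between


# Made-up assessment scores with three missing entries.
scores = [12.0, None, 15.0, 9.0, None, 20.0, 14.0, None, 11.0, 17.0]

rng = random.Random(0)
m = 10  # number of imputed datasets
imputed_sets = [stochastic_impute(scores, rng) for _ in range(m)]
per_set = [mean_and_variance_of_mean(s) for s in imputed_sets]
pooled_mean, total_var = pool_rubin([e for e, _ in per_set],
                                    [v for _, v in per_set])

single_mean, single_var = mean_and_variance_of_mean(single_mean_impute(scores))

print("single imputation:  ", single_mean, single_var)
print("multiple imputation:", pooled_mean, total_var)
```

Mean imputation treats the filled-in values as if they had been observed, so its variance comes out too small. The pooled variance from the multiple completions is larger because it also counts the disagreement between imputed datasets.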

In ML, missing data is often handled by simply removing the affected entries or by filling them with a single imputation procedure¹³. However, the integration of MImp techniques with ML methods is possible, despite being rarely addressed, and may lead to superior results, enhancing prediction performance¹⁴. Pioneering contributions specifically addressing the use of MImp techniques in ML already exist, exploring alternative procedures and their feasibility¹⁵,¹⁶. Rios et al.¹⁵ evaluated the impact of missing values on the accuracy estimates of ML models, employing seven distinct methods for missing data management, such as MImp, cluster-based imputation, and regression-based imputation. In that work, MImp emerged as a promising compromise between feasibility and accuracy in predicting patient-specific risk of adverse cardiac events.
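
One simple way to combine multiple imputation with a classifier is to fit one model per imputed dataset and take a majority vote at prediction time. The sketch below assumes exactly that scheme; this excerpt does not say how the paper combines its models, and a one-feature threshold rule stands in for its random forest. All names and numbers are hypothetical.

```python
from collections import Counter
import statistics


def fit_threshold_model(rows):
    """Fit a one-feature classifier: the cut-off is the midpoint between
    the two class means. rows is a list of (feature_value, label) pairs."""
    mean0 = statistics.fmean([x for x, y in rows if y == 0])
    mean1 = statistics.fmean([x for x, y in rows if y == 1])
    threshold = (mean0 + mean1) / 2
    positive_above = mean1 > mean0

    def predict(x):
        # Predict 1 when x falls on the positive-class side of the cut-off.
        return int((x > threshold) == positive_above)

    return predict


def majority_vote(models, x):
    """Aggregate the per-dataset models by taking the most common prediction."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three imputed versions of the same made-up training set. The first two
# rows of each class are observed; the third row of each class stands for
# an entry that was missing and received a different imputed value each time.
imputed_training_sets = [
    [(20.0, 0), (35.0, 0), (30.0, 0), (60.0, 1), (75.0, 1), (70.0, 1)],
    [(20.0, 0), (35.0, 0), (25.0, 0), (60.0, 1), (75.0, 1), (80.0, 1)],
    [(20.0, 0), (35.0, 0), (40.0, 0), (60.0, 1), (75.0, 1), (65.0, 1)],
]

# One model per imputed dataset; an odd count avoids tied votes.
models = [fit_threshold_model(rows) for rows in imputed_training_sets]

print(majority_vote(models, 55.0))  # 1
print(majority_vote(models, 40.0))  # 0
```

Because each model sees a different plausible completion of the data, the vote reflects the uncertainty in the imputed values rather than depending on a single guess.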

Despite the increasing prevalence of ML methods applied to stroke and ambulation recovery studies¹⁷, to the best of our knowledge no attention has been given to integrating advanced missing data management techniques with ML. Therefore, it becomes urgent to explore and evaluate methods that ensure the robustness and reliability of missing data handling without compromising the overall effectiveness of the analytical process.

This study focuses on the development of predictive models for the prognosis of stroke rehabilitation outcomes, based on the datasets of two multisite observational studies that prospectively and systematically enrolled all adults admitted to intensive inpatient rehabilitation within 30 days after stroke¹⁸,¹⁹. The recovery of independent ambulation is a key stroke rehabilitation outcome, directly related to community mobility and participation²⁰ and to improved quality of life in the chronic stage of stroke, as well as a determinant of caregiver’s burden²¹. Furthermore, independent walking is a well-known top priority of stroke patients and their families, with a relevant impact on the patients’ mobility and social destination after discharge. For these reasons, the recovery of independent ambulation can be considered one of the most relevant patient-centred outcomes, as also reported in the International Standard Set of Patient-Centered Outcome Measures After Stroke²². Thus, we focused on the recovery of independent ambulation at discharge from rehabilitation in the subset of stroke survivors who ambulated independently before stroke but lost the ability after stroke. After an accurate phase of data pre-processing, this study integrated MImp techniques with a cross-validated ML-based predictive model. Then, influential predictors of ambulation outcomes were identified using explainable Artificial Intelligence (AI) techniques.