Predictions are invariably useless unless they direct you to EXACT PROTOCOLS that fix the problem! This did nothing towards that!
Development of a Gait Independence Prediction Model in Patients With Stroke in a Convalescent Rehabilitation Ward: A Comparison of Decision Tree and Random Forest Models
Abstract
Aim: In this study, we aimed to examine the clinical utility of a classification and regression tree (CART) model for predicting independent ambulation at discharge based on physical function at admission to a convalescent rehabilitation ward, by comparing its performance with that of a random forest (RF) model. Seventy-three patients with stroke admitted to a convalescent rehabilitation ward were included.
Methods: Independent ambulation at discharge was defined using the Functional Independence Measure (FIM) locomotion item (walk/wheelchair): patients with a score ≥6 and ambulation as the primary mode of mobility were classified as independent, whereas those with a score <6 or wheelchair use as the primary mode of mobility were classified as nonindependent. The dataset was randomly divided into training (70%) and validation (30%) sets, and CART and RF models were developed using the training data and evaluated using the validation data.
Results: In the CART model, patients with a Trunk Impairment Scale (TIS) score <9 were classified as gait-independent when the FIM cognitive score was ≥30.5. Among patients with a TIS score ≥9, those aged <76.5 years were classified as independent, whereas those aged ≥76.5 years were classified as independent when the FIM cognitive score was ≥22.5. The area under the receiver operating characteristic curve was 0.832 and 0.856 for the CART and RF models, respectively, with no significant difference between the two models according to the DeLong test (p = 0.58).
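The splits reported above can be written out as a short rule function. This is only a transcription of the thresholds stated in the Results, not the authors' fitted MATLAB model; the function name and argument names are hypothetical.

```python
def predict_gait_independence(tis: float, age: float, fim_cognitive: float) -> bool:
    """Apply the CART splits as reported in the text.

    tis           -- Trunk Impairment Scale score at admission
    age           -- age in years at admission
    fim_cognitive -- FIM cognitive score at admission
    Returns True when the reported rules classify the patient as gait-independent.
    """
    if tis < 9:
        # Low trunk function: independent only with a high cognitive score.
        return fim_cognitive >= 30.5
    if age < 76.5:
        # Better trunk function and younger age: classified as independent.
        return True
    # Better trunk function, older age: depends on the cognitive score.
    return fim_cognitive >= 22.5
```

Because every threshold sits at a half point, integer scores and ages never fall exactly on a boundary.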
Conclusion: These findings suggest that the CART model demonstrates discriminative ability comparable to that of the RF model and can hierarchically visualize the likelihood of gait independence based on admission assessments, thereby supporting clinical decision-making and intervention planning in convalescent rehabilitation wards.
Introduction
Gait impairment is one of the most common functional deficits after stroke, affecting approximately 80% of patients with stroke [1]. Achieving independent walking is particularly important for stroke survivors in terms of independence, safety, and efficiency, as it contributes to quality of life and long-term health outcomes [2]. However, approximately one-quarter of patients with stroke fail to regain independent ambulation by three months after onset [3]. Therefore, in postacute inpatient rehabilitation wards (referred to as convalescent rehabilitation wards in Japan), accurately predicting the likelihood of gait independence at discharge from an early stage after admission is clinically meaningful, as it can inform goal setting, intervention prioritization, and discharge planning.
To date, logistic regression analysis has been widely used to predict independence at discharge [4]. Although logistic regression is a standard and robust analytical method, gait independence after stroke is influenced by multiple factors, including age, motor impairment, trunk function, and cognitive function [5,6], and interactions and threshold effects among these factors may exist. Given such complex relationships, conventional regression models may be insufficient to fully capture the mechanisms underlying functional recovery, highlighting the need to explore alternative analytical approaches [7].
In recent years, machine learning techniques have gained attention as methods to address these limitations [8]. Among them, decision tree analysis is characterized by its ability to present results in a tree structure, allowing for intuitive visual interpretation. Because selected explanatory variables are hierarchically arranged, relationships among factors can be clearly identified, facilitating clinical interpretation and practical application [9]. However, single decision tree models are known to be unstable, as their branching structures are highly dependent on the training data, which may lead to reduced predictive accuracy [10]. To overcome this instability, random forest (RF), an ensemble method that aggregates multiple decision trees, has been proposed [11]. Therefore, comparing decision tree analysis, which offers high interpretability, with RF, which is expected to provide superior predictive performance among machine learning methods, is meaningful for evaluating whether decision tree models achieve acceptable performance for practical clinical use.
The purpose of this study was to develop a decision tree model to predict gait independence at discharge based on physical function at admission to a convalescent rehabilitation ward and to examine the clinical utility of decision tree analysis by comparing its performance with that of an RF model, which is expected to be more accurate.
Materials & Methods
Participants
Of the 249 patients with stroke admitted to the convalescent rehabilitation ward of our hospital in Japan between August 2023 and March 2025, the 73 patients (age, 71.0 ± 17.5 years) who did not meet any of the exclusion criteria were included. The exclusion criteria were as follows: 1) not independently ambulatory before admission, 2) already independently ambulatory at admission, 3) transfer to another hospital, 4) in-hospital death, and 5) missing data.
Data collection and measures
Baseline characteristics and physical function measures were retrospectively reviewed from electronic medical records. Baseline characteristics at admission to the convalescent rehabilitation ward included age, sex, stroke type (cerebral infarction or intracerebral hemorrhage), lesion side (right or left), and time since stroke onset. Lower extremity motor function was assessed using the lower extremity motor score of the Fugl-Meyer Assessment (FMA) [12], trunk function using the Trunk Impairment Scale (TIS) [13], balance function using the Berg Balance Scale (BBS) [14], and cognitive function using the cognitive items of the Functional Independence Measure (FIM) [15]. Mobility at discharge was evaluated using the FIM locomotion item (walk/wheelchair) [15]. Admission assessments were conducted within two days of admission, and discharge assessments were conducted within two days before discharge. All clinical evaluations were performed by physical therapists working in the convalescent rehabilitation ward.
Statistical analysis
Gait independence at discharge was defined using the locomotion item of the FIM. Patients were classified as gait-independent if they were able to walk independently, with or without walking aids, and had an FIM locomotion score of 6 or higher. Patients who required supervision or physical assistance for walking were classified as nonindependent, even if they were able to ambulate using walking aids. Patients whose primary means of mobility was wheelchair use were also classified as nonindependent.
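As a minimal sketch, the outcome definition above reduces to a two-condition label. The function name and the "walk"/"wheelchair" encoding of the primary mode of mobility are assumptions made for illustration; the 1 to 7 range is the standard scoring range of a FIM item.

```python
def is_gait_independent(fim_locomotion: int, primary_mode: str) -> bool:
    """Label the discharge outcome from the FIM locomotion item.

    fim_locomotion -- FIM locomotion item score (1 to 7)
    primary_mode   -- "walk" or "wheelchair" (hypothetical encoding)
    """
    if not 1 <= fim_locomotion <= 7:
        raise ValueError("FIM item scores range from 1 to 7")
    if primary_mode not in ("walk", "wheelchair"):
        raise ValueError("primary_mode must be 'walk' or 'wheelchair'")
    # Independent only when walking is the primary mode AND the score is 6 or 7.
    # A score of 5 or lower (supervision or assistance) is nonindependent.
    return primary_mode == "walk" and fim_locomotion >= 6
```

A wheelchair user scoring 6 or 7 is still labeled nonindependent, which matches the last sentence of the definition.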
The dataset was randomly divided into training (70%) and validation (30%) sets using the cvpartition function in MATLAB, with the outcome variable (independent vs. nonindependent) as the grouping variable so that its distribution was balanced across the two sets. The random seed was fixed at 2025 to ensure reproducibility.
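The idea of an outcome-balanced 70/30 holdout can be sketched in plain Python as follows. This is not MATLAB's cvpartition and will not reproduce the authors' partition even with the same seed; the function name is hypothetical and the labels are toy values, not study data.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, test_fraction=0.3, seed=2025):
    """Split indices into train/validation sets, sampling each class separately."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Toy labels (1 = independent, 0 = nonindependent); counts are made up.
labels = [1] * 20 + [0] * 10
train_idx, test_idx = stratified_holdout(labels)
```

Sampling within each class keeps the proportion of independent patients the same in both sets, which matters with a sample this small.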
A decision tree model was developed using classification and regression tree (CART) analysis with the training dataset, with the Gini index as the splitting criterion. To prevent overfitting, the optimal tree complexity was determined by pruning [9]: for each candidate pruning level, 10-fold cross-validation was performed once within the training dataset, and the pruning level with the minimum cross-validation loss was selected. The tree-growth hyperparameters were MinLeafSize = 1, MinParentSize = 10, and MaxNumSplits = n − 1, where n denotes the training sample size.
In addition, an RF model was constructed using 500 trees. The number of variables considered at each split (mtry) was set to the square root of the total number of explanatory variables (√p) [16,17]. Classification was performed based on predicted probabilities, with a threshold of 0.5.
Based on previous studies indicating that gait independence after stroke is influenced by factors such as age, motor impairment, trunk function, and cognitive function [5,6], we selected explanatory variables reflecting demographic characteristics, stroke-related clinical factors, and physical and cognitive function. The explanatory variables included age at admission to the convalescent rehabilitation ward, sex, stroke type, lesion side, days since stroke onset, the lower extremity motor score of the FMA, TIS, BBS, and the cognitive items of the FIM.
Data analysis was performed using MATLAB R2023b (MathWorks, Natick, MA). Model performance was evaluated in the validation dataset by calculating the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and their 95% confidence intervals for both models. AUCs for CART and RF were compared using DeLong’s test, which was performed in R (version 4.1.2; R Core Team, Vienna, Austria, 2021). The statistical significance level was set at 5%.
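Two quantities named in this section can be illustrated in a few lines of Python: the Gini index used to score a candidate split, and the AUC, sensitivity, and specificity used for evaluation. This is a sketch under stated assumptions, not the authors' MATLAB or R code: all function names are hypothetical, the data are toy values, and treating a predicted probability of exactly 0.5 as positive is an assumption the text does not settle.

```python
def gini_impurity(labels):
    """Gini index of a node with binary labels (1 = independent)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)  # equals 1 - p**2 - (1 - p)**2

def split_impurity(left, right):
    """Size-weighted Gini index of the two child nodes of a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n

def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity; both classes must be present."""
    pred = [int(s >= threshold) for s in scores]  # assumption: >= counts as positive
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, tn / n_neg

# Toy validation set: made-up outcomes and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
toy_auc = auc(y_true, scores)
toy_sens, toy_spec = sensitivity_specificity(y_true, scores)
```

CART chooses, at each node, the split with the lowest weighted impurity; a split that separates the classes perfectly scores 0. With the nine explanatory variables listed above, √p gives mtry = 3.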
Ethical considerations
This retrospective observational study was approved by the Ethics Committee of Okanami General Hospital (approval number: 0001). The purpose and methods of the study were disclosed through institutional postings, and participants were provided with the opportunity to opt out of the study.
Results
Participants
During the study period, 249 patients with stroke were admitted to the convalescent rehabilitation ward. Of these, 176 patients were excluded according to the exclusion criteria, and 73 patients were ultimately included in the analysis (Figure 1). The reasons for exclusion were as follows: five patients were not independently ambulatory before admission, 12 were already independently ambulatory at admission, three were transferred to another hospital, two died in hospital, and 154 had missing data.