Changing stroke rehab and research worldwide now. Time is Brain! Trillions and trillions of neurons DIE each day because there are NO effective hyperacute therapies besides tPA (only 12% effective). I have 523 posts on hyperacute therapy, enough for researchers to spend decades proving them out. These are my personal ideas and blog on stroke rehabilitation and stroke research. Do not attempt any of these without checking with your medical provider. Unless you join me in agitating, when you need these therapies they won't be there.

What this blog is for:

My blog is not to help survivors recover; it is to have the 10 million yearly stroke survivors light fires underneath their doctors, stroke hospitals and stroke researchers to get stroke solved. 100% recovery. The stroke medical world is completely failing at that goal; they don't even have it as a goal. Shortly after getting out of the hospital and getting NO information on the process or protocols of stroke rehabilitation and recovery, I started searching on the internet and found that no other survivor had received useful information either. This is an attempt to cover all the stroke rehabilitation information that should be readily available to survivors so they can talk knowledgeably with their medical staff. It lays out what needs to be done to get stroke survivors closer to 100% recovery. It's quite disgusting that this information is not available from every stroke association and doctors' group.

Friday, March 20, 2026

Development of a Gait Independence Prediction Model in Patients With Stroke in a Convalescent Rehabilitation Ward: A Comparison of Decision Tree and Random Forest Models

 

Predictions are invariably useless unless they direct you to EXACT PROTOCOLS that fix the problem! This did nothing towards that!

Shogo Nakao • Tsuyoshi Motokawa • Takashi Nakamori

Published: March 19, 2026

DOI: 10.7759/cureus.105532 

Open Access
Peer-Reviewed
Cite this article as: Nakao S, Motokawa T, Nakamori T (March 19, 2026) Development of a Gait Independence Prediction Model in Patients With Stroke in a Convalescent Rehabilitation Ward: A Comparison of Decision Tree and Random Forest Models. Cureus 18(3): e105532. doi:10.7759/cureus.105532

Abstract

Aim: In this study, we aimed to examine the clinical utility of a classification and regression tree (CART) model for predicting independent ambulation at discharge based on physical function at admission to a convalescent rehabilitation ward, by comparing its performance with that of a random forest (RF) model. Seventy-three patients with stroke admitted to a convalescent rehabilitation ward were included.

Methods: Independent ambulation at discharge was defined using the Functional Independence Measure (FIM) locomotion item (walk/wheelchair): patients with a score ≥6 and ambulation as the primary mode of mobility were classified as independent, whereas those with a score <6 or wheelchair use as the primary mode of mobility were classified as nonindependent. The dataset was randomly divided into training (70%) and validation (30%) sets, and CART and RF models were developed using the training data and evaluated using the validation data.

Results: In the CART model, patients with a Trunk Impairment Scale (TIS) score <9 were classified as gait-independent when the FIM cognitive score was ≥30.5. Among patients with a TIS score ≥9, those aged <76.5 years were classified as independent, whereas those aged ≥76.5 years were classified as independent when the FIM cognitive score was ≥22.5. The area under the receiver operating characteristic curve was 0.832 and 0.856 for the CART and RF models, respectively, with no significant difference between the two models according to the DeLong test (p = 0.58).

Conclusion: These findings suggest that the CART model demonstrates discriminative ability comparable to that of the RF model and can hierarchically visualize the likelihood of gait independence based on admission assessments, thereby supporting clinical decision-making and intervention planning in convalescent rehabilitation wards.
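The CART splits reported in the Results above can be written out as a few nested conditions. The following is a minimal Python sketch, not the authors' code (their analysis was done in MATLAB). The function name and argument names are hypothetical, and it assumes that any branch the abstract does not describe as "independent" is classified as nonindependent.

```python
def predict_gait_independent(tis: float, age: float, fim_cognitive: float) -> bool:
    """Apply the CART thresholds quoted in the abstract.

    Assumption: branches not described as 'independent' in the abstract
    are treated as nonindependent.
    """
    if tis < 9:
        # Low trunk function: independent only with a high FIM cognitive score.
        return fim_cognitive >= 30.5
    if age < 76.5:
        # Trunk score of 9 or more and younger than 76.5 years.
        return True
    # Trunk score of 9 or more and aged 76.5 years or older.
    return fim_cognitive >= 22.5


# Hypothetical patient: TIS 12, age 80, FIM cognitive 25.
print(predict_gait_independent(tis=12, age=80, fim_cognitive=25))
```

Written this way, the tree amounts to three admission measurements and three thresholds, which is the interpretability the authors point to.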

Introduction

Gait impairment is one of the most common functional deficits after stroke, affecting approximately 80% of patients with stroke [1]. Achieving independent walking is particularly important for stroke survivors in terms of independence, safety, and efficiency, as it contributes to quality of life and long-term health outcomes [2]. However, approximately one-quarter of patients with stroke fail to regain independent ambulation by three months after onset [3]. Therefore, in postacute inpatient rehabilitation wards (referred to as convalescent rehabilitation wards in Japan), accurately predicting the likelihood of gait independence at discharge from an early stage after admission is clinically meaningful, as it can inform goal setting, intervention prioritization, and discharge planning.

To date, logistic regression analysis has been widely used to predict independence at discharge [4]. Although logistic regression is a standard and robust analytical method, gait independence after stroke is influenced by multiple factors, including age, motor impairment, trunk function, and cognitive function [5,6], and interactions and threshold effects among these factors may exist. Given such complex relationships, conventional regression models may be insufficient to fully capture the mechanisms underlying functional recovery, highlighting the need to explore alternative analytical approaches [7].

In recent years, machine learning techniques have gained attention as methods to address these limitations [8]. Among them, decision tree analysis is characterized by its ability to present results in a tree structure, allowing for intuitive visual interpretation. Because selected explanatory variables are hierarchically arranged, relationships among factors can be clearly identified, facilitating clinical interpretation and practical application [9]. However, single decision tree models are known to be unstable, as their branching structures are highly dependent on the training data, which may lead to reduced predictive accuracy [10]. To overcome this instability, random forest (RF), an ensemble method that aggregates multiple decision trees, has been proposed [11]. Therefore, comparing decision tree analysis, which offers high interpretability, with RF, which is expected to provide superior predictive performance among machine learning methods, is meaningful for evaluating whether decision tree models achieve acceptable performance for practical clinical use.

The purpose of this study was to develop a decision tree model to predict gait independence at discharge based on physical function at admission to a convalescent rehabilitation ward and to examine the clinical utility of decision tree analysis by comparing its performance with that of a more accurate RF model.

Materials & Methods

Participants

Participants were 73 patients with stroke (age, 71.0 ± 17.5 years) who did not meet the exclusion criteria, drawn from the 249 patients admitted to the convalescent rehabilitation ward of our hospital in Japan between August 2023 and March 2025. The exclusion criteria were as follows: 1) not independently ambulatory before admission, 2) already independently ambulatory at admission, 3) transfer to another hospital, 4) in-hospital death, and 5) missing data.

Data collection and measures

Baseline characteristics and physical function measures were retrospectively reviewed from electronic medical records. Baseline characteristics at admission to the convalescent rehabilitation ward included age, sex, stroke type (cerebral infarction or intracerebral hemorrhage), lesion side (right or left), and time since stroke onset. Physical function measures included the lower extremity motor score of the Fugl-Meyer Assessment (FMA) [12]. Trunk function was assessed using the Trunk Impairment Scale (TIS) [13], balance function was assessed using the Berg Balance Scale (BBS) [14], and cognitive function was assessed using the cognitive items of the Functional Independence Measure (FIM) [15]. Mobility at discharge was evaluated using the FIM locomotion item (walk/wheelchair) [15]. Admission assessments were conducted within two days of admission, and discharge assessments were conducted within two days before discharge. All clinical evaluations were performed by physical therapists working in the convalescent rehabilitation ward.

Statistical analysis

Gait independence at discharge was defined using the locomotion item of the FIM. Patients were classified as gait-independent if they were able to walk independently, with or without walking aids, and had an FIM locomotion score of 6 or higher. Patients who required supervision or physical assistance for walking were classified as nonindependent, even if they were able to ambulate using walking aids. Patients whose primary means of mobility was wheelchair use were also classified as nonindependent.
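The outcome definition above reduces to a two-part rule. Here is a minimal Python sketch of it, assuming a hypothetical encoding in which the primary mode of mobility is recorded as the string "walk" or "wheelchair". The function name is illustrative and is not taken from the paper.

```python
def is_gait_independent(fim_locomotion: int, primary_mode: str) -> bool:
    """Label the discharge outcome from the FIM locomotion item.

    fim_locomotion: FIM item score, 1 (total assistance) to 7 (complete independence).
    primary_mode: "walk" or "wheelchair" (hypothetical encoding for this sketch).
    """
    if not 1 <= fim_locomotion <= 7:
        raise ValueError("FIM item scores range from 1 to 7")
    if primary_mode not in ("walk", "wheelchair"):
        raise ValueError("primary_mode must be 'walk' or 'wheelchair'")
    # Independent only when walking is the primary mode AND the score is 6 or 7.
    return primary_mode == "walk" and fim_locomotion >= 6


print(is_gait_independent(6, "walk"))        # walks with an aid, no helper needed
print(is_gait_independent(5, "walk"))        # needs supervision
print(is_gait_independent(7, "wheelchair"))  # independent, but in a wheelchair
```

A patient who scores 7 in a wheelchair is still labeled nonindependent, so the label tracks walking specifically rather than mobility in general.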

To ensure a balanced distribution of the outcome variable (independent vs. nonindependent), the dataset was randomly divided into training (70%) and validation (30%) sets using the cvpartition function in MATLAB based on the outcome variable. The random seed was fixed at 2025 to ensure reproducibility.
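A stratified 70/30 holdout split of this kind can be sketched in plain Python as follows. This is an illustration of the idea only. It does not reproduce MATLAB's cvpartition or its random number stream, so seed 2025 here will not give the authors' partition. The function name is hypothetical, and the class counts in the usage example are made up because the excerpt does not report them.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.3, seed=2025):
    """Split indices into train/test so each class keeps roughly the same share."""
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])   # held-out share of this class
        train.extend(members[n_test:])  # remainder goes to training
    return sorted(train), sorted(test)


# Made-up outcome labels for 73 patients (1 = independent, 0 = nonindependent).
example_labels = [1] * 40 + [0] * 33
train_idx, test_idx = stratified_holdout(example_labels)
print(len(train_idx), len(test_idx))
```

Splitting within each outcome class keeps the proportion of independent walkers similar in both sets, which matters with a sample this small.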

A decision tree model was developed using classification and regression tree (CART) analysis with the training dataset, and the Gini index was used as the splitting criterion. To prevent overfitting, 10-fold cross-validation was applied to the training dataset, and the optimal tree complexity was determined by pruning [9]. For each candidate pruning level, 10-fold cross-validation was performed once within the training dataset, and the pruning level with the minimum cross-validation loss was selected. The tree-growth hyperparameters were MinLeafSize = 1, MinParentSize = 10, and MaxNumSplits = n − 1, where n denotes the training sample size. In addition, an RF model was constructed using 500 trees. The number of variables considered at each split (mtry) was set to the square root of the total number of explanatory variables (√p) [16,17]. Classification was performed based on predicted probabilities, with a threshold of 0.5.

Based on previous studies indicating that gait independence after stroke is influenced by factors such as age, motor impairment, trunk function, and cognitive function [5,6], we selected explanatory variables reflecting demographic characteristics, stroke-related clinical factors, and physical and cognitive function. The explanatory variables included age at admission to the convalescent rehabilitation ward, sex, stroke type, lesion side, days since stroke onset, the lower extremity motor score of the FMA, TIS, BBS, and the cognitive items of the FIM.

Data analysis was performed using MATLAB R2023b (MathWorks, Natick, MA). Model performance was evaluated in the validation dataset by calculating the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and their 95% confidence intervals for both models. AUCs for CART and RF were compared using DeLong’s test, which was performed in R (version 4.1.2; R Core Team, Vienna, Austria, 2021). The statistical significance level was set at 5%.
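Several of the quantities named above are simple enough to compute by hand. The Python sketch below illustrates the Gini impurity used to score a split, the AUC as the share of independent/nonindependent pairs ranked correctly, and sensitivity and specificity at a 0.5 threshold. It is not the authors' MATLAB or R code. The function names are hypothetical, the example scores are made up, and counting a probability of exactly 0.5 as "independent" is an assumption. Confidence intervals and DeLong's test are left out. As a side note, the nine explanatory variables listed give √p = 3 variables per split in the RF.

```python
def gini_impurity(labels):
    """Gini impurity of a node holding binary labels (1 = independent)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity; scores >= threshold are predicted independent (assumption)."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up validation outcomes and predicted probabilities.
y = [1, 1, 0, 0]
prob = [0.9, 0.4, 0.6, 0.1]
print(gini_impurity(y), auc(y, prob), sensitivity_specificity(y, prob))
```

The pairwise AUC loop is quadratic in the number of patients. That is fine for a validation set of a couple of dozen cases but would need a rank-based version for large datasets.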

Ethical considerations

This retrospective observational study was approved by the Ethics Committee of Okanami General Hospital (approval number: 0001). The purpose and methods of the study were disclosed through institutional postings, and participants were provided with the opportunity to opt out of the study.

Results

Participants

During the study period, 249 patients with stroke were admitted to the convalescent rehabilitation ward. Of these, 176 patients were excluded according to the exclusion criteria, and 73 patients were ultimately included in the analysis (Figure 1). The reasons for exclusion were as follows: five patients were not independently ambulatory before admission, 12 were already independently ambulatory at admission, three were transferred to another hospital, two died in hospital, and 154 had missing data.
