Sunday, September 25, 2016

Asking New Questions with Old Data: The Centralized Open-Access Rehabilitation Database for Stroke

With this amount of knowledge it should be damned easy to come up with a strategy to solve all the problems in stroke.
  • School of Kinesiology, Auburn University, Auburn, AL, USA
  • School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ, USA
  • Department of Health, Physical Education and Recreation, Utah State University, Logan, UT, USA
  • Department of Physical Therapy, University of British Columbia, Vancouver, BC, Canada
  • Program in Physical Therapy, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
  • Program in Occupational Therapy, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
  • Department of Neurology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
Background: This paper introduces a tool for streamlining data integration in rehabilitation science, the Centralized Open-Access Rehabilitation database for Stroke (SCOAR), which allows researchers to quickly visualize relationships among variables, efficiently share data, generate hypotheses, and enhance clinical trial design.
Methods: Bibliographic databases were searched according to the inclusion criteria, yielding 2,892 titles. Further screening reduced these to 514 manuscripts for full-text review, of which 215 randomized controlled trials (RCTs) were included in the database (489 independent groups representing 12,847 patients). Demographic, methodological, and statistical data were extracted by independent coders and entered into SCOAR.
Results: Trial data came from 114 locations in 27 different countries and represented patients with a wide range of ages, 62 years [41; 85] (shown as median [range]), and at various stages of recovery following their stroke, 141 days [1; 3372]. There was considerable variation in the dose of therapy that patients received, 20 h [0; 221], over interventions of different durations, 28 days [10; 365]. There was also a lack of common data elements (CDEs) across trials, and this lack was most pronounced for baseline assessments of patient impairment and severity of stroke.
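The "median [min; max]" summaries in the Results can be reproduced for any numeric variable with a few lines of code. The following is a minimal Python sketch, not the authors' code; the function names and the example values are hypothetical and are not taken from SCOAR.

```python
from statistics import median


def median_range(values):
    """Return (median, minimum, maximum) for a non-empty list of numbers."""
    if not values:
        raise ValueError("need at least one value")
    return median(values), min(values), max(values)


def format_median_range(values, unit=""):
    """Format a summary as 'median unit [min; max]', as in the abstract."""
    med, lo, hi = median_range(values)
    label = f"{med:g} {unit}".strip()  # drop the trailing space if no unit
    return f"{label} [{lo:g}; {hi:g}]"


# Hypothetical group-level mean ages in years (illustrative only).
example_ages = [41, 55, 62, 70, 85]
print(format_median_range(example_ages, "years"))
```

Reporting the range alongside the median, rather than a mean and standard deviation, makes the extremes of the literature visible, which matters when the variables are as skewed as days post-stroke.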
Conclusion: Data integration across hundreds of RCTs allows clinicians and researchers to quickly visualize data from the history of the field and lays the foundation for making SCOAR a living database to which researchers can upload new data as trial results are published. SCOAR is a useful tool for clinicians and researchers that will facilitate data visualization, data sharing, the finding of relevant past studies, and the design of clinical trials by enabling more accurate and comprehensive power analyses. Furthermore, these data speak to the need for CDEs specific to stroke rehabilitation in randomized controlled trials.
PROSPERO 2014:CRD42014009010


The information architecture in rehabilitation science is poor (1). For example, randomized controlled trials (RCTs) are the basic “unit” of information that guides clinical practice. Yet when clinicians and scientists want to ask a very basic question of these data, they find that the data are published: (1) across a wide spectrum of journals and formats that often have limited access (e.g., payment required for access); (2) embedded potentially in text, tables, figures, or even supplemental materials; and (3) with very few common data elements (CDEs) reported across studies (2, 3). Thus, despite the tremendous time and financial burdens associated with even a single RCT, the resultant data lack a consistent structure. This lack of structure is an unnecessary barrier to integration in future scientific and clinical practice. Efforts to streamline data integration should increase the transparency and visibility of comprehensive bodies of evidence, rather than a single study, to better inform clinically relevant questions such as, “How do therapy outcomes change with increased time in therapy?” or “How variable are outcomes, historically, for specific parameters of therapy?”
We now introduce one such tool for streamlining data integration: the Centralized Open-Access Rehabilitation database for Stroke (SCOAR). In short, SCOAR is a central repository for summary statistics from RCTs. SCOAR currently contains data from a systematic review and extraction of papers from 1981 to early 2014 (described in detail below), but the goal of SCOAR is much bigger: to create a “living” database where new data can be added as clinical trials are completed. Imposing such an architecture (4) on clinical trial data would allow basic and clinical scientists to (1) quickly and easily visualize relationships among variables, (2) efficiently share data, (3) generate hypotheses based on noticeable patterns or even “gaps” in the current data, (4) search the current literature from the data up (rather than key-terms down), and (5) improve clinical trial design through more accurate and comprehensive power analyses.
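To make the idea of a repository of summary statistics concrete, here is a minimal Python sketch of what one group-level record and a "data up" search might look like. This is an illustration under stated assumptions: the field names, the `find_groups` helper, and the example records are all hypothetical and do not describe the actual SCOAR schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GroupRecord:
    """One independent group from one RCT (all field names are hypothetical)."""
    trial_id: str
    group_label: str
    n_patients: int
    therapy_hours: Optional[float]  # None when the trial did not report dose
    outcome_name: str
    effect_size: Optional[float]    # None when it could not be computed


def find_groups(records: List[GroupRecord], outcome_name: str,
                min_hours: float = 0.0,
                max_hours: float = float("inf")) -> List[GroupRecord]:
    """Search from the data up: filter by outcome measure and therapy dose."""
    return [
        r for r in records
        if r.outcome_name == outcome_name
        and r.therapy_hours is not None
        and min_hours <= r.therapy_hours <= max_hours
    ]


# Made-up records for illustration; these are not data from SCOAR.
records = [
    GroupRecord("trial-A", "experimental", 30, 40.0, "gait speed", 0.45),
    GroupRecord("trial-A", "control", 28, 10.0, "gait speed", 0.20),
    GroupRecord("trial-B", "experimental", 50, None, "gait speed", 0.30),
    GroupRecord("trial-C", "experimental", 22, 60.0, "arm function", 0.55),
]

for record in find_groups(records, "gait speed", min_hours=20):
    print(record.trial_id, record.group_label, record.effect_size)
```

Keeping one record per independent group, rather than one per paper, is what lets a query start from a variable such as dose and work back to the relevant trials. Missing values are stored explicitly as `None` because, as the abstract notes, few data elements are reported in common across trials.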
Generally speaking, the goal of SCOAR is to improve the design of future clinical trials by giving researchers fast and easy access to the historical range of effect sizes, based on thousands of stroke patients who received therapies of different types, different doses, at different times, and were measured on different outcomes. From our perspective, the effort associated with the design, implementation, and dissemination of randomized clinical trials deserves an information architecture that supports and increases their visibility. In the current paper, we (1) explain the systematic search and data extraction that led to the creation of SCOAR; (2) present summary statistics for the major variables in SCOAR, including the geographical reach, to understand how SCOAR data represent research in stroke rehabilitation; and (3) argue, based on the lack of CDEs we find across many variables, for a consistent set of CDEs in rehabilitation trials (CDEs to describe participants, methodology, and outcomes). SCOAR lays the foundation for an information architecture that captures some of the complex and multivariate nature of neurorehabilitation. Most importantly, this information architecture is scalable, making it easy to add new data as new trials are published.
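As an illustration of how a historical range of effect sizes could feed a power analysis, the sketch below estimates the sample size per group for a two-group comparison of means. It is not the authors' method: it assumes a standardized effect size (Cohen's d), a two-sided test, equal group sizes, and the usual normal approximation, n = 2((z_{1-alpha/2} + z_{power}) / d)^2. The function name and the example effect sizes are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate patients per group needed to detect a standardized
    mean difference with a two-sided test (normal approximation)."""
    if effect_size == 0:
        raise ValueError("effect size must be non-zero")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for target power
    return ceil(2 * ((z_alpha + z_power) / abs(effect_size)) ** 2)


# Hypothetical low, middle, and high effect sizes (not SCOAR values).
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} patients per group")
```

Running the calculation across a plausible range, rather than a single optimistic estimate, shows how sharply the required sample size grows as the assumed effect shrinks. An exact calculation based on the t distribution gives slightly larger numbers for small samples.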

