Integrating clinical and biomedical data hosted in multiple distributed repositories faces two significant challenges: i) correctly linking information pertaining to the same patient across repositories, for example, linking lab results with bedside observations; and ii) making data available for analysis at different locations across a collaboration network. These problems are exacerbated in rare disease research, where data sets and data standards are very limited.
We propose to develop the NCATS Global Repository for Rare Diseases Research (GRDR), based on the BD2K PIC-SURE platform, to address these challenges. The NCATS GRDR will be a scalable, secure, and flexible integration architecture for clinical and biomedical datasets. By extending the successful i2b2/tranSMART platform, it will allow data providers to easily share their data with the wider research community without requiring them to subscribe to proprietary vocabulary standards or to develop complex mapping protocols. Using federated data access and querying methods that retrieve relevant data from different locations before combining them, GRDR will make it possible to execute comparative analysis methods on the integrated datasets. By assigning generic identifiers (after de-identification) to related data across locations, GRDR will ease the difficulties of linking data while conforming to patient data privacy requirements and other security regulations.
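As an illustration only, the following Python sketch shows how generic identifiers and federated querying could work together. It is not the PIC-SURE or i2b2/tranSMART API. The function names, the `patient_token` and `gid` fields, and the use of a keyed hash (HMAC-SHA256) with a shared linkage key to derive the generic identifier are all assumptions made for this example; the proposal does not specify how identifiers are assigned.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret held by a trusted linkage service (assumption for this sketch).
LINKAGE_KEY = b"example-key-not-for-production"


def generic_id(patient_token: str) -> str:
    """Derive a generic identifier from a patient token with HMAC-SHA256."""
    digest = hmac.new(LINKAGE_KEY, patient_token.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def publish(records):
    """Replace each record's patient token with its generic identifier."""
    return [
        {"gid": generic_id(r["patient_token"]),
         **{k: v for k, v in r.items() if k != "patient_token"}}
        for r in records
    ]


def federated_query(sites, predicate):
    """Apply the predicate to each site's records, then combine matches by generic ID."""
    merged = defaultdict(dict)
    for records in sites.values():
        for rec in records:
            if predicate(rec):
                fields = {k: v for k, v in rec.items() if k != "gid"}
                merged[rec["gid"]].update(fields)
    return dict(merged)


# Two hypothetical sites holding different data about overlapping patients.
lab = publish([
    {"patient_token": "A", "hemoglobin": 9.1},
    {"patient_token": "B", "hemoglobin": 13.0},
])
bedside = publish([{"patient_token": "A", "heart_rate": 88}])

result = federated_query({"lab": lab, "bedside": bedside}, lambda rec: True)
```

In this sketch, neither site shares the original patient token, yet the lab result and the bedside observation for patient "A" end up in the same combined record because both sites derive the same generic identifier.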