Background

Dr. Eliot Berson was a member of the Massachusetts Eye and Ear Hospital and served as the William F. Chatlos Professor of Ophthalmology at Harvard Medical School. His research into retinitis pigmentosa (RP) yielded many important findings, including nutrition-based approaches to slowing or stopping the effects of RP. This data set consists of three clinical trials conducted under Dr. Berson that detail treatment regimens for individuals with RP.

Description of Clinical Trials Data

Trial 1 (1984-1991)

601 adults with typical RP were treated with either vitamin A palmitate (15,000 IU per day on average) or vitamin E (400 IU per day) to assess the impact on cone electroretinograms (ERGs), an established predictor of disease progression. Patients treated with vitamin A showed a slowing of retinal decline that was significant at the 99% confidence level (P=0.01). A subset of these patients (n=125) also showed significant preservation of visual field as assessed by Goldmann visual field testing. Longitudinal follow-up over the 4-6 years of the study indicated that a dosage of 18,000 IU of vitamin A per day (15,000 IU by supplement combined with 3,000 IU by diet) was associated with the least decline in cone ERG.

Berson EL, Rosner B, Sandberg MA, Hayes KC, Nicholson BW, Weigel-DiFranco C, Willett W. A randomized trial of vitamin A and vitamin E supplementation for retinitis pigmentosa. Arch Ophthalmol. 111:761-772; 1993.

Trial 2 (1996-2001)

221 adults with typical RP were treated with either 1200 mg of DHA (docosahexaenoic acid) per day in capsules or control capsules. Building on the previous study, all participants also received 15,000 IU of vitamin A palmitate per day. Although no benefit was observed with the combined vitamin A + DHA protocol, a subgroup analysis of the controls revealed that those who consumed 200 mg of DHA per day through oily fish had a 40-50% slower loss of central visual field sensitivity on the Humphrey perimeter over four years than those eating less than one serving of oily fish per week.

Berson EL, Rosner B, Sandberg MA, Weigel-DiFranco C, Moser A, Brockhurst RJ, Hayes KC, Johnson CA, Anderson EJ, Gaudio AR, Willett WC, Schaefer EJ. Clinical trial of docosahexaenoic acid in patients with retinitis pigmentosa receiving vitamin A treatment. Arch Ophthalmol. 122:1297-305; 2004.

Berson EL, Rosner B, Sandberg MA, Weigel-DiFranco C, Moser A, Brockhurst RJ, Hayes KC, Johnson CA, Anderson EJ, Gaudio AR, Willett WC, Schaefer EJ. Further evaluation of docosahexaenoic acid in patients with retinitis pigmentosa receiving vitamin A treatment: subgroup analyses. Arch Ophthalmol. 122:1306-14; 2004.

Trial 3 (2003-2008)

225 adults with typical RP were given either 12 mg of lutein per day or a control tablet, all in addition to 15,000 IU of vitamin A palmitate per day. Lutein treatment slowed the loss of mid-peripheral visual field sensitivity (total point score) but did not preserve central field sensitivity.

Berson EL, Rosner B, Sandberg MA, Weigel-DiFranco C, Brockhurst RJ, Hayes KC, Johnson EJ, Anderson EJ, Johnson CA, Gaudio AR, Willett WC, Schaefer EJ. Clinical trial of lutein in patients with retinitis pigmentosa receiving vitamin A. Arch Ophthalmol. 128:403-11; 2010.

Berson EL, Rosner B, Sandberg MA, Weigel-DiFranco C, Willett WC. Omega-3 intake and visual acuity in patients with retinitis pigmentosa receiving vitamin A. Arch Ophthalmol. 130:707-11; 2012.

The National Health and Nutrition Examination Survey (NHANES) is a population survey implemented by the Centers for Disease Control and Prevention (CDC) to monitor the health of the United States population. Its data are publicly available, but spread across hundreds of files. This Data Descriptor describes a single unified and universally accessible data file that merges 255 separate files and stitches together data from 4 surveys, encompassing 41,474 individuals and 1,191 variables. The variables consist of phenotype and environmental exposure information on each individual, specifically:

  • Demographic information
  • Physical exam results (e.g., height, body mass index)
  • Laboratory results (e.g., cholesterol, glucose, and environmental exposures)
  • Questionnaire items

The Data Descriptor also describes a dictionary that enables analysts to find variables by category and human-readable description.
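
As an illustration of how such a dictionary might be used, the following minimal Python sketch filters a small dictionary excerpt by category and by a keyword in the description. The column names (var, category, description) and the category labels are assumptions made for this example and may not match the published file; the four variable codes are standard NHANES names, but the helper functions are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of a variable dictionary. The column names and
# category labels are assumptions, not the published file layout.
DICTIONARY_CSV = """var,category,description
RIDAGEYR,demographics,Age in years at screening
BMXBMI,body measures,Body mass index (kg/m^2)
LBXTC,biochemistry,Total cholesterol (mg/dL)
LBXGLU,biochemistry,Fasting glucose (mg/dL)
"""


def load_dictionary(text):
    """Parse the dictionary CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def find_variables(rows, category=None, keyword=None):
    """Return variable names matching a category and/or a description keyword."""
    matches = []
    for row in rows:
        if category is not None and row["category"] != category:
            continue
        if keyword is not None and keyword.lower() not in row["description"].lower():
            continue
        matches.append(row["var"])
    return matches


rows = load_dictionary(DICTIONARY_CSV)
print(find_variables(rows, category="biochemistry"))
print(find_variables(rows, keyword="body mass"))
```

The selected variable names could then be used to pick columns from the unified data file.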

The datasets are available on DataDryad and a hands-on analytics tutorial is available on GitHub. Through a new big data platform, the BD2K Patient Centered Information Commons (http://pic-sure.org), we provide a new way to browse the dataset via a web browser, as well as an application programming interface for programmatic access.


The integration of clinical and biomedical data hosted in multiple distributed repositories is confronted by two significant challenges: i) correctly linking information pertaining to the same patient across repositories, for example, linking lab results data with bedside observations data; and ii) making data available for analysis at different locations across a collaboration network. These problems are exacerbated in the case of rare diseases research, given the very limited availability of data sets and data standards.

We propose to develop the NCAT Global Repository for Rare Diseases Research (GRDR), based on the BD2K PIC-SURE platform, to address these challenges. The NCAT GRDR repository will be a scalable, secure, and flexible integration architecture for clinical and biomedical datasets. By extending the successful i2b2/tranSMART platform, it will allow data providers to easily share their data with the wider research community without requiring them to subscribe to proprietary vocabulary standards or to develop complex mapping protocols. Using federated data access and querying methods that retrieve relevant data from different locations before combining them, GRDR will make it possible to run comparative analysis methods on the integrated datasets. By assigning generic identifiers (after de-identification) to related data across locations, GRDR will ease the difficulties of linking data while conforming to the requirements of patient data privacy and other security regulations.
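
The two ideas in this proposal, generic identifiers and federated querying, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GRDR or PIC-SURE implementation: the keyed hash (HMAC-SHA256) used to derive generic identifiers, the shared key, the Site class, and all field names are hypothetical choices made for the example.

```python
import hashlib
import hmac

# Assumption: all sites share one secret key, so the same patient maps to
# the same generic identifier at every location. Demo value only.
SHARED_KEY = b"demo-key-not-for-production"


def generic_id(local_patient_key):
    """Derive a generic identifier from a local key with a keyed hash."""
    digest = hmac.new(SHARED_KEY, local_patient_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


class Site:
    """Stand-in for one repository that answers queries locally."""

    def __init__(self, name, records):
        self.name = name
        # Keep only de-identified rows: generic identifier plus payload.
        self.rows = [(generic_id(key), payload) for key, payload in records]

    def query(self, predicate):
        """Return the local rows whose payload satisfies the predicate."""
        return [(gid, payload) for gid, payload in self.rows if predicate(payload)]


def federated_query(sites, predicate):
    """Run the query at each site, then combine results by generic identifier."""
    combined = {}
    for site in sites:
        for gid, payload in site.query(predicate):
            combined.setdefault(gid, {}).update(payload)
    return combined


# Hypothetical data: lab results at one site, bedside observations at another.
lab_site = Site("labs", [("patient-001", {"creatinine_mg_dl": 1.1}),
                         ("patient-002", {"creatinine_mg_dl": 0.8})])
bedside_site = Site("bedside", [("patient-001", {"heart_rate_bpm": 72})])

result = federated_query([lab_site, bedside_site], lambda payload: True)
for gid, merged in result.items():
    print(gid, merged)
```

In this sketch each site evaluates the query on its own rows and only matching de-identified rows are combined, so lab results and bedside observations for the same patient end up under one generic identifier.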


The Exposome Data Warehouse (EDW) is a unified database of environmental information that enables quick data linkage between geolocated environmental information and individual-level data (i.e., from electronic health records). Currently, EDW contains EPA air data, NOAA weather data, and American Community Survey socioeconomic and demographic data.

Please visit https://github.com/hms-dbmi/exposomeDW_public for tutorials and other EDW-related information.
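
To make the idea of linking geolocated environmental information to individual-level data concrete, here is a minimal Python sketch that attaches the reading from the nearest monitoring station to each person, using the haversine great-circle distance. It does not reflect how EDW itself performs linkage: the nearest-station rule, the station coordinates, the PM2.5 values, and all field names are assumptions made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical monitoring stations with made-up readings (illustration only).
STATIONS = [
    {"station": "A", "lat": 42.36, "lon": -71.06, "pm25": 7.5},
    {"station": "B", "lat": 40.71, "lon": -74.01, "pm25": 9.0},
]


def link_to_nearest_station(person, stations):
    """Return a copy of the person record with the nearest station's reading."""
    nearest = min(
        stations,
        key=lambda s: haversine_km(person["lat"], person["lon"], s["lat"], s["lon"]),
    )
    linked = dict(person)
    linked["station"] = nearest["station"]
    linked["pm25"] = nearest["pm25"]
    return linked


# Hypothetical individual-level records with geocoded locations.
people = [
    {"person_id": "p1", "lat": 42.37, "lon": -71.11},
    {"person_id": "p2", "lat": 40.73, "lon": -73.99},
]
linked_people = [link_to_nearest_station(p, STATIONS) for p in people]
for row in linked_people:
    print(row)
```

A real linkage would typically also match on time (for example, the date of a clinical encounter) and might use administrative areas rather than point distances.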