Abstract

Background: The Cancer Research Network (CRN) developed the Virtual Data Warehouse (VDW), a locally maintained dataset with consistent data content and structure across the CRN member sites. The VDW was used to conduct the multi-site study, ‘Is Stroke a Late Effect of Chemotherapy?’ The administrative and data extraction processes used to conduct the study are described.

Methods: Four CRN sites participated in this ‘data-only’ study: Henry Ford Health System (HFHS) in Detroit, MI; Group Health Cooperative (GHC) based in Seattle, WA; Kaiser Permanente Colorado (KPCO) in Denver, CO; and Kaiser Permanente Northern California. Overall efforts were led by a non-CRN site, Wake Forest University (WFU) in Winston-Salem, NC. All four CRN sites contributed data, HFHS served as the programming site, and WFU served as the data coordinating and analysis center (DCC). Sites obtained Institutional Review Board (IRB) approval. The VDW tables accessed included Tumor Registry, HMO Enrollment, Diagnoses, and Pharmacy. The CRN study programmer wrote SAS programs to identify cancer cases, covariates, and stroke outcomes. The VDW programs were beta tested by a second study site. After testing, the programs were provided to the remaining two sites for data retrieval. The programs transformed the data into an entirely de-identified dataset by converting the cancer diagnosis date and all other dates to ‘age in days’. The resulting data files were transferred to the DCC using the CRN secure file transfer protocol.
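
The date-to-‘age in days’ conversion described above can be illustrated with a minimal sketch. The study programs were written in SAS and are not reproduced here; the Python below is only an illustration, and it assumes that ‘age in days’ means the number of days between a person's birth date and each event date. The field names (`study_id`, `birth_date`, `dx_date`, `stroke_date`) are hypothetical and are not VDW variable names.

```python
from datetime import date
from typing import Optional


def to_age_in_days(birth_date: date, event_date: Optional[date]) -> Optional[int]:
    """Return age in days at event_date, or None if the event did not occur.

    Assumption: 'age in days' is measured from the birth date.
    """
    if event_date is None:
        return None
    return (event_date - birth_date).days


def deidentify(record: dict) -> dict:
    """Replace calendar dates with ages in days and drop the birth date.

    All field names are hypothetical, chosen for illustration only.
    """
    birth = record["birth_date"]
    return {
        "study_id": record["study_id"],
        "dx_age_days": to_age_in_days(birth, record["dx_date"]),
        "stroke_age_days": to_age_in_days(birth, record.get("stroke_date")),
    }


# Fabricated example record, not study data.
example = {
    "study_id": 1,
    "birth_date": date(1950, 1, 1),
    "dx_date": date(2000, 1, 1),
    "stroke_date": None,
}
deidentified = deidentify(example)
```

Because every date is expressed relative to the same per-person origin, intervals such as time from diagnosis to stroke remain computable at the DCC, while no calendar date leaves the contributing site.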

Results: Four different levels of IRB approval were necessary across the five sites, indicating local differences in the interpretation of human subjects requirements. The VDW program development period at the lead programming site required two months, after which data were available. Beta testing took two days to complete. The third site, which ran the programs as ‘plug and play’ after beta testing, worked out some additional issues requiring slight program modifications. The time from posting the final VDW program to upload of study data was less than one week for the two sites with mature VDW structures. The fourth site delivered data within three months. Preliminary analyses were generated within one month.

Conclusions: Using the VDW provided efficiencies in labor, cost, and time. Such studies increase the overall value of the VDW as the skills of users improve.

  • Received September 11, 2008.

Keywords