Abstract
Background/Aims The Virtual Data Warehouse (VDW) was created as a mechanism to produce comparable data across sites for proposing and conducting research. The VDW is not a multi-site physical database at a centralized location but a distributed ‘virtual’ database, with the data remaining at the local sites. At the core of the VDW is a series of standardized file definitions. Content areas and data elements that are commonly required for research studies are identified, and a data dictionary is created for each content area, specifying a common format for each element: variable name, extended definition, and code values. Local site programmers have mapped the data elements from their legacy data systems onto this standardized set of variable definitions, names, and codes, as well as onto standardized SAS file formats. This common file structure enables a SAS analyst at one site to write a single program that extracts and/or analyzes data at all participating sites.
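The mapping step described above can be illustrated with a minimal sketch. It is written in Python rather than SAS purely for illustration, and every field name, code value, and helper in it (DEMOGRAPHICS, LEGACY_MAP, to_standard, PAT_NUM, GENDER_CD) is a hypothetical assumption, not part of the VDW specification or of the EH implementation. It shows a data dictionary entry carrying a variable name, definition, and code values, and a site-specific map that translates one legacy record into the standardized form.

```python
# Hypothetical sketch: none of these names or codes come from the VDW specification.
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One data-dictionary element: standardized name, definition, allowed codes."""
    name: str
    definition: str
    codes: frozenset  # an empty set means the value is free-form


# Illustrative dictionary for one content area (assumed for this example).
DEMOGRAPHICS = {
    "patient_id": DictionaryEntry("patient_id", "Site-local patient identifier", frozenset()),
    "sex": DictionaryEntry("sex", "Administrative sex", frozenset({"F", "M", "U"})),
}

# Site-specific mapping: legacy field -> (standard name, legacy-to-standard code map).
LEGACY_MAP = {
    "PAT_NUM": ("patient_id", None),
    "GENDER_CD": ("sex", {"1": "M", "2": "F", "9": "U"}),
}


def to_standard(legacy_record: dict) -> dict:
    """Translate one legacy record into the standardized variable names and codes."""
    out = {}
    for legacy_name, (std_name, code_map) in LEGACY_MAP.items():
        raw = legacy_record[legacy_name]
        # Recode only when the site defines a code map for this field.
        value = raw if code_map is None else code_map.get(raw)
        entry = DEMOGRAPHICS[std_name]
        # Reject values that fall outside the dictionary's allowed codes.
        if entry.codes and value not in entry.codes:
            raise ValueError(f"{legacy_name}={raw!r} has no valid mapping to {std_name}")
        out[std_name] = value
    return out


print(to_standard({"PAT_NUM": "000123", "GENDER_CD": "2"}))
# prints {'patient_id': '000123', 'sex': 'F'}
```

Under these assumptions, each site would maintain only its own mapping table, while any program written against the standardized names could run unchanged at every site.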
Methods This poster describes the data sources used at Essentia Health (EH) for our local implementation of the VDW.
Results The EH local implementation of the VDW contains detailed medical information on EH patients. Its files contain details on 18.5 million unique medical encounters (2002–2012), 25.9 million diagnoses, and 37.7 million procedures. The VDW Enrollment file, which was created to define the patient population engaged with EH, has 390,000 “enrollment” periods for 378,000 unique patients. The Demographics file has 2.4 million records.
Conclusions The EH VDW provides an easily employable, unified central repository of data from available source files. This resource enables the sharing of compatible data in multi-site studies and also improves programming efficiency, accuracy, and completeness for local single-site studies, because resources are expended to link the legacy systems only once.

