Abstract
Background Epidemiologic studies often examine smoking and alcohol use as exposure variables when focusing on health conditions associated with these lifestyle factors. Smoking and alcohol use have also become nearly standard confounding variables in studies of other exposures. We highlight the challenges of cleaning and analyzing smoking and alcohol variables from various sources of administrative data when conducting pregnancy studies.
Methods We compared information on smoking and alcohol use among pregnant women from the VDW vitals table, the social history table, and birth certificate data from Georgia State. Smoking encounter data from the social history table were cleaned based on whether the contact dates and quit dates fell within the pregnancy period, and the various combinations of ‘yes’, ‘no’, and ‘quit’ were examined in light of these dates. Smoking categories were then created and compared with data in the VDW and on birth certificates. Similarly, we examined information on alcohol use during pregnancy from the social history table, including information reported in the ‘alcohol comment’ field.
Results (more detailed results will be available at the time of the conference): We found inconsistencies in how smoking during pregnancy was reported in birth certificate data versus the social history table. These inconsistencies were limited to smokers and former smokers; information for non-smokers was consistent across the different sources. Within the social history table, many pregnant women had ‘yes’ in the ‘drink alcohol’ field; however, many of these women also had information in the ‘alcohol comment’ field stating that they did not drink during pregnancy. After creating levels of alcohol use during pregnancy, we found inconsistencies with the information on the birth certificate.
Conclusions It is important to assess smoking and alcohol use during pregnancy, whether as confounding variables or as exposure variables. However, we highlight that these two variables (one from the VDW) are inconsistent across data sources, and both must be used with caution. NOTE: We feel that a presentation is more suitable than a poster for presenting these data.




