In the fall of this year, Fred LaPolla, a Research and Data Librarian at the NYU Health Sciences Library, was brought in to help teach an Intensive Research Practicum for Primary Care Residents. Dr. Colleen Gillespie, the Director of the Division of Education Quality in the Institute for Innovations in Medical Education and an Associate Professor in the Department of Medicine, led the practicum and wanted residents to ask a question of a secondary dataset, analyze the data, present the results, and draft a manuscript, all within 10 days. Before the practicum began, LaPolla pointed Dr. Gillespie to the NYU Data Catalog, through which she was able to contact Dr. Lorna Thorpe about the Harlem Health Advocacy Partners dataset.
The Harlem Health Advocacy Partners (HHAP) dataset was collected in five public housing developments in Harlem, New York City, where the chronic disease burden is high. Two rounds of data collection were performed: first, a telephone survey of 1,633 individuals, and second, an interventional study of 370 individuals. The variables collected across these two rounds included age, gender, race/ethnicity, employment status, health insurance, self-reported general health, self-reported mental health, level of physical activity, smoker status, BMI, blood pressure, level of social connectedness, and specific health conditions including asthma, diabetes, hypertension, and depression. Previous articles published with this data include “A Place-Based Community Health Worker Program: Feasibility and Early Outcomes, New York City, 2015,” published in the American Journal of Preventive Medicine.
After completing the practicum, the residents worked together with Dr. Gillespie, Dr. Thorpe, and Mr. LaPolla to submit the manuscript for publication as co-authors. This case study in data re-use illustrates how the NYU Data Catalog fits into the data ecosystem, connecting researchers with one another and helping people locate relevant datasets. It also illustrates how important data re-use can be to young researchers and students: it gives them access to data without the high cost of collecting it themselves or paying for it.