LDI Health Services Research Data Center

High Security Data Service for LDI Senior Fellows


Established in 2005, the LDI Health Services Research Data Center (HSRDC) provides data services to LDI-affiliated investigators who use highly sensitive patient information in their research. The HSRDC consists of secure high-performance servers within the University of Pennsylvania’s Perelman School of Medicine, with the security protections needed for LDI-affiliated investigators and research staff to store and analyze data containing Protected Health Information.

Because it is difficult for any one investigator to sustain such a costly resource, LDI established the HSRDC so that multiple externally funded LDI investigators could pool resources and build a unified analytic platform. LDI provides sustaining support to the HSRDC to bridge the inevitable fluctuations in investigator funding, permits mentored trainees to use the system in collaboration with funded LDI-affiliated investigators, and allows unfunded researchers (LDI Fellows and post-docs) to perform pilot research in preparation for grant applications and students to conduct studies for their theses.

Data That Are Appropriate for the HSRDC

Because this computational environment is maintained at a high security level in accordance with federal regulations governing secure computer systems (e.g., the Federal Information Security Management Act, or FISMA), the HSRDC is limited solely to research using data that require high security (e.g., data with individually identifiable protected health information). Lower-security datasets (e.g., anonymous patient surveys, de-identified data, and publicly available data) should be stored and analyzed more economically on other University resources, such as Penn+Box.

The HSRDC server cluster is generally designed for analysis of “complete” databases that are uploaded once by our IT administrators and updated no more often than through annual “refreshes.” Because users are not allowed to upload their own data, the HSRDC is not an appropriate storage infrastructure for data from clinical trials, surveys, or other data sources that require more frequent updates.

Documentation Requirements

If the data to be stored and analyzed on the HSRDC include Protected Health Information that was obtained outside of the University of Pennsylvania, a Data Use Agreement (DUA) specifically permitting storage of the data on the HSRDC is needed to comply with federal regulations (HIPAA). HSRDC staff must be provided with an executed (signed) version of the DUA, as well as appropriate IRB documentation, before data will be uploaded. Current DUA documentation must be sent to HSRDC staff annually.

Technical Details

HSRDC servers run Red Hat Linux, and data can be stored and analyzed using SAS, Stata, and R. Please note that Windows-based programs such as Microsoft Excel and Microsoft Access are not available in the HSRDC environment. Users wishing to use other software (such as Python or MySQL) on the server are fully responsible for any licensing costs, and installation of such software will be evaluated by HSRDC staff on a case-by-case basis.