Population Health Research at CPHIT

Explore population health research using health data, predictive modeling, and novel methods at the Johns Hopkins Center for Population Health IT

Our work at the Center spans three primary domains: predictive modeling and novel methods, big data, and social predictive modeling. 

Center researchers leverage insurance claims, electronic medical records, hospital discharges, social data, and 'big data' to advance population health research and interventions.

Predictive Modeling & Novel Methods

Developing Next Generation EHR-Supported Predictive Modeling: Developing the Johns Hopkins “e-ACG” System

CPHIT and the Johns Hopkins ACG R&D unit housed in the Center have a major project underway to use new clinical digital data streams to enhance current predictive and analytic models. This project is being done in collaboration with faculty from the Bloomberg School of Public Health, Johns Hopkins School of Medicine, and the Department of Computer Science. The EHR elements being incorporated in advanced models include vital signs, lab values, cardiovascular data, clinician notes, and patient reports. The goal of this project is to advance the state of the art of EHR-based predictive modeling tools for high-risk case detection and management for populations. We will identify EHR and other HIT elements amenable to incorporation with traditional claims-based ACG measures and then test how best to integrate a combination of elements with the ACG System to enhance our predictive modeling ability. Over the various phases of this project, we will apply not only structured, readily available EHR/clinical data sources but also Natural Language Processing (NLP) text mining approaches to capture information from unstructured data sources. We are also exploring other machine learning techniques to develop prediction models that can be applied on a dynamic, real-time basis to augment clinical and population decision support systems.

The Development and Testing of the Frailty Component of a Novel EHR-Based “Geriatric e-risk” Measure for Predictive Modeling

The initial goal of this project is to develop advanced predictive modeling tools for high-risk case detection and management for geriatric populations and to understand the added value of free text to current predictive modeling. This project expanded the Johns Hopkins University ACG frailty metric system using structured data from both claims and EHRs. A unique aspect of this project was an extensive (and very successful) text mining of clinician free-text notes extracted from the EHR records of 20,000+ Medicare Advantage patients at the study site. CPHIT used sophisticated "regex" text matching techniques to reach very high accuracy levels of geriatric risk identification based on information in the text that was not found in either the structured claims or the structured EHR data. Johns Hopkins University computer science faculty used more complex natural language processing (NLP) techniques. Work has expanded to explore social determinants of health in a Medicare and Medicaid population.
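As a rough illustration of what regex-based text matching on clinician notes can look like, the Python sketch below flags a few frailty-related concepts in a free-text note and skips mentions that are directly negated. The concept names, the patterns, and the negation rule are all hypothetical and chosen for illustration; they are not the patterns used in the project.

```python
import re

# Hypothetical frailty-related concepts and patterns (illustrative only,
# not the patterns used in the CPHIT project).
PATTERNS = {
    "falls": re.compile(r"\b(fall|falls|fell)\b", re.IGNORECASE),
    "walking_difficulty": re.compile(
        r"\b(difficulty|trouble)\s+walking\b", re.IGNORECASE
    ),
    "weight_loss": re.compile(
        r"\b(unintentional\s+)?weight\s+loss\b", re.IGNORECASE
    ),
}

# Deliberately simple negation rule (an assumption): a negation cue that ends
# right before the match, with at most one intervening word.
NEGATION = re.compile(
    r"\b(no|denies|without|negative for)\b(?:\W+\w+){0,1}\W*$", re.IGNORECASE
)


def find_concepts(note):
    """Return the set of concept names mentioned, and not negated, in a note."""
    found = set()
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(note):
            # Look only at a short window of text before the match.
            preceding = note[max(0, match.start() - 40):match.start()]
            if not NEGATION.search(preceding):
                found.add(name)
    return found


if __name__ == "__main__":
    print(find_concepts("Patient denies falls but reports difficulty walking."))
```

A production approach would need a much richer vocabulary, handling of misspellings and abbreviations, and validation against manually reviewed charts; the sketch only shows the basic mechanics of pattern matching with a negation check.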

Identifying Social Determinants of Health in Electronic Health Records and Administrative Claims: Comparing Structured, Unstructured, and Geo-Derived Information

The overarching goal of this project is to identify social determinants of health (SDoH) in electronic health records (EHRs) and administrative claims for a Medicare and Medicaid population. We will continue to explore NLP techniques to mine the free text of EHR clinical notes. We will identify patients who have a mention of an SDoH, such as housing instability or food insecurity, in the EHR or claims. Using the census block group geographic level, we can identify patients who live in neighborhoods with high rates of SDoH issues (e.g., food deserts) and compare how neighborhood issues match individual issues recorded in the EHR. Results from this project will be used to help screen and identify patients with unmet SDoH needs.
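One simple way to compare neighborhood-level and individual-level information is a two-by-two cross-tabulation: does the patient's record mention the issue, and does the patient live in a block group flagged for that issue? The Python sketch below assumes each patient record already carries a census block group identifier and a boolean flag for an EHR mention; the field names and the identifiers are made up for illustration.

```python
from collections import Counter


def concordance(patients, flagged_block_groups):
    """Cross-tabulate individual EHR mentions against neighborhood flags.

    patients: iterable of dicts with hypothetical keys
        "block_group" (str) and "ehr_mention" (bool).
    flagged_block_groups: set of block group IDs flagged for the issue
        (e.g., classified as a food desert).
    Returns a Counter keyed by (ehr_mention, lives_in_flagged_area).
    """
    counts = Counter()
    for patient in patients:
        in_flagged_area = patient["block_group"] in flagged_block_groups
        counts[(patient["ehr_mention"], in_flagged_area)] += 1
    return counts


if __name__ == "__main__":
    # Made-up block group IDs and flags, for illustration only.
    patients = [
        {"block_group": "BG-001", "ehr_mention": True},
        {"block_group": "BG-001", "ehr_mention": False},
        {"block_group": "BG-002", "ehr_mention": False},
        {"block_group": "BG-002", "ehr_mention": True},
    ]
    table = concordance(patients, flagged_block_groups={"BG-001"})
    for key in sorted(table):
        print(key, table[key])
```

The off-diagonal cells are the interesting ones: patients with a recorded need who live outside flagged neighborhoods, and patients in flagged neighborhoods with no recorded need.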

Big Data

Linkage of Rx, Medical, Corrections and Social Data to Identify Persons at Risk for Opioid Overdose and Other Adverse Effects

This is a "Harold Rogers" Grant from the US Department of Justice (DoJ) Bureau of Justice Assistance to the Maryland Department of Health and Mental Hygiene (DHMH) for the purpose of linking the PDMP (controlled drug prescribing database) with a wide range of medical, public health, social, and justice/corrections data in the State of Maryland to develop a predictive model that identifies persons at risk for opioid overdose. This is a collaborative effort of the DHMH, the State of Maryland Health Information Exchange (HIE) (known as CRISP), and CPHIT. JHU is the technical/methods/analytic lead. We will be developing predictive models using this broad array of risk data that will provide a risk score that can be used by clinicians and public health programs to potentially guide interventions that can decrease harm to individuals and populations. In addition to data linkage, technical, and analytic issues, we will also grapple with legal and ethical frameworks for how to appropriately use these sensitive sources of data.

Addressing Suicide Research Gaps: Understanding Mortality Outcomes in the Mid-Atlantic Region

The overall goal of this research project is to identify patterns of clinical encounters and characteristics of individuals who have died by suicide. We will link multiple novel databases at both the individual and census block group level. Once linked, we will develop methods to find novel patterns and predict suicide. Linked data sources will include the Office of the Medical Examiner, various claims databases such as in-patient and out-patient data from the Health Services Cost Review Commission, electronic health record (EHR) data from both Johns Hopkins Health System and Sheppard Pratt Health System, and others. We also hope to include other data sources such as the American Community Survey and Child Protective Services. Data linkage will occur in collaboration with the Maryland Health Information Exchange, CRISP. Using its master patient index, CRISP will assign a study ID to each data set, allowing the individual data sets to be linked at the individual level. The work is still in preliminary phases as we gather all the appropriate approvals for the various data sources.
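The study ID step can be pictured as a crosswalk: a master patient index maps each source-specific identifier to one person, each person gets a random study ID, and every data set has its local identifier replaced with that study ID before linkage. The Python sketch below shows this under simplified assumptions; the data structures, field names, and source names are hypothetical and do not describe CRISP's actual systems.

```python
import secrets


def build_crosswalk(mpi):
    """Map each (source, local_id) pair to a random study ID, one per person.

    mpi: dict mapping (source, local_id) -> person key, standing in for a
    master patient index (hypothetical structure).
    """
    study_ids = {}
    crosswalk = {}
    for key, person in mpi.items():
        if person not in study_ids:
            # Random ID that carries no information about the person.
            study_ids[person] = secrets.token_hex(8)
        crosswalk[key] = study_ids[person]
    return crosswalk


def apply_study_ids(source, records, crosswalk):
    """Replace each record's local_id with a study ID; drop unmatched records."""
    output = []
    for record in records:
        study_id = crosswalk.get((source, record["local_id"]))
        if study_id is None:
            continue  # no match in the index
        cleaned = {k: v for k, v in record.items() if k != "local_id"}
        cleaned["study_id"] = study_id
        output.append(cleaned)
    return output


if __name__ == "__main__":
    # Made-up identifiers: the same person appears in two sources.
    mpi = {("claims", "C-17"): "person-1", ("ehr", "E-903"): "person-1"}
    crosswalk = build_crosswalk(mpi)
    claims = apply_study_ids("claims", [{"local_id": "C-17", "visits": 3}], crosswalk)
    ehr = apply_study_ids("ehr", [{"local_id": "E-903", "notes": 12}], crosswalk)
    print(claims[0]["study_id"] == ehr[0]["study_id"])
```

Keeping the crosswalk with the linking party and releasing only study IDs to analysts is one common design for this kind of linkage; the governance details in practice depend on the approvals for each data source.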

Social Predictive Modeling

Geo Social Analytic Platform (GSAP)

CPHIT is developing a large Geo Social Analytic Platform (GSAP) database of publicly available data at the census tract level. This database will link American Community Survey (ACS), ArcGIS data (as available), road systems, and other geographic level data to design a database that can be utilized to understand non-medical factors associated with specific conditions, utilization, cost, etc.

DST GSAP: Collaborating with our Industry Partner, DST, we explored predicting hospitalization and Emergency Department (ED) admissions using administrative claims data and the ADI.

ACGSAP: CPHIT is embarking on a potential three-year project that will allow for the seamless exchange of data from the GSAP platform and ACG Software. This initiative will further explore how non-clinical factors can be used to improve population health.

Baltimore Falls Reduction Initiative Engaging Neighborhoods and Data (B’FRIEND)

The emergence of new sources of data has created an unprecedented opportunity to improve public health. It is now possible to use real-time information from healthcare systems to assess community health, monitor progress through surveillance, develop advanced models to predict population-based morbidity trajectories, and innovate to deliver meaningful improvements in health outcomes. The B’FRIEND Initiative is a public-private partnership in Baltimore City based on the innovative use of health data to decrease the rate of falls leading to hospital admissions among the elderly. We are developing the methodology to identify falls in elderly patients using case-mix data and are working with the state HIE (CRISP) to implement these methods and start developing reports on when falls occur. Building on this methodology and geo-coded data, we will develop a methodology to identify where falls are occurring. We are also assessing the quality of GIS-bound datasets for hot-spotting methodology and testing spatiotemporal trajectories. Using both hospital and other community data, we will develop a risk score for predicting falls and determine risk factors that may be associated with falls. This information will be shared, and we will work with CRISP and the city to implement these risk scores.
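As one possible illustration of identifying falls and where they occur, the Python sketch below flags hospital encounters that carry an ICD-10-CM external cause code in the W00-W19 range (the block covering slipping, tripping, stumbling, and falls) and counts them by geographic area. The use of these codes, the age cutoff of 65, and all field names are assumptions made for this example, not the initiative's actual case definition.

```python
import re
from collections import Counter

# ICD-10-CM external cause codes W00-W19 cover falls; matching on this range
# is an assumption for illustration, not the project's case definition.
FALL_CODE = re.compile(r"^W(0\d|1\d)")


def fall_counts_by_area(encounters, min_age=65):
    """Count fall-related encounters per area among patients aged min_age+.

    encounters: iterable of dicts with hypothetical keys
        "age" (int), "dx_codes" (list of str), and "area" (str).
    """
    counts = Counter()
    for encounter in encounters:
        if encounter["age"] < min_age:
            continue
        if any(FALL_CODE.match(code) for code in encounter["dx_codes"]):
            counts[encounter["area"]] += 1
    return counts


if __name__ == "__main__":
    # Made-up encounters for illustration only.
    encounters = [
        {"age": 80, "dx_codes": ["S72.001A", "W19.XXXA"], "area": "tract-A"},
        {"age": 70, "dx_codes": ["W01.0XXA"], "area": "tract-A"},
        {"age": 82, "dx_codes": ["S72.001A"], "area": "tract-B"},
    ]
    print(fall_counts_by_area(encounters).most_common())
```

Counts like these would normally be turned into rates using the older-adult population of each area before any hot-spot comparison.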

Social Predictive Modeling/Geographic Database Development

CPHIT is developing a large database of publicly available data at the census tract level. This database will link American Community Survey (ACS), ArcGIS data (as available), road systems, and other geographic level data. The goal is to design a database that can be utilized to understand non-medical factors associated with specific conditions, utilization, cost, etc. We intend to use this community-level database in support of R&D projects both at CPHIT and across JHU. Preliminary analysis is being conducted using commercial claims data from the Maryland Healthcare Cost Commission, linked to the Area Deprivation Index (ADI), a composite score of neighborhood socio-economic status. A number of different projects fall under this large umbrella; two are described below.

a) DST GSAP: Collaborating with our Industry Partner, DST, we explored predicting hospitalization and Emergency Department (ED) admissions using administrative claims data and the ADI. Preliminary analysis is conducted at the zip code level. More work is needed to explore how non-clinical factors such as the ADI may help in predicting various health outcomes and identifying patients with various social needs. We continue to explore various spatial statistical operations and integration techniques.

b) ACGSAP: CPHIT is embarking on a potential three-year project that will allow for the seamless exchange of data from the GSAP platform and ACG Software. This initiative will further explore how non-clinical factors can be used to improve population health, from identifying patients with unmet social needs to providing patients with information about community organizations that can assist with a variety of social needs. The goal is to provide end users with the desired and needed geo-social variables and markers as well as risk scores that can enhance population health management efforts.
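The zip-code-level linkage of claims to the ADI described in this section can be sketched as a simple join followed by a comparison of outcome rates between higher- and lower-deprivation areas. The Python example below is a minimal sketch under stated assumptions: the field names, the ADI values, and the threshold are all made up, and a real analysis would use a predictive model with clinical covariates rather than a two-group comparison.

```python
def hospitalization_rate_by_adi(members, adi_by_zip, threshold):
    """Compare hospitalization rates for members in high- vs low-ADI zip codes.

    members: iterable of dicts with hypothetical keys
        "zip" (str) and "hospitalized" (bool).
    adi_by_zip: dict mapping zip code -> ADI score (higher = more deprived).
    threshold: score at or above which a zip code counts as high ADI.
    Members whose zip code has no ADI value are excluded.
    """
    totals = {"high_adi": [0, 0], "low_adi": [0, 0]}  # [hospitalized, members]
    for member in members:
        adi = adi_by_zip.get(member["zip"])
        if adi is None:
            continue
        group = "high_adi" if adi >= threshold else "low_adi"
        totals[group][0] += int(member["hospitalized"])
        totals[group][1] += 1
    return {
        group: (hospitalized / count if count else None)
        for group, (hospitalized, count) in totals.items()
    }


if __name__ == "__main__":
    # Made-up zip codes, ADI scores, and outcomes for illustration only.
    adi_by_zip = {"00001": 80, "00002": 20}
    members = [
        {"zip": "00001", "hospitalized": True},
        {"zip": "00001", "hospitalized": False},
        {"zip": "00002", "hospitalized": False},
        {"zip": "00002", "hospitalized": False},
    ]
    print(hospitalization_rate_by_adi(members, adi_by_zip, threshold=50))
```

In a modeling context, the joined ADI value would more typically enter as one predictor alongside claims-based risk measures, so that its added predictive value can be assessed directly.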