This repository contains the complete SQL pipeline developed to process Virginia's All-Payer Claims Database (APCD) and extract pregnancy episodes for downstream maternal health analyses. The pipeline supports high-precision cohort definition, time-interval labeling, and episode-based risk summarization.
procedures/: SQL stored procedures implementing pregnancy episode inference, gestational labeling, risk evaluation, and cost aggregation.table_definitions/: Definitions and data transformation queries for working tables, including flattening and filtering logic.view_definitions/: SQL definitions for materialized or temporary views summarizing risk factors, outcomes, and costs.
Note: We assume all raw APCD tables are preloaded into a read-only database named src_db. This database remains unaltered (except for performance indexing). All transformation, filtering, and enrichment are performed in a separate writable database, nursing_production.
-
Mutable Table Construction
Usemutable_facility.sqlandmutable_professional.sqlto extract relevant fields and generate working tables with write access. -
Exclude Reversed Claims
Executemark_revered_claims.sqlto tag claims marked 'PAID' that have matching 'REVERSED' counterparts. This adds amatchedcolumn, used to filter out invalid records. -
Maternal Claims Extraction
Runnursing_claims_excluding_reversals.sqlto isolate and reformat records related to maternal care. This includes collapsing multiple ICD-10 fields into a single row-level representation for easier algorithmic processing. -
Gestational Age Estimation
Useadd_gestational_week_column.sqlto infer gestational age (in weeks) fromZ3AICD-10 codes where available. -
Pregnancy Termination Detection
Callassign_pregnancy_groups_advanced.sqlto detect pregnancy terminations (births or losses) based on diagnosis codes. Each termination is validated and assigned an enumerated label such as1#Bor2#L, representing ordered terminations of type Birth or Loss.- Births are considered valid if no other termination occurs in the prior 6 months.
- Losses are considered valid if no other termination is observed in the prior 8 weeks.
-
Hospitalization Data Enrichment
Executeall_hospitalization_dates.sqlto identify inpatient care windows. Then runextend_hospitalization_care.sqlto associate terminations with full delivery episodes (from admission to discharge). -
Postpartum Labeling
Calllabel_postpartum_intervals.sqlto generate an 8-week isolation window following each valid termination. Claims in this window are tagged with identifiers such asPO#1. -
Prenatal Labeling
Runlabel_prenatal_intervals.sqlto mark the prenatal phase leading up to each termination:- When gestational age is available, pregnancy is back-calculated from the termination date.
- Otherwise, a default 36-week interval is applied.
- The start date is truncated to the data range (Jan 1, 2018 onward) if necessary.
Labels such as
PR#1are assigned per-pregnancy.
total_paid_cleaned_excluding_reversals.sql: Aggregates daily payments per individual, enabling cost-related metrics.pregnancy_total_paid_excluding_reversals.sql: Aggregates costs by date within each pregnancy episode.merge_insurance_periods.sql: Merges month-level insurance coverage into continuous spans for patient-level enrollment modeling.person_year_race_classified.sql: Infers race using demographic tables for cohort stratification.risk_factor_existence_on_episode_mv.sql: Materialized view indicating presence of risk factor codes at prenatal, termination, and postpartum stages.smm_indicator_summary.sql: Flags severe maternal morbidity (SMM) indicators in each stage of pregnancy.risk_factors_summary.sql: Summarizes occurrence rates of risk conditions prior to delivery.smm_summary.sql: Summarizes occurrence rates of SMM events per stage.update_total_cost_sum.sql: Annotates each episode stage with corresponding total cost.build_maternal_risk_profile_excl_rev.sql: Builds a comprehensive per-pregnancy risk profile, integrating SMM, comorbidities, risk scores, and outcome indicators. See CMQCC's Comorbidity Score for reference.