sobhanebr/va-apcd-maternal-claims

Pregnancy Episode Extraction from APCD Claims Data (Virginia, 2018–2022)

This repository contains the complete SQL pipeline developed to process Virginia's All-Payer Claims Database (APCD) and extract pregnancy episodes for downstream maternal health analyses. The pipeline supports high-precision cohort definition, time-interval labeling, and episode-based risk summarization.

Repository Structure

  • procedures/: SQL stored procedures implementing pregnancy episode inference, gestational labeling, risk evaluation, and cost aggregation.
  • table_definitions/: Definitions and data transformation queries for working tables, including flattening and filtering logic.
  • view_definitions/: SQL definitions for materialized or temporary views summarizing risk factors, outcomes, and costs.

Note: We assume all raw APCD tables are preloaded into a read-only database named src_db. This database remains unaltered (except for performance indexing). All transformation, filtering, and enrichment are performed in a separate writable database, nursing_production.

Execution Pipeline

  1. Mutable Table Construction
    Use mutable_facility.sql and mutable_professional.sql to extract relevant fields and generate working tables with write access.

  2. Exclude Reversed Claims
    Execute mark_revered_claims.sql to tag claims marked 'PAID' that have matching 'REVERSED' counterparts. The script adds a matched column, which later steps use to filter out the reversed records.
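The matching idea can be illustrated outside SQL. The following is a minimal Python sketch, not the repository's logic: the field names (member_id, service_date, amount, status) and the matching key are assumptions made for illustration, and each reversal is assumed to cancel at most one PAID claim.

```python
from collections import Counter

def _key(claim):
    # Hypothetical matching key: same member, same service date, same absolute amount.
    return (claim["member_id"], claim["service_date"], abs(claim["amount"]))

def mark_reversed(claims):
    """Return copies of the claims with a boolean 'matched' field added.

    A PAID claim is matched when an unused REVERSED claim shares its key.
    """
    reversals = Counter(_key(c) for c in claims if c["status"] == "REVERSED")
    marked = []
    for c in claims:
        matched = False
        if c["status"] == "PAID" and reversals[_key(c)] > 0:
            reversals[_key(c)] -= 1  # each reversal cancels one PAID claim
            matched = True
        marked.append({**c, "matched": matched})
    return marked

claims = [
    {"member_id": "m1", "service_date": "2019-01-05", "amount": 100.0, "status": "PAID"},
    {"member_id": "m1", "service_date": "2019-01-05", "amount": -100.0, "status": "REVERSED"},
    {"member_id": "m1", "service_date": "2019-01-05", "amount": 100.0, "status": "PAID"},
    {"member_id": "m2", "service_date": "2019-01-05", "amount": 50.0, "status": "PAID"},
]
marked = mark_reversed(claims)
# Keep only PAID claims that were not cancelled by a reversal.
valid = [c for c in marked if c["status"] == "PAID" and not c["matched"]]
```

Downstream steps would then read only the rows where the status is 'PAID' and matched is false.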

  3. Maternal Claims Extraction
    Run nursing_claims_excluding_reversals.sql to isolate and reformat records related to maternal care. This includes collapsing multiple ICD-10 fields into a single row-level representation for easier algorithmic processing.
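As a rough illustration of the collapsing step, the Python sketch below joins several diagnosis columns into one delimited string. The column prefix icd_dx_ and the | separator are hypothetical; the actual column names and representation in the SQL script may differ.

```python
def collapse_icd_fields(row, prefix="icd_dx_", sep="|"):
    """Join all non-empty diagnosis columns of one claim row into one string.

    Columns are read in numeric order of their suffix; duplicates are dropped.
    """
    names = sorted(
        (k for k in row if k.startswith(prefix)),
        key=lambda k: int(k[len(prefix):]),
    )
    codes = []
    for name in names:
        code = (row[name] or "").strip().upper()
        if code and code not in codes:
            codes.append(code)
    return sep.join(codes)

row = {
    "claim_id": 1,
    "icd_dx_1": "O80",
    "icd_dx_2": None,
    "icd_dx_3": " z3a.39 ",
    "icd_dx_10": "O80",
}
collapsed = collapse_icd_fields(row)
```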

  4. Gestational Age Estimation
    Use add_gestational_week_column.sql to infer gestational age (in weeks) from Z3A ICD-10 codes where available.
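In ICD-10-CM, codes Z3A.08 through Z3A.42 encode the completed week of gestation in their last two digits, while Z3A.00 (not specified), Z3A.01 (less than 8 weeks), and Z3A.49 (greater than 42 weeks) do not give an exact week. A minimal Python sketch of the parsing follows; treating the inexact codes as missing is an assumption of this sketch, and the SQL script may handle them differently.

```python
import re

# Accept the code with or without the dot, e.g. "Z3A.39" or "Z3A39".
_Z3A = re.compile(r"^Z3A\.?(\d{2})$")

def gestational_weeks(code):
    """Return the gestational age in weeks for a Z3A code, or None."""
    m = _Z3A.match(code.strip().upper())
    if not m:
        return None
    n = int(m.group(1))
    # Only 08..42 carry an exact week; 00, 01 and 49 are treated as missing here.
    return n if 8 <= n <= 42 else None
```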

  5. Pregnancy Termination Detection
    Call assign_pregnancy_groups_advanced.sql to detect pregnancy terminations (births or losses) based on diagnosis codes. Each termination is validated and assigned an enumerated label such as 1#B or 2#L, representing ordered terminations of type Birth or Loss.

    • Births are considered valid if no other termination occurs in the prior 6 months.
    • Losses are considered valid if no other termination is observed in the prior 8 weeks.
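The validation rule can be sketched in Python as follows. Several details are assumptions of this sketch rather than documented behavior: 6 months is approximated as 183 days, each candidate is compared only against the most recent valid termination, and the enumeration counter is shared across births and losses.

```python
from datetime import date, timedelta

# Lookback windows per termination type: B = birth, L = loss.
LOOKBACK = {"B": timedelta(days=183), "L": timedelta(weeks=8)}

def label_terminations(events):
    """Label valid terminations for one person.

    events: list of (date, kind) tuples with kind 'B' or 'L'.
    Returns a list of (date, label) such as (date(2019, 1, 10), '1#B').
    """
    labels = []
    last_valid = None
    for day, kind in sorted(events):
        if last_valid is None or day - last_valid > LOOKBACK[kind]:
            labels.append((day, f"{len(labels) + 1}#{kind}"))
            last_valid = day
    return labels

events = [
    (date(2019, 1, 10), "B"),
    (date(2019, 1, 12), "B"),  # within 6 months of the first birth: dropped
    (date(2019, 4, 1), "L"),
    (date(2019, 4, 20), "L"),  # within 8 weeks of the previous loss: dropped
    (date(2020, 3, 1), "B"),
]
labels = label_terminations(events)
```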
  6. Hospitalization Data Enrichment
    Execute all_hospitalization_dates.sql to identify inpatient care windows. Then run extend_hospitalization_care.sql to associate terminations with full delivery episodes (from admission to discharge).
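One way to picture the association is to widen each termination date to the inpatient stay that contains it. The Python sketch below assumes stays are given as (admission, discharge) date pairs and that a termination with no containing stay keeps a single-day episode; both are illustrative assumptions.

```python
from datetime import date

def delivery_episode(termination, stays):
    """Return the (admission, discharge) window containing the termination.

    Falls back to a one-day window when no stay covers the termination date.
    """
    for admit, discharge in stays:
        if admit <= termination <= discharge:
            return admit, discharge
    return termination, termination

stays = [
    (date(2019, 1, 8), date(2019, 1, 12)),
    (date(2019, 6, 1), date(2019, 6, 3)),
]
episode = delivery_episode(date(2019, 1, 10), stays)
```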

  7. Postpartum Labeling
    Call label_postpartum_intervals.sql to generate an 8-week isolation window following each valid termination. Claims in this window are tagged with identifiers such as PO#1.
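A minimal Python sketch of the window test is shown below. The boundary handling (the window starts the day after the episode ends and includes day 56) is an assumption, as is the (end_date, index) input shape.

```python
from datetime import date, timedelta

POSTPARTUM = timedelta(weeks=8)

def postpartum_label(claim_date, terminations):
    """Return 'PO#<n>' if the claim falls in a postpartum window, else None.

    terminations: list of (episode_end_date, index) for one person.
    """
    for end, idx in terminations:
        if end < claim_date <= end + POSTPARTUM:
            return f"PO#{idx}"
    return None

terminations = [(date(2019, 1, 10), 1)]
```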

  8. Prenatal Labeling
    Run label_prenatal_intervals.sql to mark the prenatal phase leading up to each termination:

    • When gestational age is available, the pregnancy start date is back-calculated from the termination date.
    • Otherwise, a default 36-week interval is applied.
    • If the computed start date precedes the data range, it is truncated to Jan 1, 2018.
    • Labels such as PR#1 are assigned per pregnancy.
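The back-calculation amounts to simple date arithmetic. The Python sketch below uses the 36-week default and the Jan 1, 2018 lower bound stated above; the function name and signature are hypothetical.

```python
from datetime import date, timedelta

DATA_START = date(2018, 1, 1)  # first day covered by the data
DEFAULT_WEEKS = 36             # used when no gestational age is known

def prenatal_start(termination, gestational_weeks=None):
    """Estimate the first day of the prenatal interval for one pregnancy."""
    weeks = DEFAULT_WEEKS if gestational_weeks is None else gestational_weeks
    start = termination - timedelta(weeks=weeks)
    # Truncate to the data range when the estimate falls before it.
    return max(start, DATA_START)
```

For example, a termination on Oct 1, 2019 with a recorded gestational age of 39 weeks gives a start date of Jan 1, 2019, while the same termination without a recorded age gives Jan 22, 2019.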

🔍 Downstream Analytic Modules

  • total_paid_cleaned_excluding_reversals.sql: Aggregates daily payments per individual, enabling cost-related metrics.
  • pregnancy_total_paid_excluding_reversals.sql: Aggregates costs by date within each pregnancy episode.
  • merge_insurance_periods.sql: Merges month-level insurance coverage into continuous spans for patient-level enrollment modeling.
  • person_year_race_classified.sql: Infers race using demographic tables for cohort stratification.
  • risk_factor_existence_on_episode_mv.sql: Materialized view indicating presence of risk factor codes at prenatal, termination, and postpartum stages.
  • smm_indicator_summary.sql: Flags severe maternal morbidity (SMM) indicators in each stage of pregnancy.
  • risk_factors_summary.sql: Summarizes occurrence rates of risk conditions prior to delivery.
  • smm_summary.sql: Summarizes occurrence rates of SMM events per stage.
  • update_total_cost_sum.sql: Annotates each episode stage with corresponding total cost.
  • build_maternal_risk_profile_excl_rev.sql: Builds a comprehensive per-pregnancy risk profile, integrating SMM, comorbidities, risk scores, and outcome indicators. See CMQCC's Comorbidity Score for reference.
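To illustrate the kind of logic merge_insurance_periods.sql describes, the Python sketch below merges covered months into continuous spans. It assumes coverage is given as (year, month) pairs for one person and that only directly adjacent months are merged; the SQL module may use different inputs or gap rules.

```python
def merge_months(months):
    """Merge covered (year, month) pairs into continuous spans.

    Returns a list of ((start_year, start_month), (end_year, end_month)).
    """
    # Map each month to a running index so that adjacency is a +1 step.
    indices = sorted({y * 12 + (m - 1) for y, m in months})
    spans = []
    for i in indices:
        if spans and i == spans[-1][1] + 1:
            spans[-1][1] = i        # extend the current span
        else:
            spans.append([i, i])    # start a new span after a gap
    return [((a // 12, a % 12 + 1), (b // 12, b % 12 + 1)) for a, b in spans]

coverage = [(2018, 11), (2018, 12), (2019, 1), (2019, 3), (2019, 3)]
spans = merge_months(coverage)
```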

About

A reproducible pipeline for identifying and analyzing maternal healthcare episodes in Virginia’s All‐Payer Claims Database (APCD).
