diff --git a/.Rbuildignore b/.Rbuildignore index 3e94061..a5696a8 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -11,3 +11,5 @@ ^cran-comments\.md$ ^CRAN-RELEASE$ ^\.github$ +^inst/icd/2017 +^inst/icd/2018 diff --git a/.gitignore b/.gitignore index ad1727e..b69b597 100644 --- a/.gitignore +++ b/.gitignore @@ -29,3 +29,4 @@ vignettes/*.pdf *.swo .Rproj.user inst/doc +.DS_Store diff --git a/DESCRIPTION b/DESCRIPTION index f7f7095..f9c5684 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: pccc Title: Pediatric Complex Chronic Conditions -Version: 1.0.5.9000 +Version: 1.0.6 Authors@R: c( person(given = "Peter", family = "DeWitt", email = "dewittpe@gmail.com", role = c("aut"), comment = c(ORCID = "0000-0002-6391-0795")), person(given = "Tell", family = "Bennett", email = "tell.bennett@cuanschutz.edu", role = c("ctb"), comment = c(ORCID = "0000-0003-1483-4236")), @@ -12,6 +12,7 @@ Description: An implementation of the pediatric complex chronic conditions (CCC) Depends: R (>= 3.5.0) License: GPL-2 Encoding: UTF-8 +Language: en-us LazyData: true Imports: dplyr (>= 1.0.0), diff --git a/NEWS.md b/NEWS.md index 99040ef..89839b6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,19 @@ -# Version 1.0.5.9000 +# Version 1.0.6 + +## Bug fixes + +* Code fixes -- most of these codes were flagging transplant but not the primary organ system; these were present in the original publication but missing from this package + * 996.83 added to CVD - it was flagging transplant but not CVD + * 996.84 added to respiratory + * 996.81 added to renal + * 996.82, 996.86, 996.87 added to gi + * ICD10 codes added to CVD: T86.20, T86.21, T86.22 + * M4330 is not a valid code, corrected to M433 (impacts congeni_genetic) + * Add Z94.0 to transplant + +## Extensions +* Update old URLs and add some links to wayback machine snapshots +* Add key files to the package for reference ## New functions: * S3 method `as_tibble.pccc_codes` has been added to the package to replace the diff --git
a/R/data.R b/R/data.R index 98e6d85..f75eb75 100644 --- a/R/data.R +++ b/R/data.R @@ -4,7 +4,7 @@ #' \url{https://github.com/magic-lantern/icd_file_generator}. ICD codes were taken #' from CMS. The ICD 9 diagnosis and procedure codes were generated with 20% #' missing values. Code source: -#' \url{https://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html} +#' \url{https://www.cms.gov/medicare/coding-billing/icd-10-codes/icd-9-cm-diagnosis-procedure-codes-abbreviated-and-full-code-titles} #' #' @format A data frame with 1000 rows and 31 variables. #' There is a patient identifier, ten diagnosis codes, ten procedure codes, and @@ -51,8 +51,11 @@ #' This dataset was produced from a tool available at #' \url{https://github.com/magic-lantern/icd_file_generator}. ICD codes were taken #' from CMS. The code source, for both the diagnosis and produced codes can be -#' found at \url{https://www.cms.gov/Medicare/Coding/ICD10/2017-ICD-10-CM-and-GEMs.html} -#' +#' found at +#' \url{https://www.cms.gov/medicare/coding-billing/icd-10-codes/icd-10-cm-icd-10-pcs-gem-archive} +#' with a copy of the downloaded data on the package github page, +#' \url{https://github.com/CUD2V/pccc} +#' #' @format A data frame with 1000 rows and 31 variables. #' There is a patient identifier, ten diagnosis codes, ten procedure codes, and #' ten "other data" values, specifically: diff --git a/R/pccc-package.R b/R/pccc-package.R index 096f598..31ba955 100644 --- a/R/pccc-package.R +++ b/R/pccc-package.R @@ -15,15 +15,15 @@ #' For ease, a copy of the paper is included in this package. See the examples #' below for instructions on opening this pdf from within R or outside of R. #' You can view the publication online at -#' \url{http://bmcpediatr.biomedcentral.com/articles/10.1186/1471-2431-14-199}. +#' \doi{10.1186/1471-2431-14-199}. #' #' Feudtner et. al. provided a SAS macro and STATA program to implement the CCC. #' These files are also provided for reference. 
See the Examples for #' instructions on opening these files. #' #' Lastly, the appendix tables in the file -#' Categories_of_CCCv2_and_Corresponding_ICD.docx have also been included with -#' this package. +#' \code{system.file("pccc_references", "Categories_of_CCCv2_and_Corresponding_ICD.docx", package = "pccc")} +#' have also been included with this package. #' #' @examples #' \dontrun{ diff --git a/README.md b/README.md index e273526..7abd08f 100644 --- a/README.md +++ b/README.md @@ -31,7 +31,7 @@ on large datasets. This package provides R functions to generate the CCC categories. Because the R functions are built with a C++ back-end, they are very computationally efficient. -The pccc package version 1.0.z implimented this version of the PCCC criteria. +The pccc package version 1.0.z implemented this version of the PCCC criteria. ### Version 3 of the PCCC Criteria [Pediatric Complex Chronic Condition System Version 3](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2821158) @@ -49,7 +49,7 @@ match the system version) will implement version 3 of the PCCC. ## Installation ### From CRAN -Relased version available on The Comprehensive R Archive Network at https://CRAN.R-project.org/package=pccc. +Released version available on The Comprehensive R Archive Network at https://CRAN.R-project.org/package=pccc. ### Developmental version diff --git a/cran-comments.md b/cran-comments.md index 9636219..898abeb 100644 --- a/cran-comments.md +++ b/cran-comments.md @@ -1,12 +1,15 @@ +# Version 1.0.6 + +## Notes: + +* Change in the email address of the package maintainer.
The ucdenver.edu domain + has been replaced by the cuanschutz.edu domain + ## Test environments -* local macOS 10.14.6 install R 3.6.3 -* ubuntu 16.04 on travis-ci (oldrel, release, devel) -* win-builder (devel, release, oldrelease) -* r_hub - * Ubuntu Linux 16.04 LTS, R-release, GCC - * Windows Server 2008 R2 SP1, R-devel, 32/64 bit - * Fedora Linux, R-devel, clang, gfortran - * Debian Linux, R-devel, GCC ASAN/UBSAN +* local macOS R 4.5.0 +* +* +* ## R CMD check results diff --git a/inst/WORDLIST b/inst/WORDLIST new file mode 100644 index 0000000..b90b711 --- /dev/null +++ b/inst/WORDLIST @@ -0,0 +1,34 @@ +Zhong +Wenjun +Cholera due to vibrio cholerae el tor +vctrs +rlang +ORCID +testthat +Urologic +stringsAsFactors +STATA +ICD +rda +rds +Feudtner +CCCs +CMD +CMS +cvd +CVD +Dai +CCC +ccc +BMC +Dingwei +docx +dplyr +github +icd +Kalibera +MCOD +AddressSanitizer +deprecations +Deprecations +dystrophy diff --git a/inst/icd/2017/2017-GEM-DC.zip b/inst/icd/2017/2017-GEM-DC.zip new file mode 100644 index 0000000..9bd9dd3 Binary files /dev/null and b/inst/icd/2017/2017-GEM-DC.zip differ diff --git a/inst/icd/2017/2017-ICD-10-CM-Conversion-Table-.zip b/inst/icd/2017/2017-ICD-10-CM-Conversion-Table-.zip new file mode 100644 index 0000000..d9a6aed Binary files /dev/null and b/inst/icd/2017/2017-ICD-10-CM-Conversion-Table-.zip differ diff --git a/inst/icd/2017/2017-ICD-10-CM-Guidelines.pdf b/inst/icd/2017/2017-ICD-10-CM-Guidelines.pdf new file mode 100644 index 0000000..8379097 Binary files /dev/null and b/inst/icd/2017/2017-ICD-10-CM-Guidelines.pdf differ diff --git a/inst/icd/2017/2017-ICD10-Addendum.zip b/inst/icd/2017/2017-ICD10-Addendum.zip new file mode 100644 index 0000000..fbe864b Binary files /dev/null and b/inst/icd/2017/2017-ICD10-Addendum.zip differ diff --git a/inst/icd/2017/2017-ICD10-Code-Descriptions.zip b/inst/icd/2017/2017-ICD10-Code-Descriptions.zip new file mode 100644 index 0000000..be6400a Binary files /dev/null and
b/inst/icd/2017/2017-ICD10-Code-Descriptions.zip differ diff --git a/inst/icd/2017/2017-ICD10-Code-Tables-Index.zip b/inst/icd/2017/2017-ICD10-Code-Tables-Index.zip new file mode 100644 index 0000000..1921573 Binary files /dev/null and b/inst/icd/2017/2017-ICD10-Code-Tables-Index.zip differ diff --git a/inst/icd/2017/2017-ICD10-Duplicate-Code-.zip b/inst/icd/2017/2017-ICD10-Duplicate-Code-.zip new file mode 100644 index 0000000..3be6eb1 Binary files /dev/null and b/inst/icd/2017/2017-ICD10-Duplicate-Code-.zip differ diff --git a/inst/icd/2017/2017-ICD10-POA-Exempt.zip b/inst/icd/2017/2017-ICD10-POA-Exempt.zip new file mode 100644 index 0000000..95778f7 Binary files /dev/null and b/inst/icd/2017/2017-ICD10-POA-Exempt.zip differ diff --git a/inst/icd/2017/screen_shot_maybackmachine_cms_2017_icd_10_cm_and_gems.png b/inst/icd/2017/screen_shot_maybackmachine_cms_2017_icd_10_cm_and_gems.png new file mode 100644 index 0000000..4e47e43 Binary files /dev/null and b/inst/icd/2017/screen_shot_maybackmachine_cms_2017_icd_10_cm_and_gems.png differ diff --git a/inst/icd/2018/2018-ICD-10-Addendum.zip b/inst/icd/2018/2018-ICD-10-Addendum.zip new file mode 100644 index 0000000..e324b19 Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-Addendum.zip differ diff --git a/inst/icd/2018/2018-ICD-10-CM-Coding-Guidelines.pdf b/inst/icd/2018/2018-ICD-10-CM-Coding-Guidelines.pdf new file mode 100644 index 0000000..09a95e2 Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-CM-Coding-Guidelines.pdf differ diff --git a/inst/icd/2018/2018-ICD-10-CM-Conversion-Table.zip b/inst/icd/2018/2018-ICD-10-CM-Conversion-Table.zip new file mode 100644 index 0000000..3a17a96 Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-CM-Conversion-Table.zip differ diff --git a/inst/icd/2018/2018-ICD-10-CM-Errata.pdf b/inst/icd/2018/2018-ICD-10-CM-Errata.pdf new file mode 100644 index 0000000..75dcac8 Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-CM-Errata.pdf differ diff --git 
a/inst/icd/2018/2018-ICD-10-CM-General-Equivalence-Mappings.zip b/inst/icd/2018/2018-ICD-10-CM-General-Equivalence-Mappings.zip new file mode 100644 index 0000000..c76970c Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-CM-General-Equivalence-Mappings.zip differ diff --git a/inst/icd/2018/2018-ICD-10-Code-Descriptions.zip b/inst/icd/2018/2018-ICD-10-Code-Descriptions.zip new file mode 100644 index 0000000..93322b3 Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-Code-Descriptions.zip differ diff --git a/inst/icd/2018/2018-ICD-10-Table-And-Index.zip b/inst/icd/2018/2018-ICD-10-Table-And-Index.zip new file mode 100644 index 0000000..93726de Binary files /dev/null and b/inst/icd/2018/2018-ICD-10-Table-And-Index.zip differ diff --git a/inst/icd/2018/2018-POA-Exempt-Codes.zip b/inst/icd/2018/2018-POA-Exempt-Codes.zip new file mode 100644 index 0000000..3da3092 Binary files /dev/null and b/inst/icd/2018/2018-POA-Exempt-Codes.zip differ diff --git a/inst/icd/2018/screen_shot_waybackmachine_cms_2018_icd_10_cm_and_gems.png b/inst/icd/2018/screen_shot_waybackmachine_cms_2018_icd_10_cm_and_gems.png new file mode 100644 index 0000000..7713a68 Binary files /dev/null and b/inst/icd/2018/screen_shot_waybackmachine_cms_2018_icd_10_cm_and_gems.png differ diff --git a/inst/icd/ICD9_ICD10_comparability_file_documentation.pdf b/inst/icd/ICD9_ICD10_comparability_file_documentation.pdf new file mode 100644 index 0000000..bfa85bb Binary files /dev/null and b/inst/icd/ICD9_ICD10_comparability_file_documentation.pdf differ diff --git a/man/pccc-package.Rd b/man/pccc-package.Rd index 5b32dd4..0bec943 100644 --- a/man/pccc-package.Rd +++ b/man/pccc-package.Rd @@ -16,15 +16,15 @@ The original paper, Feudtner C, et al. (2014), was publish with open access. For ease, a copy of the paper is included in this package. See the examples below for instructions on opening this pdf from within R or outside of R. 
You can view the publication online at -\url{http://bmcpediatr.biomedcentral.com/articles/10.1186/1471-2431-14-199}. +\doi{10.1186/1471-2431-14-199}. Feudtner et. al. provided a SAS macro and STATA program to implement the CCC. These files are also provided for reference. See the Examples for instructions on opening these files. Lastly, the appendix tables in the file -Categories_of_CCCv2_and_Corresponding_ICD.docx have also been included with -this package. +\code{system.file("pccc_references", "Categories_of_CCCv2_and_Corresponding_ICD.docx", package = "pccc")} +have also been included with this package. } \examples{ diff --git a/man/pccc_icd10_dataset.Rd b/man/pccc_icd10_dataset.Rd index 399100f..a63b0c1 100644 --- a/man/pccc_icd10_dataset.Rd +++ b/man/pccc_icd10_dataset.Rd @@ -49,6 +49,9 @@ pccc_icd10_dataset This dataset was produced from a tool available at \url{https://github.com/magic-lantern/icd_file_generator}. ICD codes were taken from CMS. The code source, for both the diagnosis and produced codes can be -found at \url{https://www.cms.gov/Medicare/Coding/ICD10/2017-ICD-10-CM-and-GEMs.html} +found at +\url{https://www.cms.gov/medicare/coding-billing/icd-10-codes/icd-10-cm-icd-10-pcs-gem-archive} +with a copy of the downloaded data on the package github page, +\url{https://github.com/CUD2V/pccc} } \keyword{datasets} diff --git a/man/pccc_icd9_dataset.Rd b/man/pccc_icd9_dataset.Rd index bfcd04b..9cae637 100644 --- a/man/pccc_icd9_dataset.Rd +++ b/man/pccc_icd9_dataset.Rd @@ -50,6 +50,6 @@ This dataset was produced from a tool available at \url{https://github.com/magic-lantern/icd_file_generator}. ICD codes were taken from CMS. The ICD 9 diagnosis and procedure codes were generated with 20% missing values. 
Code source: -\url{https://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html} +\url{https://www.cms.gov/medicare/coding-billing/icd-10-codes/icd-9-cm-diagnosis-procedure-codes-abbreviated-and-full-code-titles} } \keyword{datasets} diff --git a/src/pccc.cpp b/src/pccc.cpp index 9802741..6f98d03 100644 --- a/src/pccc.cpp +++ b/src/pccc.cpp @@ -25,18 +25,26 @@ codes::codes(int v) "74781","74789","9960","9961","99661","99662","V421","V422","V432","V433","V450","V4581", "V533"}; - dx_fixed_cvd = {"416"}; + dx_fixed_cvd = {"416", + "99683" // transplant too + }; dx_respiratory = {"32725","4160","4162","51630","51631","51637","51884","5190","2770","748","7704","V426", "V440","V4576","V460","V461","V550"}; - dx_fixed_respiratory = {"5163"}; + dx_fixed_respiratory = {"5163", + "99684" // transplant too + }; - dx_renal = {"34461","585","5964","59653","59654","753","99668","V420","V445","V446", + dx_renal = {"34461","585","5964","59653","59654","753","99668", + "99681", //transplant + "V420","V445","V446", "V451","V4573","V4574","V536","V555","V556","V56"}; dx_gi = {"4530","5364","555","556","5571","5602","5647","5714","5715","5716","5717", - "5718","5719","7503","751","V427","V4283","V4284","V441","V442","V443","V444","V5350", + "5718","5719","7503","751", + "99682", "99686", "99687", // transplant + "V427","V4283","V4284","V441","V442","V443","V444","V5350", "V5351","V5359","V551","V552","V553","V554"}; dx_hemato_immu = {"042","043","044","135","279","2820","2821","2822","2823","2824","2825", @@ -126,6 +134,25 @@ codes::codes(int v) "T85190A","T85192A","T85199A","T8579XA","Z982","Z4541","Z4542", "Q851"}; dx_fixed_neuromusc = {"G80"}; + // NOTE: G80 and G80.3 are listed in pccc/inst/pccc_references/Categories_of_CCCv2_and_Corresponding_ICD.docx + // However, it appears that during some dev work the following was + // determined for PCCC V2 + // G80.0 -- include + // G80.1 -- include + // G80.2 -- exclude + // G80.3 -- include + // G80.4 -- include + // G80.5 
-- (not a valid ICD code) + // G80.6 -- (not a valid ICD code) + // G80.7 -- (not a valid ICD code) + // G80.8 -- include + // G80.9 -- exclude + // + // For version 3, all the G80.x codes will be included + // See Supplement 3 from + // Feinstein JA, Hall M, Davidson A, Feudtner C. Pediatric Complex Chronic + // Condition System Version 3. JAMA Netw Open. 2024;7(7):e2420579. + // doi:10.1001/jamanetworkopen.2024.20579 dx_cvd = {"I270","I271","I272","I2781","I2789","I279","I340","I348","I360","I368","I370", "I378","I42","I43","I44","I45","I47","I48","I490","I491","I493","I494","I495","I498","I499", @@ -134,6 +161,7 @@ codes::codes(int v) "Z951","T82519A","T82529A","T82539A","T82599A","T82110A","T82111A","T82120A","T82121A", "T82190A","T82191A","T8201XA","T8202XA","T8203XA","T8209XA","T82211A","T82212A","T82213A", "T82218A","T82221A","T82222A","T82223A","T82228A","T82518A","T82528A","T82538A","T82598A", + "T8620", "T8621", "T8622", "T826XXA","T827XXA","Z941","Z950","Z952","Z95810","Z95811","Z95812","Z95818","Z953","Z45010", "Z45018","Z4502","Z4509","Z959"}; @@ -163,7 +191,7 @@ codes::codes(int v) "E830","E831","E833","E834","E88","H498","E85","E009","E230","E232","E222","E233","E237", "E240","E242","E243","E248","E249","E2681","E250","E258","E259","Z4681","Z794","Z9641"}; - dx_congeni_genetic = {"E343","K449","M410","M412","M4130","M418","M419","M4330","M965","Q722","Q750", + dx_congeni_genetic = {"E343","K449","M410","M412","M4130","M418","M419","M433","M965","Q722","Q750", "Q752","Q759","Q760","Q761","Q762","Q764","Q765","Q766","Q767","Q77","Q780","Q781","Q782", "Q783","Q784","Q788","Q789","Q790","Q791","Q792","Q793","Q794","Q799","Q795","Q8740","Q8781", "Q8789","Q897","Q899","Q909","Q913","Q914","Q917","Q928","Q93","Q950","Q969","Q97","Q98", @@ -195,7 +223,7 @@ codes::codes(int v) dx_transplant = {"T8600","T8601","T8602","T8609","T8610","T8611","T8612","T8620","T8621","T8622", "T86810","T86811","T86819","T8640","T8641","T8642","T86890","T86891","T86899","T86850", 
- "T86851","T86859","T865","T8690","T8691","T8692","T8699"}; + "T86851","T86859","T865","T8690","T8691","T8692","T8699","Z940"}; pc_neuromusc = {"0016070","0016071","0016072","0016073","0016074","0016075","0016076","0016077", "0016078","001607B","00160J0","00160J1","00160J2","00160J4","00160J5","00160J6","00160J7", diff --git a/tests/icd10_test_result.rds b/tests/icd10_test_result.rds index 36ce2e2..1838e21 100644 Binary files a/tests/icd10_test_result.rds and b/tests/icd10_test_result.rds differ diff --git a/tests/icd9_test_result.rds b/tests/icd9_test_result.rds index c1c182f..4724328 100644 Binary files a/tests/icd9_test_result.rds and b/tests/icd9_test_result.rds differ diff --git a/tests/test_ccc_icd10.R b/tests/test_ccc_icd10.R index a7cfcde..1afc348 100644 --- a/tests/test_ccc_icd10.R +++ b/tests/test_ccc_icd10.R @@ -21,7 +21,7 @@ ccc_out <- ccc(data.frame(id = letters[1:3], ccc_out$id <- as.factor(ccc_out$id) rnd_test <- readRDS("random_data_test_result.rds") rnd_test$id <- as.factor(rnd_test$id) -stopifnot(all.equal(ccc_out, rnd_test)) +stopifnot(isTRUE(all.equal(ccc_out, rnd_test))) #test_that("icd 10 data set with all parameters - result should be unchanged.", { @@ -33,5 +33,10 @@ df <- pc_cols = dplyr::starts_with("pc"), icdv = 10) -stopifnot(all.equal(df, readRDS("icd10_test_result.rds"))) +expected <- readRDS("icd10_test_result.rds") +stopifnot(isTRUE(all.equal(df, expected))) + +################################################################################ +# End of File # +################################################################################ diff --git a/tests/test_ccc_icd9.R b/tests/test_ccc_icd9.R index 70ce5d4..2cdf520 100644 --- a/tests/test_ccc_icd9.R +++ b/tests/test_ccc_icd9.R @@ -1,4 +1,4 @@ -#i Overview of tests for ccc(): +# Overview of tests for ccc(): # ICD 9 # X invalid input (not real ICD codes) # X check output for saved file - if it changes, I want to know @@ -15,9 +15,14 @@ # # run tests with Ctrl/Cmd + Shift + T or 
devtools::test() # for manually running, execute -# library(testthat) library(pccc) +# set useFancyQuotes = FALSE so that in both interactive and non-interactive +# modes output will be consistent. Without this the double quote will be +# unicode 34 in non-interactive and unicode 8220 in interactive. +options(useFancyQuotes = FALSE) + + # context("PCCC - ccc ICD9 function tests") # Basic checks of standard output --------------------------------------------- @@ -37,7 +42,8 @@ stopifnot(identical(ncol(df), 14L)) # None of these should result in an error ------------------------------------- # "icd 9 data set with all parameters - result should be unchanged." -stopifnot(all.equal(df, readRDS("icd9_test_result.rds"))) +expected <- readRDS("icd9_test_result.rds") +stopifnot(isTRUE(all.equal(df, expected))) # "icd 9 data set with missing id parameter" stopifnot( @@ -56,8 +62,10 @@ df <- ccc(pccc_icd9_dataset[, c(1:21)], id = id, pc_cols = dplyr::starts_with("pc"), icdv = 9) + # this test will pass in non-interactive mode, the strings are slightly # different in interactive mode. The difference is the quotation marks used.
+# SOLUTION TO THE QUOTE ISSUE: options(useFancyQuotes = FALSE) stopifnot( identical( all.equal(df, readRDS("icd9_test_result.rds")) @@ -78,28 +86,30 @@ stopifnot( ) ) + # "icd 9 data set with missing pc parameter" -stopifnot(identical( - all.equal( - ccc(pccc_icd9_dataset[, c(1:21)], - id = id, - pc_cols = dplyr::starts_with("dx"), - icdv = 9), - readRDS("icd9_test_result.rds")), - c("Component \"neuromusc\": Mean relative difference: 35", - "Component \"cvd\": Mean relative difference: 3", - "Component \"respiratory\": Mean relative difference: 2.487179", - "Component \"renal\": Mean relative difference: 8.8", - "Component \"gi\": Mean relative difference: 3.75", - "Component \"hemato_immu\": Mean relative difference: 3.772727", - "Component \"metabolic\": Mean relative difference: 6", - "Component \"congeni_genetic\": Mean absolute difference: 1", - "Component \"malignancy\": Mean absolute difference: 1", - "Component \"neonatal\": Mean absolute difference: 1", - "Component \"tech_dep\": Mean relative difference: 4.024096", - "Component \"transplant\": Mean relative difference: 2.512821", - "Component \"ccc_flag\": Mean relative difference: 9.371429") - ) +out <- + all.equal( + ccc(pccc_icd9_dataset[, c(1:21)], + id = id, + pc_cols = dplyr::starts_with("dx"), + icdv = 9), + readRDS("icd9_test_result.rds")) + +stopifnot( + grepl("^Component \"neuromusc\": Mean relative difference: 35", out[1]) + , grepl("^Component \"cvd\": Mean relative difference: 3", out[2]) + , grepl("^Component \"respiratory\": Mean relative difference: 2", out[3]) + , grepl("^Component \"renal\": Mean relative difference: 8", out[4]) + , grepl("^Component \"gi\": Mean relative difference: 3", out[5]) + , grepl("^Component \"hemato_immu\": Mean relative difference: 3", out[6]) + , grepl("^Component \"metabolic\": Mean relative difference: 6", out[7]) + , grepl("^Component \"congeni_genetic\": Mean absolute difference: 1", out[8]) + , grepl("^Component \"malignancy\": Mean absolute 
difference: 1", out[9]) + , grepl("^Component \"neonatal\": Mean absolute difference: 1", out[10]) + , grepl("^Component \"tech_dep\": Mean relative difference: 4", out[11]) + , grepl("^Component \"transplant\": Mean relative difference: 2", out[12]) + , grepl("^Component \"ccc_flag\": Mean relative difference: 9", out[13]) ) # should not be equal, and should have many differences @@ -139,8 +149,7 @@ ccc_out <- ccc(data.frame(id = letters[1:3], ccc_out$id <- as.factor(ccc_out$id) rnd_test <- readRDS("random_data_test_result.rds") rnd_test$id <- as.factor(rnd_test$id) - -stopifnot(all.equal(ccc_out, rnd_test)) +stopifnot(isTRUE(all.equal(ccc_out, rnd_test))) # Need to do some sort of performance test here - don't throw error, # but keep track of about how long this takes to run diff --git a/tests/test_get_codes.R b/tests/test_get_codes.R index dbdd5bd..3091d8e 100644 --- a/tests/test_get_codes.R +++ b/tests/test_get_codes.R @@ -35,19 +35,19 @@ expected_counts <- read.table(sep = "|", header = TRUE, strip.white = TRUE, " category | type | icdv | icd congeni_genetic | dx | 9 | 15 cvd | dx | 9 | 50 - gi | dx | 9 | 29 + gi | dx | 9 | 32 hemato_immu | dx | 9 | 34 malignancy | dx | 9 | 10 metabolic | dx | 9 | 25 neonatal | dx | 9 | 33 neuromusc | dx | 9 | 54 - renal | dx | 9 | 17 + renal | dx | 9 | 18 respiratory | dx | 9 | 17 tech_dep | dx | 9 | 43 transplant | dx | 9 | 21 - cvd | dx_fixed | 9 | 1 + cvd | dx_fixed | 9 | 2 neuromusc | dx_fixed | 9 | 2 - respiratory | dx_fixed | 9 | 1 + respiratory | dx_fixed | 9 | 2 cvd | pc | 9 | 53 gi | pc | 9 | 41 hemato_immu | pc | 9 | 12 @@ -60,7 +60,7 @@ expected_counts <- read.table(sep = "|", header = TRUE, strip.white = TRUE, transplant | pc | 9 | 30 metabolic | pc_fixed | 9 | 1 congeni_genetic | dx | 10 | 55 - cvd | dx | 10 | 97 + cvd | dx | 10 | 100 gi | dx | 10 | 53 hemato_immu | dx | 10 | 42 malignancy | dx | 10 | 31 @@ -70,7 +70,7 @@ expected_counts <- read.table(sep = "|", header = TRUE, strip.white = TRUE, renal | dx | 10 
| 30 respiratory | dx | 10 | 28 tech_dep | dx | 10 | 135 - transplant | dx | 10 | 27 + transplant | dx | 10 | 28 neuromusc | dx_fixed | 10 | 1 cvd | pc | 10 | 142 gi | pc | 10 | 139 @@ -86,7 +86,8 @@ expected_counts <- read.table(sep = "|", header = TRUE, strip.white = TRUE, stopifnot( identical( - aggregate(icd ~ category + type + icdv, data = df, FUN = length), + aggregate(icd ~ category + type + icdv, data = df, FUN = length) + , expected_counts ) ) diff --git a/vignettes/pccc-example.Rmd b/vignettes/pccc-example.Rmd index a3be902..9008fce 100644 --- a/vignettes/pccc-example.Rmd +++ b/vignettes/pccc-example.Rmd @@ -29,9 +29,14 @@ library(dplyr) # Accessing the Data -The Center for Disease Control maintains vital statistics including death certificate data. The publicly available death certificate data, known as the Multiple Cause of Death (MCD) file, contain ICD diagnostic codes specifying the diseases and conditions leading to each decedent's death. In particular, the 1996 MCD data contain both ICD-9-CM and ICD-10 codes, making it an ideal example to demonstrate how the PCCC software categorizes ICD codes. Please note that because of the way ICD-9-CM codes are mapped to ICD-10-CM codes (https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html), the calculated frequencies of CCCs may differ between corresponding ICD-9-CM and ICD-10-CM diagnosis codes for a decedent. +The Center for Disease Control maintains vital statistics including death certificate data. The publicly available death certificate data, known as the Multiple Cause of Death (MCD) file, contain ICD diagnostic codes specifying the diseases and conditions leading to each decedent's death. In particular, the 1996 MCD data contain both ICD-9-CM and ICD-10 codes, making it an ideal example to demonstrate how the PCCC software categorizes ICD codes. 
Please note that because of the way ICD-9-CM codes are mapped to ICD-10-CM codes (https://www.cms.gov/medicare/coding-billing/icd-10-codes/icd-10-cm-icd-10-pcs-gem-archive or [waybackmachine snapshot of CMS 2018 ICD-10 CM and Gems](https://web.archive.org/web/20171115133352/https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html)), the calculated frequencies of CCCs may differ between corresponding ICD-9-CM and ICD-10-CM diagnosis codes for a decedent. + +The data documentation and instructions for direct download are available at: https://ftp.cdc.gov/pub/health_statistics/NCHs/Datasets/Comparability/icd9_icd10/ICD9_ICD10_comparability_file_documentation.pdf or [waybackmachine snapshot](https://web.archive.org/web/20250419060336/https://ftp.cdc.gov/pub/health_statistics/NCHs/Datasets/Comparability/icd9_icd10/ICD9_ICD10_comparability_file_documentation.pdf) or within this package via: + +```{r, eval = FALSE} +system.file("icd", "ICD9_ICD10_comparability_file_documentation.pdf", package = "pccc") +``` -The data documentation and instructions for direct download are available at: ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/comparability/icd9_icd10/ICD9_ICD10_comparability_file_documentation.pdf # Preparing the Data diff --git a/vignettes/pccc-overview.Rmd b/vignettes/pccc-overview.Rmd index 219325a..7509789 100644 --- a/vignettes/pccc-overview.Rmd +++ b/vignettes/pccc-overview.Rmd @@ -64,7 +64,7 @@ The `ccc` function is the workhorse here. Simply put, a user will provide ICD co ## Substring matching exceptions -Some datasets may contain different degrees of specificity of ICD-9-CM codes, which can lead to issues with substring matching for certain codes. For example, consider a patient with _Congenital hereditary muscular dystrophy_. The least specific ICD-9-CM code for _Muscular dystropy_ is 359, which is a CCC code. The exact ICD-9-CM code specifying _Congenital hereditary muscular dystrophy_ is 3590. 
Even when describing the same patient, one dataset may contain the 359 code while another dataset may contain the 3590 code. If we use substring matching logic above and match on 359, we would capture the patient in both datasets. However, we would also capture non-CCC diagnoses like 3594, _Toxic myopathy_. If we use substring matching logic and match on 3590, we would only capture the patient in the dataset with more specific ICD-9-CM codes. We address this problem by exact matching for less specific codes (e.g., the code 359 will match only if the dataset contains the 3-digit code 359) and substring matching for more specific codes (e.g., code 3590 will match any code _beginning with_ 3590). This approach improves the sensitivity of detecting CCCs in datasets with less specific codes (e.g. 359) and also reduces misclassification errors in datasets with more specific codes (e.g. 3590). +Some datasets may contain different degrees of specificity of ICD-9-CM codes, which can lead to issues with substring matching for certain codes. For example, consider a patient with _Congenital hereditary muscular dystrophy_. The least specific ICD-9-CM code for _Muscular dystrophy_ is 359, which is a CCC code. The exact ICD-9-CM code specifying _Congenital hereditary muscular dystrophy_ is 3590. Even when describing the same patient, one dataset may contain the 359 code while another dataset may contain the 3590 code. If we use substring matching logic above and match on 359, we would capture the patient in both datasets. However, we would also capture non-CCC diagnoses like 3594, _Toxic myopathy_. If we use substring matching logic and match on 3590, we would only capture the patient in the dataset with more specific ICD-9-CM codes. 
We address this problem by exact matching for less specific codes (e.g., the code 359 will match only if the dataset contains the 3-digit code 359) and substring matching for more specific codes (e.g., code 3590 will match any code _beginning with_ 3590). This approach improves the sensitivity of detecting CCCs in datasets with less specific codes (e.g. 359) and also reduces misclassification errors in datasets with more specific codes (e.g. 3590). **We have listed these exact match exceptions under their corresponding CCC category in the [pccc-icd-codes](pccc-icd-codes.html) description.**