Applies all relevant verification flags to an extracted data item. This includes:
Completeness: contributed data items match local capability (i.e. missingness only occurs when the data do not exist).
Uniqueness plausibility: descriptions of singular events/objects are not duplicated.
Atemporal plausibility: Events occur within their episode (within a reasonable buffer time). Events fall within an accepted range, follow the expected distribution and agree with internal/local knowledge. Repeated measures of the same event show the expected variability.
Temporal plausibility: value density over time is consistent with local expectations.
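To make the checks above more concrete, here is a minimal base R sketch of how three of them (range, episode boundary with a buffer, and duplication) could be expressed as 0/1 flag columns. It is an illustration under stated assumptions, not the inspectEHR implementation: `flag_events_sketch()`, its arguments, the 24-hour buffer, the input column names and the 0 = pass / 1 = fail coding are all hypothetical. The flag column names are borrowed from the example output of `verify_events()`, but their meaning here is assumed.

```r
# Hypothetical sketch: NOT the inspectEHR implementation.
# Flags are 0 (pass) or 1 (fail); all thresholds are illustrative assumptions.
flag_events_sketch <- function(events, episodes,
                               lower, upper, buffer_hours = 24) {
  # Look up each event's episode start and end times.
  idx <- match(events$episode_id, episodes$episode_id)
  start <- episodes$start[idx]
  end <- episodes$end[idx]
  buffer <- as.difftime(buffer_hours, units = "hours")

  # Range check: value lies outside the accepted [lower, upper] interval.
  events$range_error <- as.integer(events$value < lower | events$value > upper)

  # Boundary check: event falls outside its episode, allowing for the buffer.
  events$out_of_bounds <- as.integer(
    events$datetime < start - buffer | events$datetime > end + buffer
  )

  # Uniqueness check: same episode, time and value already seen in the table.
  events$duplicate <- as.integer(
    duplicated(events[, c("episode_id", "datetime", "value")])
  )
  events
}

# Toy data: one episode and four events (a repeat, an implausible value,
# and an event recorded long after the episode ended).
episodes <- data.frame(
  episode_id = 1L,
  start = as.POSIXct("2016-02-01 00:00:00", tz = "UTC"),
  end = as.POSIXct("2016-02-03 00:00:00", tz = "UTC")
)
events <- data.frame(
  episode_id = 1L,
  datetime = as.POSIXct(
    c("2016-02-01 10:00:00", "2016-02-01 10:00:00",
      "2016-02-02 10:00:00", "2016-02-10 10:00:00"),
    tz = "UTC"
  ),
  value = c(69, 69, 400, 80)
)
flagged <- flag_events_sketch(events, episodes, lower = 0, upper = 300)
print(flagged)
```

In this toy data the second row is flagged as a duplicate, the third as a range error and the fourth as out of bounds. Distribution and temporal density checks are left out because they depend on local expectations that a short sketch cannot supply.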
verify_events(x, los_table = NULL)
x | extracted data item from |
---|---|
los_table | episode length table from |
A tibble with verification flags applied.
Other verification components are found elsewhere, as they don't necessarily fit into an evaluation at the data item level. I am contemplating how to unify this procedure.
    ## DB Connection
    db_pth <- system.file("testdata/synthetic_db.sqlite3", package = "inspectEHR")
    ctn <- connect(sqlite_file = db_pth)

    ## Pre-requisites
    core <- make_core(ctn)
    episode_length <- characterise_episodes(ctn)
    ve <- verify_episodes(episode_length)

    ## Data item extraction
    hr <- extract(core, input = "NIHR_HIC_ICU_0108")

    ## Full verification
    vhr <- verify_events(hr, ve)
    head(vhr)
    #> # A tibble: 6 x 11
    #>   episode_id periodicity event_id site  code_name datetime value range_error
    #>        <int>       <dbl>    <int> <chr> <chr>     <chr>    <int>       <int>
    #> 1      13626        20.0  2412851 B     NIHR_HIC… 2016-02…    69           0
    #> 2      13626        20.0  2412852 B     NIHR_HIC… 2016-02…    93           0
    #> 3      13626        20.0  2412853 B     NIHR_HIC… 2016-02…    86           0
    #> 4      13626        20.0  2412854 B     NIHR_HIC… 2016-02…    93           0
    #> 5      13626        20.0  2412855 B     NIHR_HIC… 2016-02…    63           0
    #> 6      13626        20.0  2412856 B     NIHR_HIC… 2016-02…   102           0
    #> # … with 3 more variables: out_of_bounds <int>, duplicate <int>, var_per <int>