How to treat nested arrays? #48
Comments
A similar-ish case that may be worth considering here is arrays that have been improperly serialized to an object when there is only one element, i.e. JSON like:
x <- '[{"id": 1, "list":[1,2,3]}, {"id": 2, "list": 4}]'
x %>% gather_array() %>%
  spread_values(id=jnumber('id')) %>%
  enter_object('list') %>%
  json_types()
While technically not valid, it may still be nice to have a way to work with it. The workaround here is the same: filtering on type. I also posted the workaround in an actual question someone had here
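The filtering workaround mentioned above might look roughly like the sketch below. It assumes a tidyjson version in which dplyr verbs such as filter() keep the JSON attribute, and the column names `list.index` and `value` are made up for illustration; they are not from the original comment.

```r
library(tidyjson)
library(dplyr)

x <- '[{"id": 1, "list":[1,2,3]}, {"id": 2, "list": 4}]'

# Enter the "list" key and record the JSON type found there
typed <- x %>%
  gather_array() %>%
  spread_values(id = jnumber("id")) %>%
  enter_object("list") %>%
  json_types()

# Rows where "list" really is an array: gather it as usual
# ("list.index" is a hypothetical column name, chosen to avoid
# clashing with the existing array.index column)
arrays <- typed %>%
  filter(type == "array") %>%
  gather_array("list.index") %>%
  append_values_number("value")

# Rows where "list" was collapsed to a bare number: read it directly
scalars <- typed %>%
  filter(type == "number") %>%
  append_values_number("value")
```

The two pieces are handled separately because gather_array() refuses non-array rows, which is exactly the type check discussed in this thread.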
Honestly, it seems all that is really needed here is a way to bypass the type-checking. The function itself already handles these cases fairly nicely when the type-check is removed. Not sure whether the better behavior is a parameter in the function or an environment variable. By commenting out the type-checking lines in the function:
x <- "[{\"id\": 1, \"list\":[1,2,3]}, {\"id\": 2, \"list\": 4}]"
x %>% gather_array() %>% enter_object("list") %>% json_types() %>%
gather_array("array.index2") %>%
json_types("type2")
#> # A tbl_json: 4 x 5 tibble with a "JSON" attribute
#>   `attr(., "JSON")` document.id array.index   type array.index2  type2
#>               <chr>       <int>       <int> <fctr>        <int> <fctr>
#> 1                 1           1           1  array            1 number
#> 2                 2           1           1  array            2 number
#> 3                 3           1           1  array            3 number
#> 4                 4           1           2 number            1 number
x <- "[[1, 2], 1]" %>% gather_array %>% json_types
x %>% gather_array("array.index2") %>% json_types("type2")
#> # A tbl_json: 3 x 5 tibble with a "JSON" attribute
#>   `attr(., "JSON")` document.id array.index   type array.index2  type2
#>               <chr>       <int>       <int> <fctr>        <int> <fctr>
#> 1                 1           1           1  array            1 number
#> 2                 2           1           1  array            2 number
#> 3                 1           1           2 number            1 number
Although perhaps it would be preferable for the
The change above is very problematic for objects, for which keys are silently thrown away, so a better proposal is required... maybe a way to not touch
'{"a":"one","b":"two","c":"three"}' %>%
gather_array() %>%
append_values_string()
## A tbl_json: 3 x 3 tibble with a "JSON" attribute
#  `attr(., "JSON")` document.id array.index string
#              <chr>       <int>       <int>  <chr>
#1         "\"one\""           1           1    one
#2         "\"two\""           1           2    two
#3       "\"three\""           1           3  three
Nested arrays are difficult to work with. For example,
At this point, there is no way to gather the next array unless we filter on type == 'array'. append_values_number works, but returns NA for the array, and recursive = TRUE doesn't work through the second level array. Further, it could be that the types are mixed.
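For a mixed top-level array such as the "[[1, 2], 1]" case discussed in the comments, filtering on type before the second gather could be sketched as follows. This is only an illustration: it assumes dplyr's filter() preserves the JSON attribute of a tbl_json, and the column names `array.index2` and `type2` are simply the ones used elsewhere in this thread.

```r
library(tidyjson)
library(dplyr)

# Top level mixes an array and a bare number
nested <- "[[1, 2], 1]" %>%
  gather_array() %>%
  json_types()

# Keep only the elements that are arrays, then gather the second level
inner <- nested %>%
  filter(type == "array") %>%
  gather_array("array.index2") %>%
  json_types("type2")

# The bare number at the top level has to be picked up separately
top_level_numbers <- nested %>%
  filter(type == "number") %>%
  append_values_number("value")
```

The cost of this approach is that the scalar and array branches end up in separate tables, which is the inconvenience the issue describes.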