The package is registered in the General registry and so can be installed at the REPL with ] add GroupedArrays.
GroupedArray returns an AbstractArray of integer group labels, one per element of the input (or missing for elements whose value is missing).
using GroupedArrays
p = repeat(["a", "b", missing], outer = 2)
g = GroupedArray(p)
# 6-element GroupedArray{Int64, 1}:
# 1
# 2
# missing
# 1
# 2
# missing
Use the keyword argument coalesce = true to treat missing values as a distinct group:
using GroupedArrays
p = repeat(["a", "b", missing], outer = 2)
g = GroupedArray(p; coalesce = true)
# 6-element GroupedArray{Int64, 1}:
# 1
# 2
# 3
# 1
# 2
# 3
GroupedArray can be used to compute groups across multiple vectors:
p1 = repeat(["a", "b"], outer = 3)
p2 = repeat(["d", "e"], inner = 3)
g = GroupedArray(p1, p2)
# 6-element GroupedArray{Int64, 1}:
# 1
# 2
# 1
# 3
# 4
# 3
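One way to read the output above is that each distinct combination of values across the input vectors receives its own label. The following is a minimal sketch of that idea in plain Julia, not the package's implementation: it pairs the two vectors with zip and labels each pair in order of first appearance. The first-appearance numbering and the names seen and codes are assumptions made for illustration, chosen to match the example output shown here; the sketch also ignores missing values.

```julia
p1 = repeat(["a", "b"], outer = 3)
p2 = repeat(["d", "e"], inner = 3)

# Map each distinct (p1, p2) pair to an integer label,
# numbering pairs in the order they are first seen.
seen = Dict{Tuple{String, String}, Int}()
codes = [get!(seen, key, length(seen) + 1) for key in zip(p1, p2)]
# codes == [1, 2, 1, 3, 4, 3]
```

The resulting labels agree with the GroupedArray(p1, p2) output for this input.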
Internally, a GroupedArray is stored as a vector of integers, where 0 corresponds to missing.
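To make the storage scheme concrete, here is a minimal sketch of an integer encoding with 0 reserved for missing, written in plain Julia. It is an illustration only and does not reproduce the package's actual algorithm; the helper name encode_groups and the first-appearance numbering are assumptions, chosen so the results line up with the examples above.

```julia
# Hypothetical helper for illustration; not part of GroupedArrays.jl.
function encode_groups(xs; coalesce::Bool = false)
    codes = zeros(Int, length(xs))   # 0 is reserved for missing
    seen = Dict{Any, Int}()          # value => integer group code
    for (i, x) in enumerate(xs)
        if ismissing(x) && !coalesce
            continue                 # leave the code at 0
        end
        # Reuse the existing code, or assign the next unused one.
        codes[i] = get!(seen, x, length(seen) + 1)
    end
    return codes, length(seen)
end

p = repeat(["a", "b", missing], outer = 2)
codes, ngroups = encode_groups(p)                       # ([1, 2, 0, 1, 2, 0], 2)
codes_c, ngroups_c = encode_groups(p; coalesce = true)  # ([1, 2, 3, 1, 2, 3], 3)
```

Keeping plain integers with a sentinel value means the codes can be stored in an ordinary Int array rather than an array with a Union{Int, Missing} element type.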
The algorithm to construct GroupedArrays is taken from DataFrames.jl.