Should we use a SparseFixedBitSet when deletes are sparse? #13084

@jpountz

Description

@uschindler asked this question in https://lists.apache.org/thread/6o3hn3x8syfm8lj93kk5rrxb0kx701gp.

In this discussion, we were looking into adding the ability to iterate over deleted docs, so that some facets could be computed (cheaply!) across the entire doc ID space and the counts then fixed by iterating over deleted docs and decrementing the counts of the buckets they belong to. Using a SparseFixedBitSet in the sparse case would give us an efficient iterator over deletes in every case, rather than always requiring O(maxDoc) work, which is what FixedBitSet needs in order to iterate over all clear bits.
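To make the idea concrete, here is a minimal sketch of the count-then-correct approach. It is not Lucene code: `java.util.BitSet` stands in for both `FixedBitSet` (live docs, where deletes are clear bits) and a deletes-as-set-bits structure such as `SparseFixedBitSet`, and the class, method, and parameter names (`DeletedDocsFacetSketch`, `bucketOfDoc`, and so on) are hypothetical.

```java
import java.util.BitSet;

// Hypothetical illustration only; java.util.BitSet stands in for Lucene's bit sets.
final class DeletedDocsFacetSketch {

    /** Counts every doc ID per bucket, ignoring deletions entirely. */
    static int[] countAll(int[] bucketOfDoc, int numBuckets) {
        int[] counts = new int[numBuckets];
        for (int bucket : bucketOfDoc) {
            counts[bucket]++;
        }
        return counts;
    }

    /**
     * Finds deletes by scanning the clear bits of a live-docs bit set.
     * The scan covers the whole [0, maxDoc) range even if few docs are deleted.
     */
    static BitSet deletedFromLiveDocs(BitSet liveDocs, int maxDoc) {
        BitSet deleted = new BitSet(maxDoc);
        for (int doc = liveDocs.nextClearBit(0); doc < maxDoc; doc = liveDocs.nextClearBit(doc + 1)) {
            deleted.set(doc);
        }
        return deleted;
    }

    /** Fixes the counts by visiting only deleted docs, stored here as set bits. */
    static void subtractDeleted(int[] counts, int[] bucketOfDoc, BitSet deletedDocs) {
        for (int doc = deletedDocs.nextSetBit(0); doc >= 0; doc = deletedDocs.nextSetBit(doc + 1)) {
            counts[bucketOfDoc[doc]]--;
        }
    }

    public static void main(String[] args) {
        int[] bucketOfDoc = {0, 1, 1, 2, 0, 1}; // bucket of each doc ID
        BitSet liveDocs = new BitSet(6);
        liveDocs.set(0, 6);
        liveDocs.clear(2); // doc 2 is deleted
        liveDocs.clear(4); // doc 4 is deleted

        int[] counts = countAll(bucketOfDoc, 3);
        subtractDeleted(counts, bucketOfDoc, deletedFromLiveDocs(liveDocs, 6));
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

In this sketch the correction loop in `subtractDeleted` only touches deleted docs, whereas `deletedFromLiveDocs` has to walk the whole doc ID space. Storing deletes as set bits in a sparse structure is what would let the first loop run without paying for the second.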

If having sequential access on deletes wasn't a requirement, a set-based approach would work too.
