
Allow ep.pp.encode to produce sparse output #650

Description

@Zethson

Description of feature

I looked a bit into sparse encoding, with one-hot encoding being the most important case:

  1. scikit-learn's OneHotEncoder supports a sparse_output parameter; when it is True, the encoder returns a SciPy sparse matrix in CSR format.
  2. original_values arrives as a NumPy array when the function is called, which may or may not be fine.
  3. Currently we default sparse_output to False without checking whether the existing matrix is sparse.
  4. _update_encoded_data does not take sparse matrices into account.
