shardpack

ShardPack is a storage format for large-scale datasets designed to solve common data pipeline bottlenecks by reducing the amount of required I/O operations. It provides random access reads, efficient partial data loading, and comprehensive metadata handling through a fast implementation that integrates with Python.

Key features:

[✅] Direct data access without archive parsing
[✅] Random-access and partial reads
[✅] Self-describing dataset organization
[✅] Fast compression support
[🛠️] FHE support
[🛠️] Distributed training support

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
docs		docs
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

shardpack

Key features:

About

Releases

Packages

Languages

jsam/shardpack

Folders and files

Latest commit

History

Repository files navigation

shardpack

Key features:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages