feat(datasets) Add DistributionPartitioner to Flower Datasets #3791

Merged on Jul 24, 2024 (78 commits)
Commits
1916a5c
Initial commit
chongshenng Jul 12, 2024
89acaf8
Add partition_id_to_samples
chongshenng Jul 12, 2024
5051829
Push working commit
chongshenng Jul 12, 2024
350d9b4
Refactor
chongshenng Jul 12, 2024
f7c705f
Refactor
chongshenng Jul 12, 2024
9d2785b
Add data structure for storing indices
chongshenng Jul 12, 2024
4e7fa4a
Refactor and rename variables
chongshenng Jul 12, 2024
55ae55b
Rename tracker_dict to index_tracker
chongshenng Jul 12, 2024
cb99ebe
Update comments
chongshenng Jul 12, 2024
69fbec2
Refactor checks for input distribution array
chongshenng Jul 12, 2024
4243830
Update class docstring
chongshenng Jul 12, 2024
994f319
Disable too many arguments warning
chongshenng Jul 12, 2024
7d8d952
Update docstring example
chongshenng Jul 12, 2024
3c29514
Add reshape to docstring
chongshenng Jul 12, 2024
e227c93
Sort imports
chongshenng Jul 12, 2024
20931d9
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 12, 2024
9bdf17b
Add warning for num_partitions not divisble by num_unique_labels
chongshenng Jul 12, 2024
a2dd302
Add check for distribution array sum
chongshenng Jul 12, 2024
cd2b3cf
Fix missing sum
chongshenng Jul 12, 2024
42bb71a
Sort imports
chongshenng Jul 12, 2024
adf4ebc
Fix docstrings with docformatter
chongshenng Jul 15, 2024
e71e69e
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 15, 2024
4a86966
Fix docstring
chongshenng Jul 15, 2024
1277f8e
Fix docstring
chongshenng Jul 15, 2024
046ed33
Add newline
chongshenng Jul 15, 2024
b20c4db
Refactor ndarray typing
chongshenng Jul 15, 2024
d5011df
Fix pylint
chongshenng Jul 15, 2024
ed90e97
Update docstring
chongshenng Jul 15, 2024
b1450bb
Update top level docstring
chongshenng Jul 15, 2024
226b676
Use common typing
chongshenng Jul 15, 2024
2abeb21
Add distribution partitioner test
chongshenng Jul 15, 2024
a989b17
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 15, 2024
0618de9
Lint
chongshenng Jul 15, 2024
b7a8c12
Change | to Union
chongshenng Jul 15, 2024
444bca6
Lint test
chongshenng Jul 15, 2024
fe77fe2
Disable no-self-use (R0201)
chongshenng Jul 15, 2024
267ecda
Refactor typing to avoid depending on flwr
chongshenng Jul 15, 2024
bbbd817
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 15, 2024
2b1fc7d
Update documentation for DistributionPartitioner
chongshenng Jul 15, 2024
6235372
Add expression for distribution partitioner
chongshenng Jul 17, 2024
dd45c8e
Add test to check total preassigned samples
chongshenng Jul 17, 2024
b442e6f
Add test to check total preassigned samples
chongshenng Jul 17, 2024
a31d6da
Add test for number of unique labels per partition
chongshenng Jul 17, 2024
e0165bf
Add specific checks to distribution array shape
chongshenng Jul 17, 2024
06ae94d
Update num partitions explanation
chongshenng Jul 17, 2024
2501830
Add numpy import to docstring
chongshenng Jul 17, 2024
735ea04
Add exhaustive sampling when rescaling
chongshenng Jul 17, 2024
e89bf01
Update docstring for exhaustive rescaling
chongshenng Jul 17, 2024
59ddae3
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 17, 2024
044e19a
Fix typing
chongshenng Jul 17, 2024
de66ebb
Fix bad type
chongshenng Jul 17, 2024
e0e4866
Add rescale condition for exhaustive sampling
chongshenng Jul 17, 2024
e71a18b
Lint
chongshenng Jul 17, 2024
8c0c64e
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 17, 2024
c9708c3
Address preassigned value docstring
chongshenng Jul 18, 2024
9e11025
Align variable names to parameter names
chongshenng Jul 18, 2024
e177c7a
Explain 2nd dimension
chongshenng Jul 18, 2024
1dd6e29
Fix typo for alpha*, explain partition_id sorting
chongshenng Jul 18, 2024
a744681
Add alternate distribution description
chongshenng Jul 18, 2024
dff0d15
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 18, 2024
4a726bc
Clarify index sorting
chongshenng Jul 19, 2024
afa3730
Lint
chongshenng Jul 19, 2024
9744bfe
Mid change
chongshenng Jul 22, 2024
2d14200
Change pytest to unittest implementation
chongshenng Jul 22, 2024
496e96e
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 22, 2024
0a90076
Use parameterized_class
chongshenng Jul 22, 2024
d9a21d6
Merge branch 'main' into fds-add-distribution-partitioner
chongshenng Jul 22, 2024
791aa6a
Disable mypy attr-defined error code
chongshenng Jul 22, 2024
128d818
Update datasets/flwr_datasets/partitioner/distribution_partitioner_te…
jafermarq Jul 22, 2024
8cd2aaf
Merge branch 'main' into fds-add-distribution-partitioner
jafermarq Jul 23, 2024
43b26ca
Apply suggestions from code review
jafermarq Jul 23, 2024
145dfe6
format
jafermarq Jul 23, 2024
5ee5a13
Address comments
chongshenng Jul 23, 2024
b9b308c
sort
jafermarq Jul 24, 2024
99ce809
Merge branch 'main' into fds-add-distribution-partitioner
jafermarq Jul 24, 2024
7f96adf
Merge branch 'main' into fds-add-distribution-partitioner
jafermarq Jul 24, 2024
7afb402
Merge branch 'main' into fds-add-distribution-partitioner
jafermarq Jul 24, 2024
c535ca7
Merge branch 'main' into fds-add-distribution-partitioner
jafermarq Jul 24, 2024
1 change: 1 addition & 0 deletions datasets/README.md
@@ -41,6 +41,7 @@ Create **custom partitioning schemes** or choose from the **implemented [partiti
* Partitioner (the abstract base class) `Partitioner`
* IID partitioning `IidPartitioner(num_partitions)`
* Dirichlet partitioning `DirichletPartitioner(num_partitions, partition_by, alpha)`
* Distribution partitioning `DistributionPartitioner(distribution_array, num_partitions, num_unique_labels_per_partition, partition_by, preassigned_num_samples_per_label, rescale)`
* InnerDirichlet partitioning `InnerDirichletPartitioner(partition_sizes, partition_by, alpha)`
* Pathological partitioning `PathologicalPartitioner(num_partitions, partition_by, num_classes_per_partition, class_assignment_mode)`
* Natural ID partitioning `NaturalIdPartitioner(partition_by)`
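
The new README entry lists six constructor parameters for `DistributionPartitioner`. The sketch below only illustrates how those arguments might fit together; it does not import `flwr_datasets`. The helper names (`expected_distribution_shape`, `make_distribution_rows`), the example values, the `"label"` column name, and in particular the assumed array shape `(num_unique_labels, num_partitions * num_unique_labels_per_partition / num_unique_labels)` are assumptions not stated in this diff, so check them against the class docstring before relying on them.

```python
import random


def expected_distribution_shape(
    num_partitions, num_unique_labels, num_unique_labels_per_partition
):
    """Hypothetical helper: the shape ASSUMED here for `distribution_array`."""
    slots = num_partitions * num_unique_labels_per_partition
    if slots % num_unique_labels != 0:
        # Assumption: the label slots must spread evenly over all labels.
        raise ValueError(
            "num_partitions * num_unique_labels_per_partition must be "
            "divisible by num_unique_labels"
        )
    return num_unique_labels, slots // num_unique_labels


def make_distribution_rows(shape, seed=0):
    """Hypothetical helper: positive weights drawn from a log-normal."""
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.lognormvariate(4.0, 2.0) for _ in range(cols)] for _ in range(rows)]


# Illustrative values only.
shape = expected_distribution_shape(
    num_partitions=20, num_unique_labels=10, num_unique_labels_per_partition=2
)
distribution_rows = make_distribution_rows(shape)

# Keyword names mirror the signature listed in the README entry above.
partitioner_kwargs = {
    # Assumption: the real class likely wants a NumPy array of this data.
    "distribution_array": distribution_rows,
    "num_partitions": 20,
    "num_unique_labels_per_partition": 2,
    "partition_by": "label",  # assumed label column name
    "preassigned_num_samples_per_label": 5,
    "rescale": True,
}
```

With `flwr_datasets` installed, these keyword arguments would presumably be passed as `DistributionPartitioner(**partitioner_kwargs)` after converting the nested list to an array.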
1 change: 1 addition & 0 deletions datasets/doc/source/index.rst
@@ -93,6 +93,7 @@ Here are a few of the ``Partitioner`` s that are available: (for a full list see
* Partitioner (the abstract base class) ``Partitioner``
* IID partitioning ``IidPartitioner(num_partitions)``
* Dirichlet partitioning ``DirichletPartitioner(num_partitions, partition_by, alpha)``
* Distribution partitioning ``DistributionPartitioner(distribution_array, num_partitions, num_unique_labels_per_partition, partition_by, preassigned_num_samples_per_label, rescale)``
* InnerDirichlet partitioning ``InnerDirichletPartitioner(partition_sizes, partition_by, alpha)``
* PathologicalPartitioner ``PathologicalPartitioner(num_partitions, partition_by, num_classes_per_partition, class_assignment_mode)``
* Natural ID partitioner ``NaturalIdPartitioner(partition_by)``
2 changes: 2 additions & 0 deletions datasets/flwr_datasets/partitioner/__init__.py
@@ -16,6 +16,7 @@


from .dirichlet_partitioner import DirichletPartitioner
from .distribution_partitioner import DistributionPartitioner
from .exponential_partitioner import ExponentialPartitioner
from .iid_partitioner import IidPartitioner
from .inner_dirichlet_partitioner import InnerDirichletPartitioner
@@ -29,6 +30,7 @@

__all__ = [
"DirichletPartitioner",
"DistributionPartitioner",
"ExponentialPartitioner",
"IidPartitioner",
"InnerDirichletPartitioner",