v4.0.0
API changes, new features:
- Dataset-as-folder: a dataset can now be a self-contained module in a folder with checksums, dummy data, etc. This simplifies implementing datasets outside the TFDS repository.
- `tfds.load` can now load a dataset without using the generation class, so `tfds.load('my_dataset:1.0.0')` can work even if `MyDataset.VERSION == '2.0.0'` (see #2493)
- Add a new TFDS CLI (see https://www.tensorflow.org/datasets/cli for details)
- `tfds.testing.mock_data` does not require metadata files anymore!
- Add `tfds.as_dataframe(ds, ds_info)` with custom visualisation (example)
- Add `tfds.even_splits` to generate subsplits (e.g. `tfds.even_splits('train', n=3) == ['train[0%:33%]', 'train[33%:67%]', ...]`)
- Add new `DatasetBuilder.RELEASE_NOTES` property
- `tfds.features.Image` now supports PNG with 4 channels
- `tfds.ImageFolder` now supports custom shape, dtype
- Downloaded URLs are available through `MyDataset.url_infos`
- Add `skip_prefetch` option to `tfds.ReadConfig`
- `as_supervised=True` support for `tfds.show_examples`, `tfds.as_dataframe`
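As a rough illustration of what the `tfds.even_splits` example above produces, the following pure-Python sketch builds the same percent-slice strings. The helper name `even_split_specs` is hypothetical, and the rounding rule is an assumption chosen to match the `n=3` example; it is not the TFDS implementation.

```python
from typing import List


def even_split_specs(split: str, n: int) -> List[str]:
    """Return `n` percent-slice spec strings covering `split` (hypothetical helper)."""
    if n < 1 or n > 100:
        raise ValueError("n must be between 1 and 100")
    # Assumed rule: round each boundary to the nearest whole percent.
    bounds = [round(i * 100 / n) for i in range(n + 1)]
    # Pair consecutive boundaries into [lo%:hi%] slices.
    return [f"{split}[{lo}%:{hi}%]" for lo, hi in zip(bounds, bounds[1:])]


specs = even_split_specs("train", n=3)
print(specs)  # ['train[0%:33%]', 'train[33%:67%]', 'train[67%:100%]']
```

Each resulting string uses the split-slicing syntax shown in the example above, so it should be usable wherever a split name is accepted, for instance as the `split` argument of `tfds.load`.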
Breaking compatibility changes:
- `tfds.as_numpy()` now returns an iterable which can be iterated multiple times. To migrate: `next(ds)` -> `next(iter(ds))`
- Rename `tfds.features.text.Xyz` -> `tfds.deprecated.text.Xyz`
- Remove `DatasetBuilder.IN_DEVELOPMENT` property
- Remove `tfds.core.disallow_positional_args` (should use Py3 `*,` instead)
- `tfds.features` can now be saved/loaded; you may have to overwrite `FeatureConnector.from_json_content` and `FeatureConnector.to_json_content` to support this feature.
- Stop testing against TF 1.15. Requires Python 3.6.8+.
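The `tfds.as_numpy()` change above comes down to the usual Python distinction between an iterable and an iterator. The sketch below uses a hypothetical stand-in class, `ReiterableExamples`, rather than TFDS itself, to show why `next(ds)` stops working on such an object and why `next(iter(ds))` is the migration.

```python
class ReiterableExamples:
    """Hypothetical stand-in for an object that is iterable but not an iterator."""

    def __init__(self, examples):
        self._examples = list(examples)

    def __iter__(self):
        # A fresh iterator on every call, so the data can be traversed repeatedly.
        return iter(self._examples)


ds = ReiterableExamples([{"label": 0}, {"label": 1}])

try:
    next(ds)  # Old pattern: fails because `ds` has no __next__ method.
    old_pattern_works = True
except TypeError:
    old_pattern_works = False

first = next(iter(ds))  # Migrated pattern: build an iterator explicitly.
first_pass = list(ds)
second_pass = list(ds)  # Iterating a second time yields the same examples.
print(old_pattern_works, first, first_pass == second_pass)
```

Code that only loops with `for ex in ds:` needs no change under this model, since a `for` loop calls `iter()` implicitly.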
Other bug fixes:
- Better archive extension detection for `dl_manager.download_and_extract`
- Fix `tfds.__version__` in TFDS nightly to be PEP440 compliant
- Fix crash when GCS not available
- Script to detect dead-urls
- Improved open-source workflow, contributor guide, documentation
- Many other internal cleanups, bugs, dead code removal, py2->py3 cleanup, pytype annotations,...
And of course, new datasets and dataset updates.
A gigantic thanks to our community, which has helped us debug issues and implement many features, and especially to vijayphoenix@ for being a major contributor.