Releases: airbnb/omniduct
v0.9.3
This is a minor release with bugfixes to the dataframe_to_table method. In particular, DatabaseClient._dataframe_to_table was not updated to respect the new ParsedNamespaces data structure.
v0.9.2
This is a minor release that introduces several improvements to the existing library.
Features and enhancements:
- Added two new serializers for use with the new cache API: BytesSerializer and PandasSerializer
- Added a cache_namespace attribute to all Duct instances, and use it (if provided) as the default cache namespace for the cached_method constructor.
- Added support for prepare-time lookup of the cache filesystem from the registry, and checks to ensure the filesystem is connected before performing cache storage operations.
Bugfixes and cleanups:
- Fixed the default serializer in cached_method not being instantiated.
- Removed the (now unused) serialization code from CursorFormatters.
v0.9.1
This is a minor release that includes one new feature to smooth the migration from v0.8.7 to v0.9.x.
FilesystemCache instances will now verify that the nominated cache directory is initially empty, and then persist a configuration file to mark the directory as initialised. This configuration will later be used to handle resource restrictions, etc. This should allow easier migration from older versions of omniduct, where existing cache files may clash with the structure used by the new cache.
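The initialisation check can be sketched roughly as follows. This is an illustration on a local directory only, not the FilesystemCache implementation: the marker filename `.cache_config`, the `init_cache_dir` helper and the contents of the configuration are all assumptions.

```python
import json
import tempfile
from pathlib import Path

CONFIG_NAME = ".cache_config"  # hypothetical marker filename


def init_cache_dir(path):
    """Initialise `path` as a cache directory.

    A directory that already holds the marker file is reused; a non-empty
    directory without one is rejected so legacy files are never mixed in.
    """
    root = Path(path)
    root.mkdir(parents=True, exist_ok=True)
    config = root / CONFIG_NAME
    if config.exists():
        return json.loads(config.read_text())
    if any(root.iterdir()):
        raise RuntimeError(f"{root} is not empty and has no cache configuration")
    settings = {"version": 1}  # placeholder contents, assumed for illustration
    config.write_text(json.dumps(settings))
    return settings


with tempfile.TemporaryDirectory() as tmp:
    fresh = init_cache_dir(Path(tmp) / "cache")  # empty, so it is initialised
    again = init_cache_dir(Path(tmp) / "cache")  # marker found, so it is reused

    legacy = Path(tmp) / "legacy"
    legacy.mkdir()
    (legacy / "old_entry").write_text("stale")   # simulates a pre-0.9 cache file
    try:
        init_cache_dir(legacy)
        rejected = False
    except RuntimeError:
        rejected = True
```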
v0.9.0
This is a significant feature release with breaking changes to the API.
Breaking changes:
- DatabaseClient.push renamed to DatabaseClient.dataframe_to_table for increased consistency.
- The local_cache Duct has been removed, in favour of a new filesystem_cache, which is built on the new Cache API. It can run on arbitrary filesystems (local, remote, etc), and is much more flexible.
Features and enhancements:
- Database Clients:
  - Support for session properties for database clients (with implementations for Presto, Hive and PySpark backends).
  - Support for 'CREATE TABLE AS' statements via DatabaseClient.query_to_table.
  - Support for dropping tables via DatabaseClient.table_drop.
  - DatabaseClient.push renamed to DatabaseClient.dataframe_to_table for increased consistency.
- Filesystems:
  - Support for removing files and directories via FileSystemClient.remove.
  - '~' prefix is now recognised on all filesystems to refer to the "home" path.
- Cache:
  - API significantly refactored to be more flexible and intuitive, with better documentation.
  - LocalCache was replaced with FilesystemCache, which works atop any FileSystemClient backend.
- General:
  - Support for disabling omniduct logging.
Behind the scenes:
- Added some new unit tests for database backends. Much more work to be done here.
- Streamlined some of the querying logic in DatabaseClient.
- Namespace parsing for table-specific methods in DatabaseClient for better runtime validation.
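As a rough illustration of what namespace parsing with runtime validation can look like, the sketch below splits a dotted table name into its parts and rejects malformed input before any query is issued. The `TableName` tuple and the `parse_table_name` helper are hypothetical; omniduct's own parsing class may use different names and support more levels.

```python
from collections import namedtuple

# Hypothetical result type for illustration only.
TableName = namedtuple("TableName", ["schema", "table"])


def parse_table_name(name, default_schema="default"):
    """Split 'schema.table' (or a bare 'table') into validated parts."""
    parts = name.split(".")
    if any(not part for part in parts):
        raise ValueError(f"Empty component in table name: {name!r}")
    if len(parts) == 1:
        return TableName(default_schema, parts[0])
    if len(parts) == 2:
        return TableName(*parts)
    raise ValueError(f"Too many components in table name: {name!r}")


bare = parse_table_name("events")
qualified = parse_table_name("analytics.events")
```

Validating the name up front means a typo such as `analytics..events` fails with a clear error instead of reaching the database.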
v0.8.7
This is a minor release with one bugfix:
Stop iteration after exhausting contents of a file when proxied via FileSystemFile.
v0.8.6
This is a minor release with one bugfix:
Fix typo that made indeterminate progress bars the default.
v0.8.5
This is a minor point release with one improvement:
Allow use of dependencies which are not actually installed but simply available on the Python path.
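One way to detect a module that is importable without being registered as an installed distribution is to ask the import system directly. The sketch below uses the standard library's `importlib.util.find_spec`; the `is_importable` helper is hypothetical and is not necessarily how omniduct performs this check.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the module can be found on the Python path,
    regardless of whether a package manager installed it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names whose parent package is missing, or for
        # modules that are already loaded but have no usable spec.
        return False


has_json = is_importable("json")
has_missing = is_importable("surely_not_a_real_module_xyz")
```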
v0.8.4
This release exposes the new PySpark client by default through the Duct registry.
v0.8.3
This release adds a database backend atop PySpark.
v0.8.2
This release changes the behaviour of AWS logins. In particular, opinel is only used if explicitly requested. This makes the behaviour of omniduct more predictable compared to direct use of boto3, etc.