v0.9.0

@matthewwardrop released this 22 Aug 05:30

This is a significant feature release with breaking changes to the API.

Breaking changes:

  • DatabaseClient.push renamed to DatabaseClient.dataframe_to_table for increased consistency.
  • The local_cache Duct has been removed in favour of a new filesystem_cache, which is built on the new Cache API. It can run on arbitrary filesystems (local, remote, etc.) and is much more flexible.
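
For code that still calls the old method name, one possible stopgap is a thin wrapper that forwards push to dataframe_to_table and emits a deprecation warning. The sketch below is not part of omniduct: LegacyPushShim and FakeClient are hypothetical names used only for illustration, and the (df, table) argument order on the stand-in client is an assumption, so check the real DatabaseClient.dataframe_to_table signature before relying on it.

```python
import warnings


class LegacyPushShim:
    """Hypothetical wrapper (not omniduct API) that keeps `push` working."""

    def __init__(self, client):
        # `client` is any object exposing a `dataframe_to_table` method.
        self._client = client

    def push(self, *args, **kwargs):
        # Warn the caller, then forward the arguments unchanged to the new name.
        warnings.warn(
            "push is deprecated; call dataframe_to_table instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._client.dataframe_to_table(*args, **kwargs)

    def __getattr__(self, name):
        # Delegate every other attribute to the wrapped client.
        return getattr(self._client, name)


class FakeClient:
    """Stand-in for a database client; it only records what it receives."""

    def __init__(self):
        self.calls = []

    def dataframe_to_table(self, df, table, **kwargs):
        # Assumed argument order, for illustration only.
        self.calls.append((df, table, kwargs))
        return table


shim = LegacyPushShim(FakeClient())
shim.dataframe_to_table([1, 2, 3], "my_schema.my_table")  # new name, passes through
```

A wrapper like this is best treated as temporary: updating call sites to the new name directly avoids the extra layer.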

Features and enhancements:

  • Database Clients:
    • Support for session properties for database clients (with implementations for Presto, Hive and PySpark backends).
    • Support for 'CREATE TABLE AS' statements via DatabaseClient.query_to_table.
    • Support for dropping tables via DatabaseClient.table_drop.
  • Filesystems:
    • Support for removing files and directories via FileSystemClient.remove.
    • '~' prefix is now recognised on all filesystems to refer to the "home" path.
  • Cache:
    • API significantly refactored to be more flexible and intuitive, with better documentation.
    • LocalCache was replaced with FilesystemCache, which works atop any FileSystemClient backend.
  • General:
    • Support for disabling omniduct logging.
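
To illustrate the '~' prefix handling mentioned under Filesystems, here is a minimal sketch of how a leading '~' could be resolved against a filesystem's home path. The helper expand_home and its home argument are hypothetical and are not taken from omniduct; the sketch also assumes POSIX-style paths and only treats a bare '~' or a '~/' prefix as special.

```python
import posixpath


def expand_home(path, home):
    """Replace a leading '~' in `path` with `home` (hypothetical helper)."""
    if path == "~":
        return home
    if path.startswith("~/"):
        # Drop the '~/' prefix (and any extra slashes) before joining,
        # so the remainder is always treated as relative to `home`.
        return posixpath.join(home, path[2:].lstrip("/"))
    # Anything else, including '~user/...' forms, is returned unchanged.
    return path


local_path = expand_home("~/data/events.csv", "/home/alice")
remote_path = expand_home("~/data/events.csv", "/user/alice")
```

The same relative path resolves to a different absolute location depending on which filesystem's home is supplied, which is the convenience the release note describes.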

Behind the scenes:

  • Added some new unit tests for database backends. Much more work remains to be done here.
  • Streamlined some of the querying logic in DatabaseClient.
  • Added namespace parsing for table-specific methods in DatabaseClient for better runtime validation.
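
As a rough picture of what namespace parsing for table names can look like, the sketch below splits a dotted name into catalog, schema and table parts and rejects malformed input early. It is an assumption-laden illustration rather than omniduct's actual implementation: ParsedTable, parse_table_name and the default_* parameters are hypothetical, and quoted identifiers containing dots are not handled.

```python
from collections import namedtuple

# Hypothetical container for the parsed parts of a table name.
ParsedTable = namedtuple("ParsedTable", ["catalog", "schema", "table"])


def parse_table_name(name, default_schema=None, default_catalog=None):
    """Split 'catalog.schema.table' style names (hypothetical helper)."""
    parts = name.split(".")
    # Accept one to three non-empty, dot-separated components.
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError("Invalid table name: {!r}".format(name))
    # Left-pad with None so that the last component is always the table.
    catalog, schema, table = [None] * (3 - len(parts)) + parts
    return ParsedTable(catalog or default_catalog, schema or default_schema, table)


full = parse_table_name("hive.default.events")
short = parse_table_name("events", default_schema="default")
```

Validating the name before any statement is issued means a typo such as 'a..b' fails with a clear error on the client side instead of a backend-specific one.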