Releases · neuml/txtai

03 Feb 11:55

davidmezzetti

v4.1.0

0770615

This release adds the following new features, improvements and bug fixes.

New Features

Add entity extraction pipeline (#203)
Add workflow scheduling (#206)
Add workflow search task to API (#210)
Add Console Task (#215)
Add Export Task (#216)
Add notebook for workflow scheduling (#218)

Improvements

Default documentation theme using system preference (#197)
Improve multi-user experience for workflow application (#198)
Documentation improvements (#200)
Add social preview image for documentation (#201)
Add links to txtai in all example notebooks (#202)
Add limit parameter to API search method (#208)
Add documentation on local API instances (#209)
Add shorthand syntax for creating workflow tasks in API (#211)
Accept functions as workflow task actions in API (#213)

Bug Fixes

Object detection model fails to load additional models (#204)
Update unit tests to limit CPU usage for word vector tests (#207)
Add better error handling around unindexed embedding instances (#212)
Fix issue when workflow task generates no output (#214)
Add lock to API search methods (#217)
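
The locking fixes above (#217, and the `limit` parameter from #208) can be pictured with a small sketch. This is only an illustration of the general technique, assuming a shared in-memory index guarded by one lock. The `LockedIndex` class and its `upsert` and `search` methods are hypothetical names, not txtai's API or implementation.

```python
import threading


class LockedIndex:
    """Toy in-memory index whose reads and writes share one lock (hypothetical)."""

    def __init__(self):
        self._lock = threading.RLock()
        self._docs = {}

    def upsert(self, uid, text):
        # Writers hold the lock so readers never observe a partial update.
        with self._lock:
            self._docs[uid] = text

    def search(self, term, limit=10):
        # Readers take the same lock; `limit` caps the number of ids returned.
        with self._lock:
            hits = [uid for uid, text in self._docs.items() if term in text]
        return hits[:limit]


index = LockedIndex()

# Write from several threads at once, then search after they all finish.
writers = [
    threading.Thread(target=index.upsert, args=(i, f"document {i}"))
    for i in range(20)
]
for thread in writers:
    thread.start()
for thread in writers:
    thread.join()

results = index.search("document", limit=5)
```

A single lock shared by readers and writers is the simplest option. It serializes searches, which may or may not match the trade-off chosen in the actual fix.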


11 Jan 12:23

davidmezzetti

v4.0.0

841786b

🎈🎉🥳 We're excited to announce the release of txtai 4.0! 🥳🎉🎈

Thank you to the growing txtai community. This couldn't be done without you. Please remember to ⭐ txtai if it has been helpful.

txtai 4.0 is a major release with a significant number of new features. This release adds content storage, querying with SQL, object storage, reindexing, index compression, external vectors and more!

To quantify the changes: the code base grew by 50% and 36 issues were resolved, making this by far the biggest release of txtai. These changes were designed to be fully backward compatible, but keep in mind that this is a new major release.

"What's new in txtai 4.0" covers all the changes with detailed examples. The documentation site has also been refreshed.

New Features

Store text content (#168)
Add option to index dictionaries of content (#169)
Add SQL support for generating combined embeddings + database queries (#170)
Add reindex method to embeddings (#171)
Add index archive support (#172)
Add close method to embeddings (#173)
Update API to work with embeddings + database search (#176)
Add content option to tabular pipeline (#177)
Update workflow example to support embeddings content (#179)
Add index metadata to embeddings config (#180)
Add object storage (#183)
Aggregate partial query results when clustering (#184)
Add function parameter to embeddings reindex (#185)
Add support for user defined column aliases (#186)
Use SQL bracket notation to support multi word and more complex JSON path expressions (#187)
Support SQLite 3.22+ (#190)
Add pre-computed vector support (#192)
Change document/object inserts to only keep latest record (#193)
Update documentation with 4.0 changes (#196)
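
As a rough illustration of "only keep latest record" (#193), the sketch below uses SQLite's `INSERT OR REPLACE` with a primary key. It is a minimal example under stated assumptions: the `documents` table, its columns and the `insert` helper are hypothetical and are not txtai's actual schema or code.

```python
import sqlite3

# In-memory database with a hypothetical schema keyed on document id.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, text TEXT)")


def insert(uid, text):
    # A second insert with the same id replaces the earlier row
    # instead of raising a uniqueness error or creating a duplicate.
    con.execute(
        "INSERT OR REPLACE INTO documents (id, text) VALUES (?, ?)",
        (uid, text),
    )


insert("doc1", "first draft")
insert("doc1", "final version")

rows = con.execute("SELECT id, text FROM documents").fetchall()
```

After both inserts, the table holds a single row for `doc1` containing the most recent text.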

Improvements

Modify workflow to select batches with slices (#158)
Add tensor support to workflows (#159)
Read YAML config if provided as a file path (#162)
Make adding pipelines to API easier (#163)
Process task actions concurrently (#164)
Add tensor workflow notebook (#167)
Update default ANN parameters (#174)
Require Python 3.7+ (#175)
Consistently name embeddings id fields (#178)
Add txtai version attribute (#181)
Refresh notebooks for 4.0 (#188)
Modify embeddings to only iterate over input documents once (#189)
Improve efficiency of vector transformations (#191)
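
One way to picture "process task actions concurrently" (#164) is a thread pool that runs every action against the same batch and collects the outputs in order. This is a sketch of the general idea only. `run_actions` and the two sample actions are hypothetical names, and the release notes do not say which concurrency mechanism txtai uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_actions(actions, batch):
    """Run each action on the same batch in parallel; return outputs in action order."""
    with ThreadPoolExecutor(max_workers=max(1, len(actions))) as pool:
        futures = [pool.submit(action, batch) for action in actions]
        # Collecting results in submission order keeps output i paired with action i.
        return [future.result() for future in futures]


def to_upper(texts):
    return [text.upper() for text in texts]


def lengths(texts):
    return [len(text) for text in texts]


outputs = run_actions([to_upper, lengths], ["a", "bc"])
```

Threads suit actions that spend their time waiting on I/O or on native code that releases the GIL. CPU-bound pure-Python actions would need a process pool to see a speedup.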

Bug Fixes

Add thread lock around API write calls (#160)
Expose caption and objects pipeline via API (#161)
Change pickle calls to use protocol supporting lowest Python version (#182)
HFOnnx expects ORT provider bug (#195)


23 Nov 01:28

davidmezzetti

v3.7.0

01b4ca0

This release adds the following new features, improvements and bug fixes.

New Features

Add object detection pipeline (#148)
Add image caption pipeline (#149)
Add retrieval task (#150)
Add no-op pipeline (#152)
Add new workflow functionality (#155)

Improvements

Add Korean translation to README.md. Thank you @0206pdh! (#138)
Add links to external articles (#139)
Update example applications to be consistent (#140)
Add an article summarization example (#144)
Add fallback mode for textractor (#145)
Reorganize pipeline package (#147)
Update optional package tests to simulate missing packages (#154)
Add parameter to flatten labels output (#153)
Update documentation with latest changes (#156)

Bug Fixes

Fix bug with importing service task when workflow extra not installed (#146)
Fix inconsistencies with url based tasks (#151)

Contributors

0206pdh


08 Nov 17:43

davidmezzetti

v3.6.0

b76f27b

This release adds the following new features, improvements and bug fixes.

New Features

Add post workflow action to API (#129)
Add tabular pipeline (#134)
Enhance ServiceTask to support additional use cases (#135)
Add notebook for tabular pipeline (#136)
Add topn option to extractor pipeline (#137)

Improvements

Refactor registering new auto models to use methods in Transformers library (#128)
Update workflow example application (#130)

Bug Fixes

No issues this release


18 Oct 11:14

davidmezzetti

v3.5.0

6907aaa

This release adds the following new features, improvements and bug fixes.

New Features

Add scikit-learn to ONNX export pipeline (#124)
Add registry methods for auto models (#126)
Add notebook to demonstrate loading scikit-learn and PyTorch models (#127)

Improvements

Add parameter to return raw model outputs for labels pipeline (#123)
Add parameter to use standard pooling for TransformersVectors (#125)

Bug Fixes

Pass model configuration to ONNX Models (#121)
Fix incorrect import in Notebooks (#122)


07 Oct 16:46

davidmezzetti

v3.4.0

71783c2

This release adds the following new features, improvements and bug fixes.

New Features

Create notebook using extractive qa to build structured data (#117)
Modify extractor pipeline to support similarity pipeline backed context (#119)

Improvements

Improve performance of extractor context queries (#120)

Bug Fixes

Update labels pipeline to filter text classification output (#116)
Fix issues with Transformers 4.11.2 (#118)


10 Sep 17:34

davidmezzetti

v3.3.0

652a494

This release adds the following new features, improvements and bug fixes.

New Features

Add ONNX export pipeline (#107)
Add notebook for ONNX pipeline (#108)
Add ONNX support for Embeddings and Pipelines (#109)
Support QA models in Trainer pipeline (#111)
Add notebook for training QA models (#115)

Improvements

Remove deprecated packages (#114)

Bug Fixes

Fix issues with latest Transformers version (#110)


17 Aug 23:52

davidmezzetti

v3.2.0

0e49664

This release adds the following new features, improvements and bug fixes.

New Features

Enhance Labels pipeline to support standard text classification models (#95)
Add Trainer pipeline (#96)
Modularize txtai install (#97)
Evaluate if faiss-cpu can be used as default across all platforms (#98)
Add vector method for sentence-transformers (#101)

Improvements

Add book search example application (#91)
Add wiki search example application (#92)
Change tokenization to default to false for TransformersVectors (#99)
Infer vector method using path (#100)
Improve performance when running models through transformers (#102)
Update notebooks and example applications (#103)

Bug Fixes

Clear workflow batch during processing bug (#90)


22 May 11:04

davidmezzetti

v3.1.0

b9e84d9

This release adds the following new features:

Add support for update/delete embeddings index operations (#86)
Add Embeddings Cluster component (#87)
Switch default backend on Windows to Hnswlib (#88)
Add notebook covering distributed embedding clusters (#89)


04 May 19:17

davidmezzetti

v3.0.0

a3ec37a

txtai 3.0.0 is a major release with a significant number of new features. This release overhauls the project structure, consolidates logic into pipelines and introduces workflows.

Summary of txtai features:

🔎 Large-scale similarity search with multiple index backends (Faiss, Annoy, Hnswlib)
📄 Create embeddings for text snippets, documents, audio and images. Supports transformers and word vectors.
💡 Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction
↪️️ Workflows that join pipelines together to aggregate business logic. txtai processes can be microservices or full-fledged indexing workflows.
🔗 API bindings for JavaScript, Java, Rust and Go
☁️ Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes)
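
The workflow idea in the summary above, joining pipelines so that the output of one feeds the next, can be sketched in a few lines. This is a conceptual illustration only, assuming each task is a plain callable that maps a list to a list. `run_workflow`, `strip_text` and `to_upper` are hypothetical names and not txtai's `Workflow` API.

```python
def run_workflow(tasks, elements, batch=2):
    """Pass elements through each task in order, one batch at a time."""
    for start in range(0, len(elements), batch):
        # Select the current batch with a slice.
        chunk = elements[start:start + batch]
        # The output of each task becomes the input of the next.
        for task in tasks:
            chunk = task(chunk)
        yield from chunk


def strip_text(texts):
    return [text.strip() for text in texts]


def to_upper(texts):
    return [text.upper() for text in texts]


output = list(run_workflow([strip_text, to_upper], ["  a ", "b", " c"]))
```

Because `run_workflow` is a generator, results stream out batch by batch rather than waiting for the full input to be processed.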

New Features

Add Docker file for API (#59)
Require Faiss 1.7.0 (#60)
Add summary pipeline (#65)
Add text extraction pipeline (#66)
Add transcription pipeline (#67)
Add translation pipeline (#68)
Add workflow framework (#69)
Add additional pipeline abstraction layer for tensor frameworks (#70)
Add tests for new v3 functionality (#71)
Add notebooks covering new v3 functionality (#73)
Add Pipeline Factory (#76)
Add API extensions (#77)
Add workflow builder application (#80)
Add text segmentation pipeline (#81)
Add workflow to API (#82)
Add service workflow task (#83)
Add object storage workflow task (#84)
Add URL workflow task (#85)

Improvements

Refactor code into smaller components and modules (#63)
Modify pipeline to accept GPU device id (#64)
Allow direct download of sentence-transformer models (#72)
Update documentation, add site through GitHub pages (#75)
Modularize the API (#78)
Add default truncation to pipelines (#79)

Bug Fixes

Non intuitive behaviour of Tokenizer (#61)
[Python 3.9, Mac OS] Code hangs while building embedding index (#62)
embeddings.index Truncation RuntimeError: The size of tensor a (889) must match the size of tensor b (512) at non-singleton dimension 1 (#74)
