Releases · neuml/txtai

20 Dec 15:12

v5.2.0

7f53458

This release adds TextToSpeech and Cross-Encoder pipelines. The performance of the embeddings.batchtransform method was significantly improved, which speeds up building semantic graphs. Default configuration is now available for Embeddings, so an Embeddings instance can be created with no arguments, as with Pipelines.

See below for full details on the new features, improvements and bug fixes.

New Features

Add Cross-Encoder support to Similarity pipeline (#372)
Create compression package (#376)
Add TextToSpeech pipeline (#389)
Add TextToSpeech Notebook (#391)
Add default configuration for Embeddings (#393)

Improvements

Filter HF API list models request (#381)
Split pipeline extras by function area (#387)
Update data package to handle label arrays (#388)
Modify transcription pipeline to accept raw waveform data (#390)
Transcription pipeline improvements (#392)
Allow searching by embedding (#396)
Modified logger configuration in __init__.py (libraries shouldn't modify root logger) - Thank you @adin786! (#397)
Pass evaluation metrics to underlying Trainer (#398)
Improve batchtransform performance (#399)
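
The logger change in #397 reflects a general Python convention: a library logs through its own named logger and leaves root logger configuration to the application. A minimal sketch of that convention, using only the standard library, might look like the following. The package name `mylibrary` and the `process` function are hypothetical and are not taken from txtai's code.

```python
import logging

# Hypothetical package name, used only for illustration
logger = logging.getLogger("mylibrary")

# The library attaches a NullHandler to its own logger and leaves the root logger alone
logger.addHandler(logging.NullHandler())


def process(text):
    """Hypothetical library function that logs through the package logger."""
    logger.debug("Processing %d characters", len(text))
    return text.upper()


def root_handler_count():
    """Number of handlers on the root logger; using the library should not change it."""
    return len(logging.getLogger().handlers)
```

With this layout, an application that wants to see the library's messages configures logging itself, for example by calling `logging.basicConfig(level=logging.DEBUG)` before using the library.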

Bug Fixes

Example 31 - Duplicate image detection not working (#357)
All sorts of issues with Example 18 - Export and run models with ONNX (#369)
Fix issue with select distinct (#379)
Update build script and tests to address issues with latest version of FastAPI (#380)
Fix issue with similar and bracket SQL expressions embedded in functions (#382)
Fix bug with embeddings functions and application config (#400)

Contributors

adin786

18 Oct 14:13

davidmezzetti

v5.1.0

35be452

This release adds new model support for the translation pipeline, OpenAI Whisper support in the transcription pipeline and ARM Docker images. Topic modeling was also updated with improvements, including how to use BM25/TF-IDF indexes to drive topic models.

See below for full details on the new features, improvements and bug fixes.

New Features

Multiarch docker image (#324)
Add notebook covering classic topic modeling with BM25 (#360)

Improvements

Read authentication parameters from storage task (#332)
Update scoring algorithms (#351)
Add config option for list of stopwords to ignore with topic generation (#352)
Allow for setting custom translation model path (#355)
Update caption pipeline to call image-to-text pipeline (#361)
Update transcription pipeline to call automatic-speech-recognition pipeline (#362)
Only pass tokenizer to pipeline when necessary (#363)
Improve default max length logic for text generation (#364)
Update transcription notebook (#365)
Update translation notebook (#366)
Move mkdocs dependencies from docs.yml to setup.py (#368)

Bug Fixes

GitHub Actions build error with torch 1.12 on macOS (#300)
SQLite JSON support not built into Python Windows builds < 3.9 (#356)
Use tags field in application.add (#359)
Fix issue with Application autosequencing (#367)
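
For context on #356: whether SQLite's JSON functions are available depends on how the SQLite library bundled with a given Python build was compiled. The sketch below is one way to check this with the standard library alone. The helper name `sqlite_has_json` is hypothetical, and this is not a description of how txtai addressed the issue.

```python
import sqlite3


def sqlite_has_json():
    """Return True if the SQLite library used by this Python build exposes JSON functions."""
    connection = sqlite3.connect(":memory:")
    try:
        # json_extract is only defined when SQLite was built with JSON support
        connection.execute("SELECT json_extract('{\"a\": 1}', '$.a')").fetchone()
        return True
    except sqlite3.OperationalError:
        # Raised with "no such function" when JSON support is missing
        return False
    finally:
        connection.close()
```

Calling `sqlite_has_json()` at startup gives a quick yes/no answer before running queries that rely on JSON columns.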

27 Sep 15:11

davidmezzetti

v5.0.0

fa5ad86

🎈🎉🥳 We're excited to announce the release of txtai 5.0! 🥳🎉🎈

Thank you to the txtai community! Please remember to ⭐ txtai!

txtai 5.0 is a major new release. This release adds the semantic graph along with enabling external integrations. It also adds a number of improvements and bug fixes.

New Features

Add scoring-based search (#327)
Add notebook demonstrating functionality of individual embeddings components (#328)
Add SQL expression columns (#338)
Add semantic graph component (#339)
Add notebook covering Semantic Graphs (#341)
Add graph documentation (#343)
Allow custom ann, database and graph instances (#344)

Improvements

Clarify embeddings.save documentation (#325)
Modify embeddings search candidate default logic (#326)
Update console to conditionally import library (#333)
Update ANN package to make terminology more consistent (#334)
Support non-text document elements in Applications (#335)
Update workflow documentation to note generator execution (#336)
Update audio transcription notebook to include example with OpenAI Whisper (#345)

Bug Fixes

Calling scoring.index with no tokens parsed results in error (#337)
Fix cached_path error with transformers v4.22 (#340)
Fix docker command "--it". Thank you to @lipusz! (#346)
Error loading compressed indexes in console (#347)

Contributors

lipusz

15 Aug 14:23

davidmezzetti

v4.6.0

7642d6c

🎈🎉🥳 txtai turns 2 🎈🎉🥳

We're excited to release the 25th version of txtai, marking its 2-year anniversary. Thank you to the txtai community. Please remember to ⭐ txtai!

txtai 4.6 is a large but backwards compatible release! This release adds better integration between embeddings and workflows. It also adds a number of significant performance improvements and bug fixes.

New Features

Add transform workflow action to application (#281)
Add ability to resolve workflows within applications (#290)
OFFSET in sql query statement (#293)
Add webpage summary image generation notebook (#299)
Add notebook on running txtai with native code (#304)
Add mmap parameter to Faiss (#308)
Add indexing guide to docs (#312)

Improvements

Consume generator outputs in workflow tasks (#291)
Update pipeline workflow notebook (#292)
Update tabular notebook (#297)
Lower required version of Pillow library to prevent unnecessary upgrades (#303)
Embeddings vector batch improvements (#309)
Use single constant for current pickle protocol (#310)
Move quantize config param to Faiss (#311)
Update documentation with new demo and diagrams (#313)
Improve embeddings performance with large query limits (#318)

Bug Fixes

ModuleNotFoundError: No module named 'transformers.hf_api' (#274)
Dependency issue with ONNX and Protobuf (#285)
The key should be writable instead of path. Thank you to @csnelsonchu! (#287)
Fix breaking change in build script from mkdocstrings (#289)
Index id sync issue when inserting multiple data types (text, documents, objects) into Embeddings (#294)
Labels pipeline outputs changed with transformers 4.20.0 (#295)
Tabular pipeline throws error when processing list fields (#296)
txtai load testing (#305)
Add cloud config to application.upsert method (#306)

Contributors

csnelsonchu

17 May 13:52

davidmezzetti

v4.5.0

248acc1

This release adds the following new features, improvements and bug fixes.

New Features

Add scripts to train bashsql query translation model (#271)
Add QA database example notebook (#272)
Add CITATION file (#273)

Improvements

Improve efficiency of external vectors (#275)
Refactor vectors package to improve code reuse (#276)
Add logic to detect external vectors method (#277)

Bug Fixes

Fix summary pipeline issue with transformers>=4.19.0 (#278)

20 Apr 14:21

davidmezzetti

v4.4.0

bb158d5

This release adds the following new features, improvements and bug fixes.

New Features

Add semantic search explainability (#248)
Add notebook covering model explainability (#249)
Add txtai console (#252)
Add sequences pipeline (#261)
Add scripts to train query translation models (#265)
Add query translation logic in embeddings searches (#266)
Add notebook for query translation (#269)

Improvements

Update HFTrainer to support sequence-sequence models (#262)

Bug Fixes

Unit tests failing with tokenizers>= 0.12 (#253)
Running default.config.yml returns TypeError: register() got an unexpected keyword argument 'ids' (#256)
Unit tests failing with transformers==4.18.0 (#258)
Update precommit to use latest version of psf black (#259)

11 Mar 20:14

davidmezzetti

v4.3.1

4856956

This release adds the following bug fixes.

Bug Fixes

Fix word embeddings regression with batch transformation (#245)

10 Mar 00:02

davidmezzetti

v4.3.0

2bd3bf8

This release adds the following new features, improvements and bug fixes.

New Features

Add notebook covering txtai embeddings index file structure (#237)
Add Image Hash pipeline (#240)
Add support for custom SQL functions in embeddings queries (#241)
Add notebook for Embeddings SQL functions (#243)
Add notebook for near-duplicate image detection (#244)

Improvements

Rename SQLException to SQLError (#232)
Refactor API instance into a separate package (#233)
API should raise an error if attempting to modify a read-only index (#235)
Add last update field to index metadata (#236)
Update transcription pipeline to use AutoModelForCTC (#238)

Bug Fixes

Ensure limit always set in embeddings search/batchsearch (#234)
Fix issue with parsing multiline SQL statements (#242)

28 Feb 01:05

davidmezzetti

v4.2.1

5167422

This release adds the following bug fixes.

Bug Fixes

Fixed mislabeled API config definition (#231)

24 Feb 11:47

davidmezzetti

v4.2.0

6d12b5e

This release adds the following new features, improvements and bug fixes.

New Features

Add notebook for workflow notifications (#225)
Add default and custom docker configurations (#226)
Create docker configuration for AWS Lambda (#228)
Add support for loading/storing embedding indexes on cloud storage (#229)

Improvements

Add support for SQL || operator (#223)
Add flag to disable loading index data in API (#230)

Bug Fixes

Modify database decoder methods to check for None (#220)
Modify embeddings search to make return type consistent when index initialized and not initialized (#221)
Embeddings index returning malformed JSON errors in certain situations (#222)
Check for empty documents input before indexing (#224)
