fix: prefer native sparse embeddings when available by rogernogueira · Pull Request #7431 · agno-agi/agno

rogernogueira · 2026-04-08T20:57:16Z

Summary

This PR fixes Qdrant hybrid retrieval behavior by preferring embedder-native sparse vectors when available, while preserving the existing FastEmbed BM25 fallback.

fixes #7432

Problem

Currently, the Agno Qdrant integration always instantiates FastEmbed's SparseTextEmbedding (BM25) when search_type=hybrid, even when the configured embedder already provides sparse vectors natively.

This creates two issues:

hybrid retrieval may combine dense vectors from the configured embedder with sparse vectors from a different representation space
FastEmbed becomes an unnecessary dependency for embedders that already support sparse output

A real example is bge-m3, which can produce sparse vectors natively.

Type of change

Summary of Changes

The system now checks for native sparse embedding support first. If found, it skips the FastEmbed setup to save resources.
Added _get_sparse_vector(), which prioritizes native embeddings and only uses FastEmbed as a fallback.
Both hybrid and keyword search paths now use this new logic to retrieve sparse vectors.
Updated _build_search_results to handle custom metadata (like doc_id) without triggering KeyError crashes.
Added a cookbook example showing a custom embedder with native sparse vectors in Qdrant hybrid search.
Updated document insert/upsert paths to use the same native-sparse-first resolution as search.
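
The native-sparse-first resolution described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name resolve_sparse_vector, the embedder method get_sparse_embedding, the (indices, values) tuple shape, and the choice to raise ValueError on empty input are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Assumed representation for the sketch: parallel lists of indices and values.
SparseVector = Tuple[List[int], List[float]]


def resolve_sparse_vector(
    text: str,
    embedder: object,
    fallback: Optional[Callable[[str], SparseVector]] = None,
) -> SparseVector:
    """Prefer the embedder's own sparse output, else use the fallback encoder."""
    if not text or not text.strip():
        # Assumption: empty input is treated as an error in this sketch.
        raise ValueError("cannot build a sparse vector from empty text")

    # Hypothetical method name; a real embedder may expose sparse output differently.
    native = getattr(embedder, "get_sparse_embedding", None)
    if callable(native):
        result = native(text)
        if result is not None:
            return result

    if fallback is not None:
        # Stand-in for a BM25-style encoder such as the FastEmbed fallback.
        return fallback(text)

    raise ValueError("no native sparse support and no fallback encoder configured")


class NativeSparseEmbedder:
    """Toy embedder that claims native sparse support."""

    def get_sparse_embedding(self, text: str) -> SparseVector:
        return ([1, 7], [0.5, 0.25])


class DenseOnlyEmbedder:
    """Toy embedder without any sparse method."""


def fake_bm25(text: str) -> SparseVector:
    # Not real BM25: a single weight equal to the token count.
    return ([0], [float(len(text.split()))])


native_result = resolve_sparse_vector("hello world", NativeSparseEmbedder(), fake_bm25)
fallback_result = resolve_sparse_vector("hello world", DenseOnlyEmbedder(), fake_bm25)
```

With this shape, the fallback encoder only needs to be constructed when the embedder lacks a sparse method, which is the resource saving the first bullet refers to.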

Checklist

Code complies with style guidelines
Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
Self-review completed
Documentation updated (comments, docstrings)
Examples and guides: Relevant cookbook examples have been included or updated (if applicable)
Tested in clean environment
Tests added/updated (if applicable)

Duplicate and AI-Generated PR Check

I have searched existing open pull requests and confirmed that no other PR already addresses this issue
If a similar PR exists, I have explained below why this PR is a better approach
Check if this PR was entirely AI-generated (by Copilot, Claude Code, Cursor, etc.)

Validation

Added unit tests covering:

native sparse priority in _get_sparse_vector()
FastEmbed fallback when native sparse output is unavailable
empty input handling
keyword search using the sparse helper
hybrid search using the sparse helper
clear errors when sparse vectors cannot be generated

These tests validate the new native-sparse-first behavior while preserving the existing FastEmbed fallback.
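
As a rough idea of what the first two test cases might look like, here is a sketch using unittest.mock. The helper pick_sparse is a compact hypothetical stand-in defined only for this example; it is not the PR's _get_sparse_vector(), and the method name get_sparse_embedding is likewise an assumption.

```python
from unittest.mock import Mock


def pick_sparse(text, embedder, fallback):
    """Hypothetical stand-in for the helper under test (not Agno code)."""
    native = getattr(embedder, "get_sparse_embedding", None)
    if callable(native):
        return native(text)
    return fallback(text)


def test_native_sparse_takes_priority():
    # spec restricts the mock to exactly one attribute.
    embedder = Mock(spec=["get_sparse_embedding"])
    embedder.get_sparse_embedding.return_value = ([3], [0.9])
    fallback = Mock(return_value=([0], [1.0]))

    assert pick_sparse("query", embedder, fallback) == ([3], [0.9])
    fallback.assert_not_called()


def test_fallback_used_without_native_support():
    # An empty spec means any attribute lookup raises AttributeError.
    embedder = Mock(spec=[])
    fallback = Mock(return_value=([0], [1.0]))

    assert pick_sparse("query", embedder, fallback) == ([0], [1.0])
    fallback.assert_called_once_with("query")


# Run directly so the sketch works without a test runner.
test_native_sparse_takes_priority()
test_fallback_used_without_native_support()
```

Asserting that the fallback mock is never called is what distinguishes "native output is preferred" from "native output happens to match".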

github-actions · 2026-04-08T20:57:28Z

PR Triage

A few things to address before this PR can be reviewed:

Missing issue link: Please link the issue this PR addresses using fixes #<issue_number>, closes #<issue_number>, or resolves #<issue_number> in the PR description. If there is no existing issue, please create one first.

Missing tests: This PR modifies source code but does not include any test changes. Please add or update tests to cover your changes.

rogernogueira

Add tests

sannya-singal · 2026-04-10T06:31:01Z

Please make sure to fix the failing pipeline as well @rogernogueira.

rogernogueira

Add cookbook

sannya-singal · 2026-04-13T06:03:33Z

@rogernogueira please make sure that gh pipeline is green.

rogernogueira · 2026-04-14T22:42:09Z

Updated the PR to keep sparse vector resolution consistent across the Qdrant paths.

Changes included:

_get_sparse_vector() helper
helper usage in hybrid search
helper usage in keyword search
helper usage in insert / async_insert
unit tests covering native sparse priority and fallback behavior

The branch has also been updated with the latest changes and tests are passing locally.
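
One way to see why the insert and search paths need the same sparse resolution: sparse similarity is a dot product over shared indices, so a query encoded by a different sparse model than the stored documents has no meaningful overlap with them. The sketch below assumes an (indices, values) representation, and all index numbers are made up for illustration.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors given as (indices, values) pairs."""
    b_map = dict(zip(*b))
    return sum(value * b_map.get(index, 0.0) for index, value in zip(*a))


# Document stored at insert time with one sparse encoder.
doc = ([10, 42], [1.0, 0.5])

# Query encoded with the same encoder: index 42 is shared.
query_same_encoder = ([42, 99], [0.5, 2.0])

# Query encoded with a different encoder whose indices mean something else.
query_other_encoder = ([3, 8], [1.0, 1.0])

score_same = sparse_dot(query_same_encoder, doc)
score_other = sparse_dot(query_other_encoder, doc)
```

In practice two unrelated encoders can also collide on an index by accident, which yields a non-zero score that carries no meaning rather than a clean zero.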

rogernogueira

I tried to simplify the PR.

sannya-singal

Why did we revert the cookbook?

Please ensure a green CI pipeline @rogernogueira.

rogernogueira · 2026-04-15T16:12:44Z

I’ve added a cookbook for this change.

rogernogueira

add test

rogernogueira · 2026-04-25T17:42:47Z

Please review the changes.

rogernogueira requested a review from a team as a code owner April 8, 2026 20:57

github-actions Bot added first-time-contributor missing-issue-link missing-tests labels Apr 8, 2026

rogernogueira commented Apr 8, 2026

sannya-singal reviewed Apr 9, 2026

Comment thread libs/agno/agno/vectordb/qdrant/qdrant.py Outdated

Comment thread libs/agno/agno/vectordb/qdrant/qdrant.py Outdated

Comment thread libs/agno/agno/vectordb/qdrant/qdrant.py Outdated

rogernogueira commented Apr 10, 2026

Comment thread libs/agno/agno/vectordb/qdrant/qdrant.py Outdated

Comment thread libs/agno/agno/vectordb/qdrant/qdrant.py Outdated

rogernogueira closed this Apr 14, 2026

rogernogueira force-pushed the fix/qdrant-native-sparse-vectors branch from 2b8a2f8 to 7a4e21b Compare April 14, 2026 20:35

rogernogueira added 6 commits April 14, 2026 17:55

Add sparse vector helper

3126cdc

Use sparse helper in hybrid search

c58f7ef

Use sparse helper in keyword search

c78a427

Use sparse helper in keyword search

c381ef4

Use sparse helper in Qdrant indexing paths

12e334f

Add tests for native sparse vector resolution

0022288

rogernogueira reopened this Apr 14, 2026

rogernogueira commented Apr 14, 2026

Merge branch 'main' into fix/qdrant-native-sparse-vectors

ed236b1

sannya-singal self-requested a review April 15, 2026 08:41

sannya-singal reviewed Apr 15, 2026

rogernogueira changed the title ~~fix(qdrant): prefer native sparse embeddings when available~~ [fix] prefer native sparse embeddings when available Apr 15, 2026

rogernogueira changed the title ~~[fix] prefer native sparse embeddings when available~~ fix: prefer native sparse embeddings when available Apr 15, 2026

Add cookbook example demonstrating native sparse embeddings with Qdrant hybrid search

10405d7

rogernogueira requested a review from sannya-singal April 15, 2026 16:14

Merge branch 'main' into fix/qdrant-native-sparse-vectors

a20d052

rogernogueira commented Apr 17, 2026

Merge branch 'main' into fix/qdrant-native-sparse-vectors

b13bcf0
