Add Context/Instruction Support to Embedding API #5082

@kacetal

Description

Modern embedding models like Qwen3-Embedding support query-document asymmetry through instruction prompts. The API should allow passing optional context/instruction when generating embeddings:

// Generate query embedding
EmbeddingRequest queryRequest = EmbeddingRequest.builder()
    .inputs(List.of("machine learning"))
    .instruction("query: ")
    .build();

EmbeddingResponse queryResponse = embeddingModel.call(queryRequest);

// Generate document embedding  
EmbeddingRequest docRequest = EmbeddingRequest.builder()
    .inputs(List.of("machine learning"))
    .instruction("passage: ")
    .build();

EmbeddingResponse docResponse = embeddingModel.call(docRequest);
// These produce different vectors optimized for retrieval

API additions (sketched below):

  • Add optional instruction field to EmbeddingRequest
  • Add instruction getter/setter to EmbeddingOptions interface
  • Default to null for backward compatibility
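A rough sketch of what these additions could look like, assuming the current shapes of EmbeddingRequest and EmbeddingOptions in org.springframework.ai.embedding (names and placement are illustrative, not a final design):

// Sketch only: proposed additions, not the actual Spring AI source.
public class EmbeddingRequest implements ModelRequest<List<String>> {

    private final List<String> inputs;
    private final EmbeddingOptions options;
    private final String instruction; // new, nullable; null preserves current behavior

    public String getInstruction() {
        return this.instruction;
    }
    // existing constructors and accessors unchanged
}

public interface EmbeddingOptions extends ModelOptions {

    String getModel();

    Integer getDimensions();

    // New accessor with a default so existing implementations keep compiling
    default String getInstruction() {
        return null;
    }
}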

Current Behavior

Spring AI's EmbeddingModel and EmbeddingRequest only accept text content and basic options (model, dimensions). There's no way to pass instructions/context to the embedding model, even when the underlying model supports it.

This means the same text always generates the same embedding, regardless of whether it is used as a query or a document, forfeiting the retrieval accuracy gains that instruction-aware models provide.
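For illustration, with today's API there is no place to express the role of the text, so a query and a document embed identically (a minimal sketch using the existing EmbeddingModel#embed convenience method):

// Current behavior: the model cannot be told whether this text is a
// query or a document, so both calls produce the same vector.
var queryVector = embeddingModel.embed("machine learning");
var docVector = embeddingModel.embed("machine learning");
// queryVector and docVector are identical; instruction-aware models
// cannot apply their query/passage asymmetry here.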

Context

How this affects usage:
When using models like Qwen3-Embedding or E5 that support instructions, I cannot leverage their query-document asymmetry feature, which significantly improves semantic search performance.

What I'm trying to accomplish:
Build a semantic search system where queries and documents are embedded differently to improve retrieval accuracy, as recommended by modern embedding model best practices.

Models affected:

  • Qwen3-Embedding (0.6B, 7.6B)
  • E5 series (e5-base, e5-large, e5-mistral-7b)
  • BGE series with instructions
  • Instructor embedding models

Current workaround:
Manually prepending the instruction to the text (e.g., "query: " + text) works, but it is less clean and may not behave identically to the model's native instruction handling.
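A minimal sketch of that workaround against the current API (the "query: "/"passage: " prefixes are the ones E5-style models document; this simply bakes the instruction into the input text):

// Workaround: prepend the role prefix by hand before embedding.
// The instruction becomes part of the text itself, which may not match
// how the model's native API applies instruction prompts internally.
String text = "machine learning";
var queryVector = embeddingModel.embed("query: " + text);
var docVector = embeddingModel.embed("passage: " + text);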

How other APIs handle this:

Qwen native API:

model.encode("text", prompt="query: ")  # Different vectors
model.encode("text", prompt="passage: ")

OpenAI-compatible APIs (vLLM):

{
  "input": "text",
  "extra_body": {"prompt_name": "query"}
}
