docs: add image search docs (#168)

Mini256 · web-flow · commit 364fe762b0c3 · 2025-07-18T08:10:53.000+08:00
* docs: add image search docs

* add desc
diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -1,13 +1,14 @@
 {
     "cSpell.words": [
         "FULLTEXT",
-        "Pydantic"
+        "Pydantic",
         "getenv",
         "jina",
         "jinaai",
         "Rerank",
         "reranker",
         "reranking",
-        "tablename"
+        "tablename",
+        "multimodal"
     ]
 }
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -96,6 +96,7 @@ nav:
       - Vector Search: ai/guides/vector-search.md
       - Fulltext Search: ai/guides/fulltext-search.md
       - Hybrid Search: ai/guides/hybrid-search.md
+      - Image Search: ai/guides/image-search.md
       - Auto Embedding: ai/guides/auto-embedding.md
       - Reranking: ai/guides/reranking.md
       - Filtering: ai/guides/filtering.md
@@ -126,6 +127,7 @@ nav:
     - Vector Search: ai/guides/vector-search.md
     - Fulltext Search: ai/guides/fulltext-search.md
     - Hybrid Search: ai/guides/hybrid-search.md
+    - Image Search: ai/guides/image-search.md
     - Auto Embedding: ai/guides/auto-embedding.md
     - Reranking: ai/guides/reranking.md
     - Filtering: ai/guides/filtering.md
diff --git a/src/ai/guides/fulltext-search.md b/src/ai/guides/fulltext-search.md
@@ -15,7 +15,7 @@ TiDB provides full-text search capabilities for **massive datasets** with high p
 
 !!! tip
 
-    For complete example code, see the [full-text search example](https://github.com/pingcap/pytidb/blob/main/examples/fulltext_search).
+    For a complete example of full-text search, see the [E-commerce product search demo](../examples/fulltext-search-with-pytidb.md).
 
 ## Basic Usage
 
diff --git a/src/ai/guides/hybrid-search.md b/src/ai/guides/hybrid-search.md
@@ -10,7 +10,7 @@ TiDB supports both semantic search (also known as vector search) and keyword-bas
 
 !!! tip
 
-    For a complete example of hybrid search, refer to the [hybrid-search example](https://github.com/pingcap/pytidb/tree/main/examples/hybrid_search).
+    For a complete example of hybrid search, refer to the [hybrid-search example](../examples/hybrid-search-with-pytidb.md).
 
 
 ## Basic Usage
diff --git a/src/ai/guides/image-search.md b/src/ai/guides/image-search.md
@@ -0,0 +1,105 @@
+# Image search
+
+**Image search** helps you find similar images by comparing their visual content, not just text or metadata. This feature is useful for e-commerce, content moderation, digital asset management, and any scenario where you need to search for or deduplicate images based on appearance.
+
+TiDB enables image search using **vector search**. With automatic embedding, you can generate image embeddings from image URLs, PIL images, or keyword text using a multimodal embedding model. TiDB then efficiently searches for similar vectors at scale.
+
+!!! tip
+
+    For a complete example of image search, see the [Pet image search demo](../examples/image-search-with-pytidb.md).
+
+## Basic usage
+
+### Step 1. Define an embedding function
+
+To generate image embeddings, you need an embedding model that supports image input.
+
+For demonstration, you can use Jina AI's multimodal embedding model to generate image embeddings.
+
+Go to [Jina AI](https://jina.ai/embeddings) to create an API key, then initialize the embedding function as follows:
+
+```python
+from pytidb.embeddings import EmbeddingFunction
+
+image_embed = EmbeddingFunction(
+    # Or another provider/model that supports multimodal input
+    model_name="jina_ai/jina-embeddings-v4",
+    api_key="{your-jina-api-key}",
+)
+```
+
+### Step 2. Create a table and vector field
+
+Use `VectorField()` to define a vector field for storing image embeddings. Set the `source_field` parameter to specify the field that stores image URLs.
+
+```python
+from pytidb.schema import TableModel, Field
+
+class ImageItem(TableModel):
+    __tablename__ = "image_items"
+    id: int = Field(primary_key=True)
+    image_uri: str = Field()
+    image_vec: list[float] = image_embed.VectorField(
+        source_field="image_uri"
+    )
+
+table = client.create_table(schema=ImageItem, mode="overwrite")  # `client` is an already-connected TiDB client
+```
+
+### Step 3. Insert image data
+
+When you insert data, the `image_vec` field is automatically populated with the embedding generated from the `image_uri`.
+
+```python
+table.bulk_insert([
+    ImageItem(image_uri="https://example.com/image1.jpg"),
+    ImageItem(image_uri="https://example.com/image2.jpg"),
+    ImageItem(image_uri="https://example.com/image3.jpg"),
+])
+```
+
+### Step 4. Perform image search
+
+Image search is a type of vector search. Automatic embedding lets you input an image URL, PIL image, or keyword text directly. All these inputs are converted to vector embeddings for similarity matching.
+
+#### Option 1: Search by image URL
+
+Search for similar images by providing an image URL:
+
+```python
+results = table.search("https://example.com/query.jpg").limit(3).to_list()
+```
+
+The client converts the input image URL into a vector. TiDB then finds and returns the most similar images by comparing their vectors.
+
+#### Option 2: Search by PIL image
+
+You can also search for similar images by passing a PIL image object, for example one loaded from a local file:
+
+```python
+from PIL import Image
+
+image = Image.open("/path/to/query.jpg")
+
+results = table.search(image).limit(3).to_list()
+```
+
+The client converts the PIL image object into a Base64 string before sending it to the embedding model.
+
+#### Option 3: Search by keyword text
+
+You can also search for similar images by providing keyword text.
+
+For example, if you are working with a pet image dataset, you can search for matching images with keywords such as "orange tabby cat" or "golden retriever puppy".
+
+```python
+results = table.search("orange tabby cat").limit(3).to_list()
+```
+
+The multimodal embedding model converts the keyword text into a vector embedding that captures its semantic meaning. A vector search then finds the images whose embeddings are most similar to the keyword embedding.
+
+## See also
+
+- [Automatic embedding guide](./auto-embedding.md)
+- [Vector search guide](../concepts/vector-search.md)
+- [Pet image search demo](../examples/image-search-with-pytidb.md)
diff --git a/src/ai/guides/vector-search.md b/src/ai/guides/vector-search.md
@@ -4,7 +4,7 @@ Vector search uses semantic similarity to help you find the most relevant record
 
 !!! tip
 
-    For a complete example of vector search, see the [vector-search example](https://github.com/pingcap/pytidb/tree/main/examples/vector_search).
+    For a complete example of vector search, see the [vector-search example](../examples/vector-search-with-pytidb.md).
 
 
 ## Basic Usage
