Skip to content

Commit 7dafe5b

Browse files
committed
Merge branch 'wenxuan/auto-embed' of https://github.com/pingcap/docs into wenxuan/auto-embed
Signed-off-by: Wish <[email protected]>
2 parents 5c9b312 + ca6a494 commit 7dafe5b

File tree

3 files changed

+9
-9
lines changed

3 files changed

+9
-9
lines changed

tidb-cloud/vector-search-auto-embedding-amazon-titan.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ TiDB Cloud provides the following [Amazon Titan embedding model](https://docs.aw
2323
- Hosted by TiDB Cloud: ✅
2424
- Bring Your Own Key: ❌
2525

26-
You may learn more from [its official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html).
26+
For more details, see [its official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html).
2727

2828
## Availability
2929

@@ -78,7 +78,7 @@ Result:
7878

7979
## Options
8080

81-
Additional options may be specified via the `additional_json_options` parameter of the `EMBED_TEXT()` function.
81+
You can specify additional options via the `additional_json_options` parameter of the `EMBED_TEXT()` function.
8282

8383
- `normalize` – (optional) Flag indicating whether or not to normalize the output embedding. Defaults to true.
8484
- `dimensions` – (optional) The number of dimensions the output embedding should have. The following values are accepted: 1024 (default), 512, 256.

tidb-cloud/vector-search-auto-embedding-cohere.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -74,7 +74,7 @@ CREATE TABLE sample (
7474

7575
> **Note**:
7676
>
77-
> For Cohere model, you must specify `input_type` in the `EMBED_TEXT()`. `'{"input_type": "search_document", "input_type@search": "search_query"}'` means `input_type` is set to `search_document` when inserting data, and is set to `search_query` when performing vector search queries.
77+
> For the Cohere model, you must specify `input_type` in the `EMBED_TEXT()` function. For example, `'{"input_type": "search_document", "input_type@search": "search_query"}'` means that `input_type` is set to `search_document` for data insertion and `search_query` for vector searches.
7878
>
7979
> The `@search` suffix is used to mark that field to take effect only when it is used for vector search queries.
8080
@@ -113,9 +113,9 @@ Result:
113113

114114
## Options (TiDB Cloud Hosted)
115115

116-
Both Embed v3 and Multilingual Embed v3 models supports following options, which need to specified via the `additional_json_options` parameter of the `EMBED_TEXT()` function.
116+
Both the Embed v3 and Multilingual Embed v3 models support the following options, which you can specify via the `additional_json_options` parameter of the `EMBED_TEXT()` function.
117117

118-
- `input_type`**Required**. Prepends special tokens to differentiate each type from one another. You should not mix different types together, except when mixing types for for search and retrieval. In this case, embed your corpus with the `search_document` type and embedded queries with type `search_query` type.
118+
- `input_type`**Required**. Prepends special tokens to differentiate each type from one another. You should not mix different types together, except when mixing types for search and retrieval. In this case, embed your corpus with the `search_document` type and embed queries with the `search_query` type.
119119

120120
- `search_document` – In search use-cases, use `search_document` when you encode documents for embeddings that you store in a vector database.
121121
- `search_query` – Use `search_query` when querying your vector DB to find relevant documents.

tidb-cloud/vector-search-auto-embedding-gemini.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -6,19 +6,19 @@ aliases: ["/tidb/stable/vector-search-auto-embedding-gemini"]
66

77
# Gemini Embeddings
88

9-
All Gemini models are available for use under the `gemini/` prefix when you bring your own Gemini API key. To name a few:
9+
All Gemini models are available for use under the `gemini/` prefix when you bring your own Gemini API key.
1010

1111
**gemini-embedding-001**
1212

1313
- Name: `gemini/gemini-embedding-001`
14-
- Dimensions: 128 - 3072 (default: 3072)
14+
- Dimensions: 1283072 (default: 3072)
1515
- Distance Metric: Cosine / L2
1616
- Max input text tokens: 2048
1717
- Price: Charged by Google
1818
- Hosted by TiDB Cloud: ❌
1919
- Bring Your Own Key: ✅
2020

21-
For a full list of available models, please refer to [Gemini Documentation](https://ai.google.dev/gemini-api/docs/embeddings).
21+
For a full list of available models, please refer to [Gemini documentation](https://ai.google.dev/gemini-api/docs/embeddings).
2222

2323
## Availability
2424

@@ -108,7 +108,7 @@ CREATE TABLE sample (
108108
);
109109
```
110110

111-
For all available options, please refer to [Gemini Documentation](https://ai.google.dev/gemini-api/docs/embeddings).
111+
For all available options, please refer to [Gemini documentation](https://ai.google.dev/gemini-api/docs/embeddings).
112112

113113
## Python Usage Example
114114

0 commit comments

Comments
 (0)