How to load pre-generated embedding files? #685
Unanswered
rustam-ashurov-mcx asked this question in Q&A
Replies: 1 comment
-
hi @rustam-ashurov-mcx, what you're describing is essentially a cache: a store of text/vector pairs that you can search by text. You could implement the cache as a dependency and inject it into the embedding generators, using a lookup table. Or you could develop a custom embedding generator that never talks to OpenAI and always loads embeddings from storage, taking care of all the edge cases, e.g. distributed storage, adding new embeddings, etc.
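A minimal sketch of the lookup-table idea, in Python for illustration only. The function names (`load_cache`, `get_embedding`) and the on-disk layout (one JSON file per text/vector pair) are assumptions for the example, not SK/KM APIs:

```python
import json
from pathlib import Path


def load_cache(cache_dir: str) -> dict[str, list[float]]:
    """Build a text -> vector lookup table from pre-generated JSON files.

    Assumes each file holds a {"text": ..., "embedding": [...]} record.
    """
    cache = {}
    for path in Path(cache_dir).glob("*.json"):
        record = json.loads(path.read_text())
        cache[record["text"]] = record["embedding"]
    return cache


def get_embedding(text, cache, generate_remote):
    """Return a cached vector; only call the AI provider on a cache miss."""
    if text in cache:
        return cache[text]
    vector = generate_remote(text)  # e.g. a real OpenAI call
    cache[text] = vector            # remember it for next time
    return vector
```

The same shape works whether the cache sits in front of a real generator (second argument) or the "remote" call simply raises, which gives you the second option from above: a generator that only ever reads from storage.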
-
Hey mates, I'm having a hard time with SK + KM (mostly because I don't have much experience with either yet) 😅
My aim is to generate embeddings in advance as files, and on service startup load them into the in-memory store without calling the AI provider again on every restart.
So I generated the embeddings via the OpenAI client (I was unable to find out whether I can generate them via SK/KM and store them as files), and now they are stored as *.json files with content like this (example):
{
"data": [
{
"embedding": [
0.006308248266577721,
....lot of numbers here...
],
"index": 0,
"object": "embedding"
}
],
"model": "text-embedding-ada-002",
"object": "list",
"usage": {
"prompt_tokens": 305,
"total_tokens": 305
}
}
What I cannot understand is how this data can be loaded into KM?
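For reference, pulling the raw vector out of a file in the format above is a plain JSON walk: the vector sits at `data[0].embedding` and the `model` field records which model produced it (vectors from different models are not comparable). A small Python sketch, independent of SK/KM:

```python
import json


def read_vector(path: str) -> tuple[str, list[float]]:
    """Extract (model, vector) from one OpenAI embeddings response file."""
    with open(path) as f:
        response = json.load(f)
    # "data" is a list because the API can embed several inputs per request;
    # with a single input the vector is at index 0.
    return response["model"], response["data"][0]["embedding"]
```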