[Bug]: Having trouble just to make it work #1082

Closed · 1 of 2 tasks
GTimothee opened this issue Mar 13, 2025 · 6 comments
Labels: bug (Something isn't working)

@GTimothee commented Mar 13, 2025

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

I am trying to use LightRAG with my LLM deployment (OpenAI-API-compatible), and it fails at multiple points.

While ingesting data I get this:

Processing documents:  25%|████████▎ | 2/8 [00:01<00:04, 1.35it/s]
INFO:lightrag:Inserting 1 to doc_status
INFO:lightrag:Stored 1 new unique documents
INFO:lightrag:Number of batches to process: 1.
INFO:lightrag:Start processing batch 1 of 1.
INFO:lightrag:Inserting 1 to doc_status
INFO:lightrag:Inserting 1 to chunks
INFO:lightrag:Inserting 1 to full_docs
INFO:lightrag:Inserting 1 to text_chunks
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
ERROR:lightrag:Failed to process document doc-40723ec49f1dad04b4823be95d04b22c: index 0 is out of bounds for axis 0 with size 0

or this:

INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
ERROR:lightrag:Failed to process document doc-984c029a349b75d6a5d1293e65c59695: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4096 and the array at index 1 has size 1024

And when I try to query the RAG, of course it does not work either, giving errors like:

INFO:lightrag:Inserting 1 to llm_response_cache
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
INFO:lightrag:Inserting 1 to llm_response_cache
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
INFO:lightrag:Inserting 1 to llm_response_cache
INFO:lightrag:Non-embedding cached missed(mode:default type:extract)
INFO:lightrag:Inserting 1 to llm_response_cache
INFO:lightrag:Query nodes: SQuAD, Stanford Question Answering Dataset, Natural Language Processing, Machine learning, top_k: 60, cosine: 0.2
INFO:lightrag:Query edges: Question answering, Language models, Artificial intelligence, top_k: 60, cosine: 0.2
ERROR:lightrag:Error in get_kg_context: shapes (0,4096) and (1024,) not aligned: 4096 (dim 1) != 1024 (dim 0)
Sorry, I'm not able to provide an answer to that question.[no-context]

The errors are really not explicit; we don't even know which component fails. I cannot enable logging because your README example is not up to date with the PyPI package, so these imports fail (ModuleNotFoundError):

from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import setup_logger
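
As a stopgap until the README imports match the installed release, verbose logging can be enabled with the standard library alone; the only assumption is the "lightrag" logger name, which the INFO:lightrag: lines above confirm:

```python
import logging

# Send all log records to the console, and raise LightRAG's own
# "lightrag" logger (visible in the output above) to DEBUG verbosity.
logging.basicConfig(level=logging.INFO)
logging.getLogger("lightrag").setLevel(logging.DEBUG)
```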

Steps to reproduce

Here is the file I am using: https://gist.github.com/GTimothee/32027026e8aef7dc5cb290b9b913953b

Expected Behavior

Just work without error

LightRAG Config Used

Default

Logs and screenshots

No response

Additional Information

  • LightRAG Version: lightrag-hku==1.2.3
  • Operating System: linux
  • Python Version: 3.10.14
  • Related Issues:
GTimothee added the bug label on Mar 13, 2025
@ekinsenler

It looks like you didn't set EMBEDDING_DIM to match your embedding model's dimension (4096 in your case).
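
Concretely, both failure modes above are plain NumPy shape mismatches between two embedding dimensionalities. A minimal illustration (the arrays below are hypothetical stand-ins, not LightRAG internals):

```python
import numpy as np

stored = np.empty((0, 4096))   # index built at the model's true output dim
query = np.random.rand(1024)   # query embedded under the configured 1024 dim

try:
    # Similarity scoring boils down to a dot product, so mismatched
    # dimensions fail exactly as in the query log above:
    scores = np.dot(stored, query)
except ValueError as e:
    print(e)  # shapes (0,4096) and (1024,) not aligned: 4096 (dim 1) != 1024 (dim 0)
```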

@GTimothee (Author)

Thanks for your help. Where should I put EMBEDDING_DIM, then? I am passing the embedding model to the same RAG object that embeds the data and processes the queries, so I would expect it to do both with the model I pass. What am I missing?

@JoramMillenaar (Contributor)

The error looks unfamiliar, but after you set EMBEDDING_DIM you might need to drop your current vector DBs, since they may hold vectors whose dimensions mismatch the new setting.
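
With the default file-based storage that could look like the sketch below. The vdb_*.json file names are an assumption based on the default NanoVectorDB backend; adapt the pattern (or simply clear the whole working directory) for other backends:

```python
from pathlib import Path

# "./rag_storage" is a placeholder: use the working_dir you passed to LightRAG.
working_dir = Path("./rag_storage")

# Remove the stale vector indexes so they get rebuilt at the new dimension.
for vdb_file in working_dir.glob("vdb_*.json"):
    print(f"removing stale index: {vdb_file}")
    vdb_file.unlink()
```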

@ekinsenler

You have to set that inside the .env file
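A minimal sketch of that, using the variable name given above and the 4096-dim model from the logs:

```
# .env
EMBEDDING_DIM=4096
```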

@bastianwegge

@GTimothee for me this response helped: #727 (comment)
Essentially, you want to select an embedding model and the corresponding dimension it comes with, if I understood this correctly.

@GTimothee (Author) commented Mar 20, 2025

Thanks for your answers. There was no need to set the env variable; I just had to change the argument at embedding-model creation time, as I had set an embedding size that did not match my model. I also had the same problem with the LLM context size; setting the max context length parameter solved it. I have a working project now 👍 (Found my answer in @bastianwegge's suggestion, thanks.)
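
For anyone landing here later, the resulting setup looks roughly like the sketch below. Model names and URLs are placeholders, the 4096/8192 numbers follow this thread's fixes, and the helper names (EmbeddingFunc, openai_embed, openai_complete_if_cache) match recent lightrag-hku releases; double-check them against your installed version:

```python
import numpy as np
from lightrag import LightRAG
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

API_BASE = "http://localhost:8000/v1"  # placeholder: your OpenAI-compatible endpoint

async def embedding_func(texts: list[str]) -> np.ndarray:
    return await openai_embed(
        texts,
        model="my-embedding-model",  # placeholder deployment name
        base_url=API_BASE,
        api_key="EMPTY",
    )

async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs) -> str:
    return await openai_complete_if_cache(
        "my-chat-model",  # placeholder deployment name
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        base_url=API_BASE,
        api_key="EMPTY",
        **kwargs,
    )

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=llm_model_func,
    llm_model_max_token_size=8192,  # the "max context length" fix mentioned above
    embedding_func=EmbeddingFunc(
        embedding_dim=4096,   # must equal the model's true output dim, not the mismatched 1024
        max_token_size=8192,  # max tokens the embedding model accepts per call
        func=embedding_func,
    ),
)
```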
