[Bug]: <title>value too long for type character varying(255) #1088

jasperchen01 · 2025-03-14T06:46:23Z

Do you need to file an issue?

I have searched the existing issues and this bug is not already filed.
I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

I'm using the main branch with the demo code below:

import asyncio
import logging
import os
import time
from dotenv import load_dotenv

from lightrag import LightRAG, QueryParam
from lightrag.llm.zhipu import zhipu_complete
from lightrag.llm.ollama import ollama_embedding
from lightrag.llm.openai import openai_embed, openai_complete_if_cache
from lightrag.utils import EmbeddingFunc
from lightrag.kg.shared_storage import initialize_pipeline_status
import numpy as np


CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
WORKING_DIR = f"{CURRENT_DIR}/lightrag_data"

FILE_PATH = f"{CURRENT_DIR}/../../data_dir/marker_output/2305_15323v1.md"
FILE_NAME = os.path.basename(FILE_PATH)

logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.DEBUG)

if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

# PG
os.environ["AGE_GRAPH_NAME"] = "dickens"
os.environ["POSTGRES_HOST"] = "localhost"
os.environ["POSTGRES_PORT"] = "5432"
os.environ["POSTGRES_USER"] = "postgres"
os.environ["POSTGRES_PASSWORD"] = ""
os.environ["POSTGRES_DATABASE"] = "lightrag"

# neo4j
os.environ["NEO4J_URI"] = "neo4j://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "admin123"

async def _llm_model_func(
    prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs
) -> str:
    return await openai_complete_if_cache(
        model="qwen-plus-latest",
        prompt=prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key="***",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
        **kwargs
    )

async def _embedding_func(texts: list[str]) -> np.ndarray:
    return await openai_embed(
        texts,
        model="text-embedding-v3",
        api_key="***",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
    )

async def initialize_rag():
    rag = LightRAG(
        namespace_prefix=FILE_NAME,
        working_dir=WORKING_DIR,
        llm_model_func=_llm_model_func,
        llm_model_max_async=4,
        llm_model_max_token_size=32768,
        enable_llm_cache_for_entity_extract=True,
        embedding_func=EmbeddingFunc(
            embedding_dim=1024,
            max_token_size=8192,
            func=_embedding_func,
        ),
        embedding_batch_num=10,
        embedding_func_max_async=10,
        embedding_cache_config={
            "enabled": "true",
            "similarity_threshold": 0.95,
            "use_llm_check": False,
        },
        kv_storage="PGKVStorage",
        doc_status_storage="PGDocStatusStorage",
        graph_storage="Neo4JStorage",
        vector_storage="PGVectorStorage",
        auto_manage_storages_states=False,
        # llm_model_kwargs={
        #     "response_format": {"type": "json_object"},
        #     "extra_body": {"enable_search": True}
        #     },
        addon_params={
            "language": "Chinese"
        },
    )

    await rag.initialize_storages()
    await initialize_pipeline_status()

    return rag


async def main():
    # Initialize RAG instance
    rag = await initialize_rag()


    # Re-attach embedding_func to the graph storage; it was removed in commit 5661d76860436f7bf5aef2e50d9ee4a59660146c
    rag.chunk_entity_relation_graph.embedding_func = rag.embedding_func

    with open(FILE_PATH, "r", encoding="utf-8") as f:
        await rag.ainsert(f.read(), ids=[FILE_NAME])


if __name__ == "__main__":
    asyncio.run(main())

I get the following errors:

error:value too long for type character varying(255)
Failed to extract entities and relationships
Failed to process document doc-6b187f963bb8be55d3cc73c6faf9f7db: value too long for type character varying(255)

Steps to reproduce

No response

Expected Behavior

No response

LightRAG Config Used

Paste your config here

Logs and screenshots

No response

Additional Information

LightRAG Version:
Operating System:
Python Version:
Related Issues:


JoramMillenaar · 2025-03-14T17:29:21Z

I ran into a similar issue. A lot of my data revolves around a single entity, and LightRAG appends a new description to an entity every time the LLM recognizes it in the data. So if that entity is detected in many parts of your data, the descriptions keep accumulating until Postgres rejects the value, because the column only allows 255 characters.
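To illustrate the growth described above, here is a minimal sketch. It assumes the accumulated values are joined with a `<SEP>` separator and written to a `VARCHAR(255)` column. The helper `first_overflow` and the sample fragments are hypothetical and are not LightRAG's actual merge code.

```python
# Hypothetical illustration, not LightRAG's actual merge code.
SEPARATOR = "<SEP>"   # assumed separator between accumulated values
COLUMN_LIMIT = 255    # character limit of a VARCHAR(255) column


def first_overflow(fragments, limit=COLUMN_LIMIT, sep=SEPARATOR):
    """Return the 1-based count of fragments at which the joined string
    first exceeds `limit` characters, or None if it never does."""
    merged = ""
    for count, fragment in enumerate(fragments, start=1):
        merged = fragment if not merged else merged + sep + fragment
        if len(merged) > limit:
            return count
    return None


# Each detection of the same entity adds one more 40-character fragment.
fragments = ["x" * 40 for _ in range(10)]
overflow_at = first_overflow(fragments)
print(overflow_at)
```

With short fragments the limit is reached after only a handful of detections, which fits the observation that documents centred on one entity hit the error quickly.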

We should probably look into fixing this. But if you're looking for a quick fix, you could change the column from VARCHAR to TEXT and rebuild your database.
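The quick fix could be sketched as follows. The table and column names used here (`lightrag_vdb_entity`, `chunk_id`) are assumptions, so check your own schema for the column that actually raises the error. The hypothetical helper `alter_column_to_text` only builds the SQL string; running it with your Postgres client is left to you.

```python
import re

# Only plain, unquoted identifiers are accepted, to avoid building unsafe SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def alter_column_to_text(table: str, column: str) -> str:
    """Build an ALTER TABLE statement that changes `column` to TEXT."""
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ALTER COLUMN {column} TYPE TEXT;"


# Assumed names; adjust to the table and column that raise the error.
statement = alter_column_to_text("lightrag_vdb_entity", "chunk_id")
print(statement)
```

Altering the column in place avoids re-ingesting documents, but if the table definitions in your LightRAG version have changed since the database was created, rebuilding the database as suggested above is the safer route.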

bzImage · 2025-03-20T20:00:55Z

After doing a git pull, recreating the Postgres database, and processing a batch of input files, I get:

error:value too long for type character varying(255)
Failed to extract entities and relationships
Failed to process document EW2401-004: value too long for type character varying(255)

JoramMillenaar · 2025-03-20T23:44:16Z

I looked through the code and tested it again, and it works for me: values longer than 255 characters are stored.
Try rebuilding your graph with a fresh database.

PR #1120, which was merged recently, changed the field again (to a VARCHAR(255) array), which might be why you're experiencing issues.

jasperchen01 added the bug label on Mar 14, 2025

JoramMillenaar mentioned this issue on Mar 14, 2025: Updated PSQL's chunk_id field to be a TEXT field #1091 (merged)

LarFii mentioned this issue on Mar 20, 2025: Resolved issue with upsert of PG entity and relation #1127 (closed)
