Commit 6d303dd

Add files via upload
1 parent 98320e3 commit 6d303dd

10 files changed, +1468 -0 lines changed

README.md

+173
@@ -0,0 +1,173 @@
# DateGraphX (Learning Edition)

[English](#english) | [中文](#chinese)

> ⚠️ **Note**: This is a learning edition. For commercial use, please contact us for customized solutions!
>
> ⚠️ **注意**: 这是学习版本。商业用途请联系我们定制解决方案!

<a name="english"></a>
## 🌟 DateGraphX

An intelligent document analysis system that combines LangChain, the Neo4j graph database, and large language models into a knowledge graph-based RAG (Retrieval-Augmented Generation) application.

### 🖼️ Project Demo

#### Q&A System Interface
![Q&A System](qa.jpg)

#### Knowledge Graph Visualization
![Knowledge Graph](kg.jpg)

### 🚀 Features

- 📊 Automatic Knowledge Graph Construction
  - PDF document processing and analysis
  - Intelligent text segmentation
  - Relationship extraction
  - Interactive graph visualization

- 🤖 Natural Language Q&A
  - Context-aware responses
  - Knowledge graph-based retrieval
  - Multi-LLM support (DeepSeek, OpenAI)
  - Real-time graph exploration
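
As a rough illustration of what "knowledge graph-based retrieval" can mean, the sketch below keeps extracted relationships as (subject, relation, object) triples in memory and collects the facts that mention an entity named in the question. It is a toy under stated assumptions, not DateGraphX's implementation: the triples, the `retrieve_facts` helper, and the substring-based entity matching are all hypothetical, and the real project stores its graph in Neo4j.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object); hypothetical shape

# Hypothetical triples, standing in for relationships extracted from a PDF.
TRIPLES: List[Triple] = [
    ("LangChain", "integrates_with", "Neo4j"),
    ("Neo4j", "is_a", "graph database"),
    ("DeepSeek", "is_a", "LLM provider"),
]


def retrieve_facts(question: str, triples: List[Triple]) -> List[str]:
    """Return one sentence per triple whose subject or object appears in the question."""
    q = question.lower()
    facts = []
    for subj, rel, obj in triples:
        # Naive matching: case-insensitive substring test on both endpoints.
        if subj.lower() in q or obj.lower() in q:
            facts.append(f"{subj} {rel.replace('_', ' ')} {obj}")
    return facts


facts = retrieve_facts("What is Neo4j?", TRIPLES)
# The retrieved facts would be passed to the LLM as context for its answer.
prompt = "Context:\n" + "\n".join(facts) + "\n\nQuestion: What is Neo4j?"
```

In a real pipeline the lookup would be a graph query and the prompt would go to the configured LLM; the point here is only that retrieval narrows the context to facts connected to the question.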

### 📦 Project Structure
```
DateGraphX/
├── app.py                      # Main application file
├── api_utils.py                # API utilities
├── config.py                   # Configuration settings
├── data_persistence_utils.py   # Data persistence helpers
├── knowledge_graph_utils.py    # Knowledge graph functions
├── requirements.txt            # Project dependencies
├── cache/                      # Cache directory
├── logo.png                    # Project logo
├── kg.jpg                      # Knowledge graph demo
└── qa.jpg                      # Q&A interface demo
```

### 🔧 Installation

1. Clone the repository:
```bash
git clone https://github.com/adoresever/DateGraphX.git
cd DateGraphX
```

2. Create and activate a conda environment:
```bash
conda create -n datagraphx python=3.10
conda activate datagraphx
```

3. Install the dependencies:
```bash
pip install -r requirements.txt
```

4. Start the application:
```bash
streamlit run app.py
```

### 🛠️ Requirements

- Python 3.10+
- Neo4j Database Server
- DeepSeek/OpenAI API access
- CUDA-compatible GPU (recommended)
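
Before starting the app, it can help to check these requirements with a short script. The following is a minimal sketch using only the standard library. It assumes Neo4j runs locally on its default Bolt port (7687) and that API keys are supplied through environment variables; the variable names are hypothetical, since the application may collect keys another way.

```python
import os
import socket
import sys
from typing import List, Sequence, Tuple


def python_ok(version: Sequence[int] = sys.version_info,
              minimum: Tuple[int, int] = (3, 10)) -> bool:
    """True if the interpreter version meets the Python 3.10+ requirement."""
    return tuple(version[:2]) >= minimum


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_env(names: List[str]) -> List[str]:
    """Return the names that are unset or empty in the environment."""
    return [n for n in names if not os.environ.get(n)]


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    # Assumption: Neo4j runs locally on its default Bolt port, 7687.
    print("Neo4j reachable:", port_open("localhost", 7687))
    # Assumption: hypothetical variable names, used here for illustration only.
    print("Missing keys:", missing_env(["DEEPSEEK_API_KEY", "OPENAI_API_KEY"]))
```

A reachable port only shows that something is listening; it does not verify Neo4j credentials or the database version.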

---

<a name="chinese"></a>
## 🌟 DateGraphX 学习版

一个智能文档分析系统,结合了 LangChain、Neo4j 图数据库和大型语言模型,创建了一个基于知识图谱的 RAG(检索增强生成)应用。

### 🖼️ 项目展示

#### 知识图谱可视化
![知识图谱](kg.jpg)

#### 问答系统界面
![问答系统](qa.jpg)

### 🚀 功能特点

- 📊 自动知识图谱构建
  - PDF文档处理与分析
  - 智能文本分段
  - 关系抽取
  - 交互式图谱可视化

- 🤖 自然语言问答
  - 上下文感知响应
  - 基于知识图谱的检索
  - 多LLM支持(DeepSeek、OpenAI)
  - 实时图谱探索

### 📦 项目结构
```
DateGraphX/
├── app.py                      # 主应用程序文件
├── api_utils.py                # API工具
├── config.py                   # 配置设置
├── data_persistence_utils.py   # 数据持久化助手
├── knowledge_graph_utils.py    # 知识图谱功能
├── requirements.txt            # 项目依赖
├── cache/                      # 缓存目录
├── logo.png                    # 项目标志
├── kg.jpg                      # 知识图谱演示
└── qa.jpg                      # 问答界面演示
```

### 🔧 安装步骤

1. 克隆仓库:
```bash
git clone https://github.com/adoresever/DateGraphX.git
cd DateGraphX
```

2. 创建并激活conda环境:
```bash
conda create -n datagraphx python=3.10
conda activate datagraphx
```

3. 安装依赖:
```bash
pip install -r requirements.txt
```

4. 启动应用:
```bash
streamlit run app.py
```

### 🛠️ 环境要求

- Python 3.10+
- Neo4j 数据库服务器
- DeepSeek/OpenAI API 访问权限
- CUDA兼容GPU(推荐)

## 👥 Author

**王宇** (Yu Wang) - [[email protected]](mailto:[email protected])

## 📝 Acknowledgments

The knowledge graph component of this project draws on [LightRAG](https://github.com/HKUDS/LightRAG).

## 📄 License

CC BY-NC-SA 4.0 - see the [LICENSE](LICENSE) file for details.

---

> 🔒 **Commercial Customization**
>
> For a commercial version or custom development, please contact: [[email protected]](mailto:[email protected])

api_utils.py

+70
@@ -0,0 +1,70 @@
# api_utils.py

from typing import List, Tuple
import requests
from openai import OpenAI
from langchain_community.embeddings import OpenAIEmbeddings
from langchain.embeddings.base import Embeddings
from config import API_CONFIG, EMBEDDING_CONFIG

class LocalEmbeddings(Embeddings):
    """Embeddings client for a local server that returns OpenAI-format responses."""

    def __init__(self, base_url: str, model: str):
        self.base_url = base_url
        self.model = model

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Embed each text with its own request, preserving input order.
        url = f"{self.base_url}/embeddings"
        embeddings = []
        for text in texts:
            response = requests.post(url, json={"input": text, "model": self.model})
            response.raise_for_status()  # surface HTTP errors instead of a KeyError below
            embeddings.append(response.json()["data"][0]["embedding"])
        return embeddings

    def embed_query(self, text: str) -> List[float]:
        url = f"{self.base_url}/embeddings"
        response = requests.post(url, json={"input": text, "model": self.model})
        response.raise_for_status()  # surface HTTP errors instead of a KeyError below
        return response.json()["data"][0]["embedding"]

def clean_api_response(response: str, api_type: str) -> str:
    """Clean up the API response."""
    if api_type == "DeepSeek":
        # Strip the end-of-sentence marker if it appears in DeepSeek output.
        return response.replace("<|end▁of▁sentence|>", "").strip()
    return response.strip()

def test_api_connection(api_type: str, api_key: str, model_name: str) -> Tuple[bool, str]:
    """Test the API connection; return (success, reply text or error message)."""
    try:
        base_url = API_CONFIG[api_type.lower()]['base_url']
        client = OpenAI(api_key=api_key, base_url=base_url)

        # Send a tiny chat request to confirm the key, endpoint, and model work.
        response = client.chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": "Hi!"}],
            max_tokens=10
        )
        raw_response = response.choices[0].message.content
        cleaned_response = clean_api_response(raw_response, api_type)
        return True, cleaned_response
    except Exception as e:
        return False, str(e)

def test_embeddings(embed_type: str, api_key: str | None = None, base_url: str | None = None, model: str | None = None) -> Tuple[bool, str]:
    """Test the embedding model; return (success, status or error message)."""
    try:
        if embed_type == "本地":  # "本地" means "local"; kept as-is because it is a runtime value
            embeddings = LocalEmbeddings(
                base_url=base_url or EMBEDDING_CONFIG['local']['base_url'],
                model=model or EMBEDDING_CONFIG['local']['model']
            )
        else:
            embeddings = OpenAIEmbeddings(api_key=api_key)

        test_embedding = embeddings.embed_query("test")
        return True, f"Embedding generated successfully, dimension: {len(test_embedding)}"
    except Exception as e:
        return False, str(e)

def get_api_client(api_type: str, api_key: str, model_name: str) -> OpenAI:
    """Get an API client for the given provider."""
    # Note: model_name is not used when building the client.
    base_url = API_CONFIG[api_type.lower()]['base_url']
    return OpenAI(api_key=api_key, base_url=base_url)
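
`api_utils.py` imports `API_CONFIG` and `EMBEDDING_CONFIG` from `config.py`, which is not shown in this part of the diff. Judging only from the lookups above (`API_CONFIG[api_type.lower()]['base_url']`, `EMBEDDING_CONFIG['local']['base_url']`, `EMBEDDING_CONFIG['local']['model']`), the dictionaries might look roughly like the sketch below. All values are assumptions for illustration: the provider URLs are their publicly documented endpoints, and the local embedding server address and model name are placeholders. `resolve_base_url` is a hypothetical helper that turns an unknown provider into a readable error rather than a bare `KeyError`.

```python
from typing import Dict

# Assumed shape, inferred from how api_utils.py indexes these dictionaries.
API_CONFIG: Dict[str, Dict[str, str]] = {
    "deepseek": {"base_url": "https://api.deepseek.com"},
    "openai": {"base_url": "https://api.openai.com/v1"},
}

EMBEDDING_CONFIG: Dict[str, Dict[str, str]] = {
    # Placeholder values for a local embedding server.
    "local": {"base_url": "http://localhost:8000/v1", "model": "my-embedding-model"},
}


def resolve_base_url(api_type: str) -> str:
    """Look up a provider's base URL, case-insensitively (hypothetical helper)."""
    key = api_type.lower()
    if key not in API_CONFIG:
        known = ", ".join(sorted(API_CONFIG))
        raise ValueError(f"Unknown API type {api_type!r}; expected one of: {known}")
    return API_CONFIG[key]["base_url"]
```

Because the keys are lowercase and the lookup lowercases its argument, callers can pass "DeepSeek" or "OpenAI" as displayed in a UI.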
