Insights AI is a local Streamlit application that leverages a fine-tuned LLM (via LM Studio) and PostgreSQL to translate natural-language problem statements into safe, efficient, and executable SQL queries. You can get results from your data directly through natural-language input without writing SQL yourself. The app follows core system-prompting principles to ensure accuracy and performance, and it can display Plotly-based graphs and plots when needed.
- Natural-Language to SQL: Converts user prompts into pure, single-line PostgreSQL queries without extraneous labels or explanations
- No internet access needed: All inference runs locally, ensuring data privacy
- Schema-Aware Generation: Embeds table and column definitions in the prompt for context
- Query Safety: Enforces rules to prevent destructive commands (no `DROP`, `DELETE`) and uses `ILIKE` for case-insensitive matching
- Automated Formatting: Post-processes LLM output to correct spacing, restore string literals, and append wildcards (see the sketch after this list)
- Interactive UI: Streamlit-based chat interface with data preview and Plotly-powered visualizations (Bar, Line, Scatter, Pie)
- Modular Design: Clear separation of database connectivity, prompt generation, SQL cleaning, and UI components
- Pluggable database backend (PostgreSQL out-of-the-box)
- Easily configurable system prompt and schema context
- LM Studio integration with local inference
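For a sense of what the safety and formatting rules look like in practice, here is a minimal sketch (the function name `clean_sql`, the keyword list, and the regexes are illustrative, not the project's exact implementation):

```python
import re

FORBIDDEN = ("DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE", "INSERT")

def clean_sql(raw: str) -> str:
    """Illustrative post-processor for raw LLM output."""
    sql = raw.strip().strip("`")   # strip stray markdown backticks
    sql = " ".join(sql.split())    # collapse to a single line
    if not sql.endswith(";"):
        sql += ";"
    # Reject destructive statements outright
    if any(re.search(rf"\b{kw}\b", sql, re.IGNORECASE) for kw in FORBIDDEN):
        raise ValueError(f"Refusing to run a destructive statement: {sql}")
    # Prefer case-insensitive matching for string comparisons
    sql = re.sub(r"(?i)\bLIKE\b", "ILIKE", sql)
    return sql
```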
```mermaid
flowchart LR
    A[Streamlit UI] -->|1. User Input| B[Promptizer];
    B -->|2. Prompt + Schema| C[LM Studio Completion];
    C -->|3. Raw SQL Output| D[SQL Cleaner];
    D --> E[DB Exec];
    E --> F[Results];
```
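In code, that pipeline is a short chain of calls (a sketch; `build_prompt` is a hypothetical name, while `generate_sql_query` and the cleaner are shown elsewhere in this README):

```python
import pandas as pd

def answer(question: str, conn) -> pd.DataFrame:
    prompt = build_prompt(question)       # B: prompt + schema context
    raw_sql = generate_sql_query(prompt)  # C: LM Studio completion
    sql = clean_sql(raw_sql)              # D: post-processing and safety checks
    return pd.read_sql_query(sql, conn)   # E/F: execute and return results
```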
- Python 3.8 or higher
- PostgreSQL (local instance); make sure your data is present in a Postgres schema
- LM Studio running locally with a compatible model (e.g., `dolphin3.0-llama3.1-8b`)
- Clone the repository:
```bash
git clone https://github.com/gaurisharan/insights-local-ai.git
cd insights-local-ai
```
- Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
Update `DB_CONFIG` in `app.py` (or your main script) with your PostgreSQL credentials:
```python
DB_CONFIG = {
    "dbname": "postgres",        # Your database name
    "user": "username",          # Your PostgreSQL username
    "password": "yourpassword",  # Your PostgreSQL password
    "host": "localhost",         # Usually 'localhost'
    "port": "5432",              # Default PostgreSQL port
}
```
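With those credentials in place, connecting and running a query can be as simple as this sketch (using `psycopg2` and `pandas`; the project may wrap this differently):

```python
import pandas as pd
import psycopg2

def run_query(sql: str) -> pd.DataFrame:
    """Open a connection from DB_CONFIG, run one query, return a DataFrame."""
    with psycopg2.connect(**DB_CONFIG) as conn:
        return pd.read_sql_query(sql, conn)
```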
- Ensure `SYSTEM_PROMPT` and `SCHEMA_INFO` reflect your current database schema.
- Configure LM Studio:
In the Developer tab on the left panel, open "Select a model to load" in the top bar, choose your preferred model, and wait for it to load. Then toggle the status switch at the top to "Running" and use the generated API endpoint in your code.
- Confirm the LM Studio API endpoint and model in the `generate_sql_query` function:
url = "http://localhost:1234/v1/completions" # LM Studio API URL
payload = {
"model": "dolphin3.0-llama3.1-8b", # Replace with your preferred model
"prompt": prompt,
"max_tokens": 256,
"temperature": 0.2
}
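For reference, the surrounding request logic might look like this (a minimal sketch using `requests` against LM Studio's OpenAI-compatible completions endpoint; the error handling is illustrative):

```python
import requests

def generate_sql_query(prompt: str) -> str:
    url = "http://localhost:1234/v1/completions"
    payload = {
        "model": "dolphin3.0-llama3.1-8b",
        "prompt": prompt,
        "max_tokens": 256,
        "temperature": 0.2,
    }
    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    # The completions endpoint returns choices with a "text" field
    return response.json()["choices"][0]["text"].strip()
```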
- Launch the Streamlit app:
```bash
streamlit run app.py
```
- In your browser, enter a natural-language request (for example: "Show me all transactions after 2024-01-01 for ICICI Bank")
- Inspect the generated SQL, view results in a table, and optionally create charts
User: List distinct name and count of transactions grouped by date where amount > 1000
System: Generates and executes a query such as:
```sql
SELECT txn_date, COUNT(*) FROM transactions WHERE txn_amount > 1000 GROUP BY txn_date;
```
Output: DataFrame with counts per date and a corresponding Plotly bar chart
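Rendering that output in Streamlit takes only a few lines (a sketch; `df` is the result DataFrame, and the column names assume the query above, where PostgreSQL names the `COUNT(*)` column `count`):

```python
import plotly.express as px
import streamlit as st

st.dataframe(df)  # tabular preview of the query results
fig = px.bar(df, x="txn_date", y="count", title="Transactions over 1000 by date")
st.plotly_chart(fig)
```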
- `app.py`: Main Streamlit script; handles UI, orchestration, and plotting
- `requirements.txt`: Python dependencies
To incorporate Retrieval-Augmented Generation:
- Store embeddings of SQL documentation or sample queries in a vector store
- On each prompt, retrieve the top-k relevant documents and include them in the LM prompt
- Adjust `generate_sql_query` to merge retrieval context with schema info
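A minimal sketch of that retrieval step, assuming `sentence-transformers` for embeddings and a plain in-memory corpus (`SQL_DOCS` and `retrieve_context` are hypothetical names):

```python
from sentence_transformers import SentenceTransformer, util

SQL_DOCS = [  # hypothetical corpus: documentation snippets or sample queries
    "Use ILIKE for case-insensitive text matching in PostgreSQL.",
    "GROUP BY aggregates rows that share the same value in the listed columns.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(SQL_DOCS, convert_to_tensor=True)

def retrieve_context(question: str, k: int = 3) -> str:
    """Return the top-k most similar snippets, joined for inclusion in the LM prompt."""
    query_embedding = model.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, doc_embeddings, top_k=k)[0]
    return "\n".join(SQL_DOCS[hit["corpus_id"]] for hit in hits)
```

`generate_sql_query` would then prepend `retrieve_context(question)` to the schema context before calling the model.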
Contributions are welcome via pull requests; feel free to open issues for feature requests or bugs.