
docs: update docs for sql, create, links, examples (#1571)
* sql, new create, fixed broken links and examples

* docs: improve explanation of the semantic layer

---------

Co-authored-by: Gabriele Venturi <[email protected]>
gdcsinaptik and gventuri authored Jan 31, 2025
1 parent f667367 commit b0305ef
Showing 17 changed files with 488 additions and 760 deletions.
4 changes: 2 additions & 2 deletions docs/mint.json
@@ -58,8 +58,8 @@
"version": "v3"
},
{
"group": "Data",
"pages": ["v3/data-layer", "v3/semantic-layer", "v3/data-ingestion", "v3/transformations", "v3/dataframes"],
"group": "Data layer",
"pages": ["v3/semantic-layer", "v3/semantic-layer/new", "v3/semantic-layer/views", "v3/data-ingestion", "v3/transformations"],
"version": "v3"
},
{
2 changes: 1 addition & 1 deletion docs/v3/ai-dashboards.mdx
@@ -7,7 +7,7 @@ description: 'Turn your dataframes into collaborative AI dashboards'
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

PandaAI provides a [data platform](https://app.pandabi.ai) that maximizes the power of your [semantic dataframes](/v3/dataframes).
PandaAI provides a [data platform](https://app.pandabi.ai) that maximizes the power of your [semantic dataframes](/v3/semantic-layer).
With a single line of code, you can turn your dataframes into auto-updating AI dashboards - no UI development needed.
Each dashboard comes with a pre-generated set of insights and a conversational agent that helps you and your team explore the data through natural language.

2 changes: 1 addition & 1 deletion docs/v3/chat-and-output.mdx
@@ -108,7 +108,7 @@ You can inspect the code that was generated to produce the result:

```python
response = df.chat("Calculate the correlation between age and salary")
print(response.last_code_generated)
print(response.last_code_executed)
# Output: df['age'].corr(df['salary'])
```
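
The expression reported by `last_code_executed` is ordinary pandas, so it can be checked standalone. A minimal sketch on a toy frame (the `age`/`salary` values here are made up for illustration):

```python
import pandas as pd

# Toy frame standing in for the dataset queried above (values are illustrative)
df = pd.DataFrame({
    "age": [25, 32, 41, 50],
    "salary": [40000, 52000, 63000, 70000],
})

# The same expression the agent reports via last_code_executed
correlation = df["age"].corr(df["salary"])
print(round(correlation, 3))
```

On strongly linear toy data like this, the Pearson correlation comes out close to 1.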

6 changes: 5 additions & 1 deletion docs/v3/cli.mdx
@@ -1,8 +1,12 @@
---
title: "Command Line Interface"
title: "Command line interface"
description: "Learn how to use PandaAI's command-line interface"
---

<Note title="Beta Notice">
PandaAI 3.0 is currently in beta. This documentation reflects the latest features and functionality, which may evolve before the final release.
</Note>

PandaAI comes with a command-line interface (CLI) that helps you manage your datasets and authentication.

## Authentication
211 changes: 38 additions & 173 deletions docs/v3/data-ingestion.mdx
@@ -31,7 +31,6 @@ file = pai.read_csv("data.csv")
# Use the semantic layer on CSV
df = pai.create(
path="company/sales-data",
name="sales_data",
df = file,
description="Sales data from our retail stores",
columns={
@@ -50,182 +49,48 @@ response = df.chat("Which product has the highest sales?")

## How to work with SQL in PandaAI?

PandaAI provides a sql extension for you to work with SQL, PostgreSQL, MySQL, SQLite databases.
PandaAI provides a sql extension for you to work with SQL, PostgreSQL, MySQL, and CockroachDB databases.
To make the library lightweight and easy to use, the basic installation of the library does not include this extension.
It can be easily installed using either `poetry` or `pip`.
It can be easily installed using pip with the specific database you want to use:

```bash
poetry add pandasai-sql
pip install pandasai-sql[postgres]
pip install pandasai-sql[mysql]
pip install pandasai-sql[cockroachdb]
```

```bash
pip install pandasai-sql
```

Once you have installed the extension, you can use it to connect to SQL databases.

### PostgreSQL

```yaml
name: sales_data

source:
type: postgres
connection:
host: db.example.com
port: 5432
database: analytics
user: ${DB_USER}
password: ${DB_PASSWORD}
table: sales_data

destination:
type: local
format: parquet
path: company/sales-data

columns:
- name: transaction_id
type: string
description: Unique identifier for each sale
- name: sale_date
type: datetime
description: Date and time of the sale
- name: product_id
type: string
description: Product identifier
- name: quantity
type: integer
description: Number of units sold
- name: price
type: float
description: Price per unit

transformations:
- type: convert_timezone
params:
column: sale_date
from: UTC
to: America/New_York
- type: calculate
params:
column: total_amount
formula: quantity * price

update_frequency: daily
Once you have installed the extension, you can use the [semantic data layer](/v3/semantic-layer#for-sql-databases-using-the-create-method) and perform [data transformations](/v3/transformations).

order_by:
- sale_date DESC

limit: 100000
```
### MySQL
```yaml
name: customer_data

source:
type: mysql
connection:
host: db.example.com
port: 3306
database: analytics
user: ${DB_USER}
password: ${DB_PASSWORD}
table: customers

destination:
type: local
format: parquet
path: company/customer-data

columns:
- name: customer_id
type: string
description: Unique identifier for each customer
- name: name
type: string
description: Customer's full name
- name: email
type: string
description: Customer's email address
- name: join_date
type: datetime
description: Date when customer joined
- name: total_purchases
type: integer
description: Total number of purchases made

transformations:
- type: anonymize
params:
column: email
- type: split
params:
column: name
into: [first_name, last_name]
separator: " "

update_frequency: daily

order_by:
- join_date DESC

limit: 100000
```
### SQLite
```yaml
name: inventory_data

source:
type: sqlite
connection:
database: path/to/database.db
table: inventory

destination:
type: local
format: parquet
path: company/inventory-data

columns:
- name: product_id
type: string
description: Unique identifier for each product
- name: product_name
type: string
description: Name of the product
- name: category
type: string
description: Product category
- name: stock_level
type: integer
description: Current quantity in stock
- name: last_updated
type: datetime
description: Last inventory update timestamp

transformations:
- type: categorize
params:
column: stock_level
bins: [0, 10, 50, 100, 500]
labels: ["Critical", "Low", "Medium", "High"]
- type: convert_timezone
params:
column: last_updated
from: UTC
to: America/Los_Angeles

update_frequency: hourly

order_by:
- last_updated DESC

limit: 50000
```python
sql_table = pai.create(
path="example/mysql-dataset",
description="Heart disease dataset from MySQL database",
source={
"type": "mysql",
"connection": {
"host": "database.example.com",
"port": 3306,
"user": "${DB_USER}",
"password": "${DB_PASSWORD}",
"database": "medical_data"
},
"table": "heart_data",
"columns": [
{"name": "Age", "type": "integer", "description": "Age of the patient in years"},
{"name": "Sex", "type": "string", "description": "Gender of the patient (M = male, F = female)"},
{"name": "ChestPainType", "type": "string", "description": "Type of chest pain (ATA, NAP, ASY, TA)"},
{"name": "RestingBP", "type": "integer", "description": "Resting blood pressure in mm Hg"},
{"name": "Cholesterol", "type": "integer", "description": "Serum cholesterol in mg/dl"},
{"name": "FastingBS", "type": "integer", "description": "Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)"},
{"name": "RestingECG", "type": "string", "description": "Resting electrocardiogram results (Normal, ST, LVH)"},
{"name": "MaxHR", "type": "integer", "description": "Maximum heart rate achieved"},
{"name": "ExerciseAngina", "type": "string", "description": "Exercise-induced angina (Y = yes, N = no)"},
{"name": "Oldpeak", "type": "float", "description": "ST depression induced by exercise relative to rest"},
{"name": "ST_Slope", "type": "string", "description": "Slope of the peak exercise ST segment (Up, Flat, Down)"},
{"name": "HeartDisease", "type": "integer", "description": "Heart disease diagnosis (1 = present, 0 = absent)"}
]
}
)
```
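
The `${DB_USER}` and `${DB_PASSWORD}` entries in the connection block are environment-variable placeholders. A standalone sketch of how that style of placeholder is conventionally resolved (illustrative only — this is not PandaAI's internal implementation, and the credential values are made up):

```python
import os
from string import Template

# Connection settings as written in the example above, with env-style placeholders
connection = {
    "host": "database.example.com",
    "port": 3306,
    "user": "${DB_USER}",
    "password": "${DB_PASSWORD}",
    "database": "medical_data",
}

# Supply credentials via the environment (hypothetical values for illustration)
os.environ["DB_USER"] = "analyst"
os.environ["DB_PASSWORD"] = "s3cret"

# Expand ${VAR} placeholders in string values against the environment
resolved = {
    key: Template(value).substitute(os.environ) if isinstance(value, str) else value
    for key, value in connection.items()
}
print(resolved["user"])
```

Keeping credentials out of the config itself and injecting them at runtime is the point of the placeholder pattern: the same dataset definition can be committed to version control safely.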

## How to work with Enterprise Cloud Data in PandaAI?
@@ -590,8 +455,8 @@ limit: 100000
</tr>
<tr>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}>pandasai_sql</td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>poetry add pandasai-sql</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>pip install pandasai-sql</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>poetry add pandasai-sql[postgres]</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>pip install pandasai-sql[postgres]</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}>No</td>
</tr>
<tr>
20 changes: 0 additions & 20 deletions docs/v3/data-layer.mdx

This file was deleted.

44 changes: 0 additions & 44 deletions docs/v3/dataframes.mdx

This file was deleted.

