Installation Instructions modified #91

Merged
merged 1 commit into from
Dec 11, 2024
24 changes: 8 additions & 16 deletions README.md
@@ -17,15 +17,15 @@
### 🚌 What's DataJourney?
DataJourney demonstrates how organizations can effectively manage and utilize data by harnessing the power of open-source technologies. It's designed to help navigate the complex landscape of data tools, offering a structured approach to building **scalable** and **reproducible** data workflows.

Built on open-source principles, the framework guides users through essential steps, from **identifying** goals and **selecting tools** to **testing** and **customising** workflows. With its flexible, modular design, you can tailor DataJourney to individual needs, making it an invaluable toolkit for data professionals.

### 🧱 Design Philosophy (LEGO)
Built with additive and subtractive capabilities, glued together with open source.
Each layer has a certain strength of communication built in.

- P0 (Base): Static home(s) to keep it together `(GitHub)`
- P1 (Tooling): Tooling, strings `(Powered by open source)`
- P2 (Maintenance + Monitoring): Environment and automation `(Pixi + GHA)`
- P3 (Abstraction): Layer(s), CLI/task manager for users to interact with `(Pixi)`


@@ -39,49 +39,41 @@
✅ `GitHub actions` configured\
✅ `Vale.sh` configured at PR level\
✅ `Pre-commit hooks` configured for code linting/formatting\
✅ Environment management via [pixi](https://prefix.dev/)\
✅ Reading data from online sources using [intake](https://github.com/intake/intake)\
✅ Sample pipeline built using [Dagster](https://github.com/dagster-io/dagster)\
✅ Building Dashboard using [holoviews](https://holoviews.org/gallery/index.html) + [panel](https://panel.holoviz.org/reference/index.html)\
✅ Exploratory data analysis (EDA) using [mito](https://www.trymito.io/)\
✅ Web UI built on [Flask](https://flask.palletsprojects.com/en/3.0.x/)\
✅ Web UI re-done and expanded with [FastHTML](https://docs.fastht.ml/)\
✅ Leverage AI models to analyse data using [GitHub AI models Beta](https://docs.github.com/en/github-models/prototyping-with-ai-models)

### 📊 Repository stats
### Quickly getting started with DataJourney

⚙️ Managed by GitHub Action: https://github.com/jgehrcke/github-repo-stats \
⏳ Configured to run daily at 23:55:00 IST\
📬 Check out the daily reports: [DataJourney Stats on Web](https://sayantikabanik.github.io/DataJourney/)
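
GitHub Actions evaluates `schedule` cron expressions in UTC, so a daily 23:55 IST run has to be written as a UTC time. The sketch below works through that conversion; the helper name `ist_to_utc_cron` is hypothetical, and the workflow's actual cron line is not shown in this README, so treat the result as an illustration rather than the configured value.

```shell
# Convert an IST wall-clock time (hour minute, written without leading zeros)
# to a five-field cron expression in UTC. IST is UTC+5:30, i.e. 330 minutes ahead.
ist_to_utc_cron() {
    ist_hour=$1
    ist_minute=$2
    # Shift back by 330 minutes and wrap around midnight.
    total=$(( (ist_hour * 60 + ist_minute - 330 + 1440) % 1440 ))
    echo "$(( total % 60 )) $(( total / 60 )) * * *"
}

ist_to_utc_cron 23 55   # prints: 25 18 * * *
```

Times earlier than 05:30 IST wrap to the previous UTC day, which does not matter for a daily schedule.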

### Environment setup using pixi:

#### Installing pixi & getting started
- Clone DJ: `git@github.com:sayantikabanik/DataJourney.git`
- Switch directory: `cd DataJourney`
- Download pixi: [prefix.dev](https://prefix.dev/)
- Activate env: `pixi shell`
- Install the DJ framework locally: `pixi run DJ_package`
- List all the tasks: `pixi task list`
- Execute a task from the list: `pixi run <TASK>`
- Execute a task with verbosity enabled: `pixi run -v <TASK>`
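
The steps above can be sketched as a single shell function. This is a minimal sketch, not part of the repository: `have_cmd` and `dj_setup` are hypothetical helper names, and the HTTPS clone URL is an assumption (swap in the SSH form if you use SSH keys). Nothing runs until you call `dj_setup` yourself.

```shell
# Return 0 if the named command is on PATH, non-zero otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Clone the repository if it is missing, install the DJ package, list tasks.
# The default HTTPS URL is an assumption; pass another URL as the first argument.
dj_setup() {
    repo_url="${1:-https://github.com/sayantikabanik/DataJourney.git}"
    for cmd in git pixi; do
        if ! have_cmd "$cmd"; then
            echo "missing required command: $cmd" >&2
            return 1
        fi
    done
    [ -d DataJourney ] || git clone "$repo_url" DataJourney || return 1
    cd DataJourney || return 1
    pixi run DJ_package || return 1
    pixi task list
}
```

Call `dj_setup` from the directory where the clone should live. It changes into `DataJourney` in the current shell, so later `pixi run <TASK>` calls work without another `cd`.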

# current tasks present under DJ
### Active `tasks` under DJ

- GIT_TOKEN_CHECK
- DJ_package
- DJ_pre_commit
- DJ_dagster
- DJ_fasthtml_app
- DJ_flask_app
- DJ_mito_app
- DJ_panel_app
- DJ_llm_analysis
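
If you script around these tasks, a small guard can catch typos before `pixi` is invoked. The sketch below is an assumption-laden helper, not something shipped with DJ: `is_dj_task` and `run_dj_task` are hypothetical names, and the task names are copied by hand from the list above, so `pixi task list` remains the source of truth.

```shell
# Task names copied from the list above (update when the list changes).
DJ_TASKS="GIT_TOKEN_CHECK DJ_package DJ_pre_commit DJ_dagster DJ_fasthtml_app DJ_flask_app DJ_mito_app DJ_panel_app DJ_llm_analysis"

# Return 0 if the argument is one of the known task names.
is_dj_task() {
    for task in $DJ_TASKS; do
        [ "$task" = "$1" ] && return 0
    done
    return 1
}

# Run a task through pixi, refusing unknown names with exit status 2.
run_dj_task() {
    if ! is_dj_task "$1"; then
        echo "unknown task: $1" >&2
        return 2
    fi
    pixi run "$1"
}
```

For example, `run_dj_task DJ_dagster` forwards to `pixi run DJ_dagster`, while `run_dj_task DJ_dagstr` fails fast with a message on stderr.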


#### Install the package locally
```shell
pixi run DJ_package
```

#### 🔌 About pre-commit-hooks and activating
As the name suggests, pre-commit hooks format the code according to PEP standards before each commit. [More details](https://pre-commit.com/)

```shell
pixi run DJ_pre_commit
@@ -94,12 +86,12 @@
pixi run DJ_llm_analysis
```

#### Dagster UI

```shell
pixi run DJ_dagster
```
![Dagit UI output](assets/pipeline/dagster_ui.png)

#### Panel app
```shell
@@ -107,11 +99,11 @@
```

*NOTE:*
DataJourney exports the generated dashboard to HTML and saves it as [stock_price_twilio_dashboard](analytics_framework%2Fdashboard%2Fstock_price_twilio_dashboard.html)

![Panel app output](assets/dashboard/panel_app_stock.png)

#### Mito

To explore further, visit [trymito.io](https://docs.trymito.io/)
```shell
@@ -121,8 +113,8 @@
```
[//]: # (![mito output]&#40;assets/pipeline/mito_graph.png "Graph generated via mitosheet"&#41; ![mito output operation]&#40;assets/pipeline/mito_operations.png "Operations performed via mitosheet"&#41;)

<div style="display: flex; justify-content: space-between;">
<img src="assets/pipeline/mito_graph.png" alt="mito_output" width="400"/>
<img src="assets/pipeline/mito_operations.png" alt="mito_output" width="400"/>
</div>

#### Display all data sources present via web UI