Merged
23 changes: 23 additions & 0 deletions .cursor/logging.mdc
@@ -0,0 +1,23 @@
---
description: Logging should be actionable, structured, and follow professional practices
alwaysApply: true
---

- Never log and raise at the same time. Let logs occur where the exception is handled.
- Use `.bind()` to attach contextual fields like `job_id`, `trainer`, `dataset`, etc.
- Propagate the bound logger across modules instead of re-creating it.
- Use semantic log levels:
  - `debug`: Diagnostic/internal details, not printed in prod
  - `info`: Expected events (training started, dataset loaded, etc.)
  - `warning`: Unexpected but handled (e.g. missing columns)
  - `error`: Real failures that may abort a process
- Avoid logs like `"Starting function"` — instead log **what happened**, with context.
- Avoid excessive logging in loops. Log summary or batch progress only.
- Enable JSON structured logs only in production environments.
- Keep logs relevant to external observability, not internal dev experimentation.
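
A minimal sketch of three of the rules above: raise without logging, log once where the exception is handled, and emit one summary entry instead of one per loop iteration. To stay dependency-free it uses the standard library's `logging.LoggerAdapter` as a stand-in for a bound logger; it is not the project's `.bind()` setup. `DatasetError`, `load_rows`, and `run_job` are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger("training")


class DatasetError(Exception):
    """Hypothetical domain error used only for this illustration."""


def load_rows(rows: list[dict[str, int]]) -> list[int]:
    # Raise without logging here; the caller that handles the error logs it once.
    values: list[int] = []
    for row in rows:
        if "value" not in row:
            raise DatasetError("row is missing the 'value' column")
        values.append(row["value"])
    return values


def run_job(job_id: str, rows: list[dict[str, int]]) -> bool:
    # Attach context once and reuse it (stdlib stand-in for a bound logger).
    job_log = logging.LoggerAdapter(logger, {"job_id": job_id})
    try:
        values = load_rows(rows)
    except DatasetError as exc:
        # Single log entry at the point where the exception is handled.
        job_log.error("dataset load failed: %s", exc)
        return False
    # One summary entry instead of one entry per row.
    job_log.info("dataset loaded: %d rows", len(values))
    return True
```

Each call to `run_job` produces exactly one log record carrying `job_id`, whether the job succeeds or fails.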

**Checklist for evaluating a log:**
- Does it convey a real event, not just function entry?
- Is it at the correct level (not everything is warning)?
- Is the logger properly bound with job/task context?
- Can this log help diagnose or monitor the system?
19 changes: 19 additions & 0 deletions .cursor/modular-design-principles.mdc
@@ -0,0 +1,19 @@
---
description: Enforce modular design with clear responsibility boundaries
alwaysApply: true
---

- Follow **Single Responsibility Principle (SRP)**: one reason to change per module/class/function.
- Avoid **over-engineering**: modularize only when it improves clarity, reuse or testability.
- Group related logic in packages: `data/`, `services/`, `utils/`, `models/`, etc.
- Keep modules cohesive but not overly fragmented.
- Don't split code into separate files unless they evolve independently or have distinct lifecycles.

**Checklist for evaluating modularity:**
- Can the module be reused independently?
- Does it hide internal details behind a clean interface?
- Would adding more features bloat its logic?

Example:
✅ `email_service.py` sends emails
🚫 `email_sender.py`, `email_templates.py`, `email_config.py` if too small and tightly coupled
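
As a rough illustration of the ✅ case, the sketch below keeps a small config and template inside one hypothetical `email_service.py` and exposes a single public function. All names here (`_EmailConfig`, `_WELCOME_TEMPLATE`, `build_welcome_email`) are assumptions made for the example, not code from this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class _EmailConfig:
    # Internal detail: small and tightly coupled to sending, so it stays in this module.
    sender: str = "noreply@example.com"


# Internal detail: the template is only used by the function below.
_WELCOME_TEMPLATE = "Hello {name}, welcome aboard!"


def build_welcome_email(
    recipient: str, name: str, config: _EmailConfig = _EmailConfig()
) -> dict[str, str]:
    """Public interface: callers never touch the template or the config directly."""
    return {
        "from": config.sender,
        "to": recipient,
        "body": _WELCOME_TEMPLATE.format(name=name),
    }
```

If the templates or the configuration later grow and start changing independently, that would be the point to split them into their own modules.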
56 changes: 56 additions & 0 deletions .cursor/structured-import-order.mdc
@@ -0,0 +1,56 @@
---
description: Enforce structured and absolute import order for Python files
globs:
- "**/*.py"
alwaysApply: true
---

Group, order, and format Python imports following these conventions:

### ✅ Use **absolute imports only**:
Always write imports using the full path from the project root.
**Avoid** relative imports like `from .module import foo` or `from ..utils import bar`.

- ✅ `from my_project.utils import foo`
- ❌ `from .utils import foo`
- ❌ `from ..submodule import bar`

This ensures clarity, consistency across refactors, and better compatibility with tools like linters, IDEs, and packaging systems.
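
One way a tool could check this rule is sketched below, assuming only the standard library's `ast` module: relative imports parse as `ast.ImportFrom` nodes with `level > 0`. The helper name `find_relative_imports` is hypothetical and not part of this project.

```python
import ast


def find_relative_imports(source: str) -> list[int]:
    """Return the sorted line numbers of relative imports in Python source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # level counts the leading dots: 0 means an absolute import.
        if isinstance(node, ast.ImportFrom) and node.level > 0
    )
```

For example, passing a file's text to `find_relative_imports` returns the line numbers that would need rewriting as absolute imports.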

---

### 📚 Import grouping and ordering:

Organize imports into **three groups**, separated by a blank line:

1. **Standard Library Imports**
   e.g., `import os`, `from datetime import datetime`

2. **Third-Party Library Imports**
   e.g., `import numpy as np`, `from sqlalchemy import Column`

3. **Local Application/Library Imports**
   e.g., `from my_project.utils import helper_function`

---

### 🔤 Within each group:

- First: `import module` (in alphabetical order)
- Then: `from module import name` (in alphabetical order)

---

### ✅ Example:

```python
import os
import sys
from datetime import datetime

import numpy as np
import pandas as pd
from sqlalchemy import Column, Integer

import my_project
from my_project.utils import helper_function
```
28 changes: 28 additions & 0 deletions .cursor/type-hinting.mdc
@@ -0,0 +1,28 @@
---
description: Enforce strict type annotations using Python 3.12+ standards
globs:
- "**/*.py"
alwaysApply: true
---

- Always annotate:
  - Function parameters and return types, even for `None`, `bool`, `str`, etc.
  - Variable declarations (including attributes and loop vars) when static typing helps comprehension.
  - Lambda expressions when assigned or passed as arguments.
- **Avoid** `Any`, `Optional`, `Union`, `List`, `Dict`, etc. unless justified.
- **Prefer** Python 3.12+ standard generics:
  - Use `list[str]`, `dict[str, int]`, `tuple[str, int]`
  - Avoid `List`, `Dict`, `Tuple` from `typing`
- Use `|` (pipe) syntax for unions:
  - Use `str | None`
  - Avoid `Optional[str]`
- Use `Self` and `classmethod`/`staticmethod` hints as per [PEP 673](https://peps.python.org/pep-0673/)

**Examples:**

```python
def greet(name: str | None = None) -> str:
    return f"Hello, {name or 'World'}"


def parse(data: str) -> dict[str, int]:
    ...
```
5 changes: 5 additions & 0 deletions README.md
@@ -116,6 +116,11 @@ This project uses [structlog](https://www.structlog.org/en/stable/) for structur

Structured logs make it easier to parse, search, and analyze logs in production systems, especially when using centralized logging tools like Loki, ELK, or Datadog.

## IDE

This project includes [Cursor](https://cursor.com/) rules under `.cursor` that can assist when writing code. Using them is optional.


### Examples

1. Pass context information as keyword arguments