I am an aspiring data scientist from Berlin and Gräfenhainichen.
Check out some of my projects.
Your grocery spending + LLM magic
How do you calculate how much you've spent on cheese if you only have your receipts? You'd have to know all the different product names and abbreviations grocery stores use. Large Language Models can help you with that.
- OCR receipts with Google Cloud Vision API
- Custom algorithm that extracts items and prices and flattens lines from crooked scans
- Prompt engineering for data augmentation and for categorizing abbreviated products, Mistral API
- Semantic search with text embeddings of prompts and categorized products, Mistral API, PostgreSQL vector database
- Dashboard with graphs and tables showing which product categories are the most expensive, Streamlit
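
To give a feel for the line-flattening step, here is a minimal sketch of one way to group OCR words into receipt lines when the scan is tilted. It is not the code from the project: the `Word` fields, the `group_into_lines` function, and the `slope` and `tolerance` parameters are all hypothetical, and the tilt is assumed to be known already rather than estimated from the scan.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Word:
    """One OCR word (hypothetical structure, derived from its bounding box)."""
    text: str
    x: float       # horizontal centre of the bounding box
    y: float       # vertical centre of the bounding box
    height: float  # height of the bounding box


def group_into_lines(words, slope=0.0, tolerance=0.5):
    """Group OCR words into text lines, compensating for a tilted scan.

    slope is the assumed tilt (change in y per unit of x); tolerance is the
    largest allowed vertical gap inside a line, as a fraction of the median
    word height. Both are illustrative parameters.
    """
    if not words:
        return []

    # Remove the tilt so that words on the same printed line share a y value.
    corrected = [(w.y - slope * w.x, w) for w in words]
    corrected.sort(key=lambda pair: pair[0])
    max_gap = tolerance * median(w.height for w in words)

    # Walk top to bottom and start a new line whenever the gap to the
    # previous word is larger than max_gap.
    lines = [[corrected[0]]]
    for y, word in corrected[1:]:
        previous_y = lines[-1][-1][0]
        if y - previous_y <= max_gap:
            lines[-1].append((y, word))
        else:
            lines.append([(y, word)])

    # Read every line from left to right.
    return [
        " ".join(w.text for _, w in sorted(line, key=lambda pair: pair[1].x))
        for line in lines
    ]
```

With a tilt of 0.1, a price on the right edge of the receipt sits about 20 pixels lower than its product name on the left. After the correction both land on the same line, so `GOUDA` and `2.49` end up together instead of the price being attached to the next product.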
Check out the README of RECEIPT CONTEXTUALIZER for videos!
Recommend houses for sale to a client based on their portfolio.
- Data cleaning
- Importing additional data from the US Census website
- Descriptive statistics
- Translating business needs into statistical measures
- Data visualization
- Stakeholder communications
Check out the README for some data visualizations, the EDA notebook, and the presentation.