Junior Data Engineer building end-to-end data ecosystems: sourcing, cleansing, modeling, and delivering insights that accelerate experimentation and product decisions. I combine software engineering, statistics, and ML/AI to turn datasets into production-ready products, including feature stores, experimentation platforms, forecasting services, and data applications.
Currently based in Sugar Land, TX. Open to internships in Data Science and ML/AI.
I'm working on automating ingestion pipelines and evaluation dashboards for ML/AI teams. I'm looking to collaborate on data science and machine learning projects focused on production systems and scalable solutions. I'm learning advanced deep learning architectures, distributed data processing, and MLOps practices.
Ask me about data engineering, machine learning model deployment, feature engineering, and building data pipelines.
Languages: Python, R, SQL, Java, JavaScript, TypeScript, C/C++, HTML/CSS
AI/ML: scikit-learn, XGBoost, CatBoost, LightGBM, TensorFlow, PyTorch, Keras, Transformers
Data/Viz: pandas, NumPy, SciPy, Dask, GeoPandas, Statsmodels, Matplotlib, Seaborn, Plotly, Tableau
Cloud & Tools: AWS (S3, Athena, QuickSight), SQLAlchemy, FastAPI, Streamlit, BeautifulSoup, Selenium
- Stockly - Production-quality stock market prediction and backtesting system with SQLite-based data storage, comprehensive feature engineering, and multiple ML models (LSTM/GRU, Logistic Regression, Random Forest)
- PL Predictor - English Premier League match outcome predictor using XGBoost, with web scraping for real-time data updates
- Localytics - Market segmentation and geospatial analytics project combining demographic and behavioral data analysis with clustering algorithms and Tableau dashboards
- clinix.ai - Medical triage and symptom-to-risk assessment system using LLM semantic interpretation and classical ML models, with a FastAPI backend and a Streamlit dashboard