Manufacturing Engineer & Data Analyst with 17 years of experience, specializing in data analysis, open source contribution, and business automation.
| Japan Geohazard Monitor | Persian Gulf Ship Tracker |
|---|---|
| 31 geophysical data sources → ML earthquake prediction (AUC 0.754, CSEP Molchan 0.981) + real-time monitoring dashboard | AIS vessel tracking across the Persian Gulf & Gulf of Oman with land mask filtering |
Real-time API / WebSocket → SQLite → FastAPI + Leaflet.js (dark theme) — All projects
Japan Geohazard Monitor — Earthquake prediction research: 76 features from 25+ sources (USGS, Earthdata, INTERMAGNET, NMDB, NOAA, IOC), walk-forward HistGBT + stacking, BigQuery data platform (216K rows). Weekly automated CI pipeline on GitHub Actions.
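As a rough illustration of the walk-forward evaluation mentioned above, the sketch below builds expanding-window train/test splits in which every test block lies strictly after its training data. The function name `walk_forward_splits` and its parameters are hypothetical and are not taken from the project's code.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for an expanding-window split.

    Samples are assumed to be sorted by time, so indices double as time order.
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    start = initial_train
    while start + test_size <= n_samples:
        train_idx = list(range(0, start))                  # everything seen so far
        test_idx = list(range(start, start + test_size))   # the next unseen block
        yield train_idx, test_idx
        start += test_size                                 # grow the training window


# Hypothetical example: 10 time-ordered samples, 4 for the first fit, blocks of 2.
splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]} -> test {test_idx}")
```

Because the training window only ever grows forward in time, no future observation leaks into a model that is scored on an earlier block.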
| Project | Description | Demo |
|---|---|---|
| NPB Season Prediction | Bayesian ensemble (Marcel 35% + Stan/Ridge 40% + ML 25%) + Monte Carlo team simulation + 24 foreign player individual projections | Live |
| NPB 2021 Backtest | Could a Bayesian model have predicted Yakult & Orix going from last place to champions? Covers 25 foreign players with FanGraphs data | Analysis |
| MLB Win Probability Engine | 3-engine ensemble WP (Normal + Empirical + LightGBM) + Gemini AI commentary | Live |
| Baseball MLOps Pipeline | Statcast × GCP MLOps: 5-model ensemble + BQML, weekly auto-retrained on BigQuery + Cloud Run | Live |
| MLB Data Pipeline | Shared BigQuery data platform — FanGraphs + Savant + Statcast for all baseball analytics projects | — |
Prediction accuracy & details
NPB Season Prediction — 2026 Forecast
- Bayesian ensemble: Marcel 35% + Stan/Ridge skill correction 40% + ML (XGBoost/LightGBM) 25%
- CL: Hanshin 71.5W (26%) > Giants 71.1 > Chunichi 71.0 > DeNA 70.7 (4 teams within 0.8W) / PL: SoftBank 81.3W (48%) > Nippon-Ham 79.1 > Orix 77.5
- 24 foreign players individually projected via Stan v2 (MLB/KBO → NPB conversion)
- 10,000 Monte Carlo sims with park factors (Pythagorean k=1.83)
- 8-year backtest: Bayesian wOBA MAE .0498, ERA MAE 1.222 — 97% probability of beating Marcel
- Article (JP) / Article (EN)
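The weighting and simulation steps listed above can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: the weights and the Pythagorean exponent k=1.83 come from the list, while the function names, the sample inputs, the 143-game season length, and the treatment of each game as an independent coin flip are assumptions. Park factors are not modeled here.

```python
import random

# Component weights as stated in the list above.
WEIGHTS = {"marcel": 0.35, "stan": 0.40, "ml": 0.25}


def blend(projections: dict) -> float:
    """Weighted average of the three component projections."""
    return sum(WEIGHTS[name] * projections[name] for name in WEIGHTS)


def pythagorean_win_pct(runs_scored: float, runs_allowed: float, k: float = 1.83) -> float:
    """Pythagorean expectation: RS^k / (RS^k + RA^k)."""
    rs = runs_scored ** k
    ra = runs_allowed ** k
    return rs / (rs + ra)


def simulate_mean_wins(win_pct: float, games: int = 143,
                       n_sims: int = 10_000, seed: int = 0) -> float:
    """Average season wins over n_sims seasons of independent games (assumption)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sims):
        total += sum(1 for _ in range(games) if rng.random() < win_pct)
    return total / n_sims


# Hypothetical inputs, for illustration only.
blended_woba = blend({"marcel": 0.320, "stan": 0.330, "ml": 0.310})
win_pct = pythagorean_win_pct(runs_scored=560, runs_allowed=520)
print(round(blended_woba, 4), round(win_pct, 3))
```

A fixed seed keeps the simulation reproducible, which makes year-over-year backtests easier to compare.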
NPB Season Prediction — 2021 Backtest
- Tested whether the Bayesian model could have predicted Yakult & Orix going from last place to champions
- 25 foreign players (13 hitters, 12 pitchers) with full FanGraphs data (wOBA, K%, BB%, ERA, FIP, WHIP)
- Result: MAE 10.7 wins (vs 10.4 without foreign predictions) — foreign players had minimal impact
- Key finding: Bayesian regression over-corrects extreme values; model predicted mediocre players well (Cron .703 vs .701 actual) but failed on extremes (Gerber .862 vs .352)
- 2021 standings were driven by Japanese player breakouts/collapses, not foreign players
- Data: baseball-data.com + npb.jp + FanGraphs + Baseball Savant
MLB Win Probability Engine
- 3-engine ensemble: v1 (Markov + Normal + Optuna), v2 (Empirical WP table), LightGBM — inverse-Brier weighted + Isotonic calibration
- 367K+ play states (2015–2024) on BigQuery for training/validation, leave-one-year-out CV
- Gemini 2.5 Flash AI commentary with prompt versioning, quality evaluation (100pt), W&B tracking
- Live feed: MLB Stats API → real-time WP / LI / tactical recommendations (30s auto-refresh)
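Inverse-Brier weighting, as named in the first bullet above, gives each engine a weight proportional to the reciprocal of its Brier score, so better-calibrated engines count for more. The sketch below shows only that weighting step; the engine names, scores, and function names are hypothetical, and the isotonic calibration stage is omitted.

```python
from typing import Dict, Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted win probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def inverse_brier_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize 1/Brier so the weights sum to 1 (lower Brier -> larger weight)."""
    if any(s <= 0 for s in scores.values()):
        raise ValueError("Brier scores must be positive to invert")
    inverse = {name: 1.0 / s for name, s in scores.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_prob(probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the engines' win probabilities for one game state."""
    return sum(weights[name] * probs[name] for name in weights)


# Hypothetical validation scores for three engines.
weights = inverse_brier_weights({"v1": 0.20, "v2": 0.25, "lgbm": 0.18})
wp = ensemble_prob({"v1": 0.62, "v2": 0.58, "lgbm": 0.65}, weights)
print({name: round(w, 3) for name, w in weights.items()}, round(wp, 3))
```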
Baseball MLOps Pipeline
- Accuracy (2025 backtest): Batter wOBA MAE = .0287 (Marcel: .0326) / Pitcher xFIP MAE = 0.483 (Marcel: 0.558)
- 5-model ensemble: LightGBM + CatBoost + ElasticNet Bayes + Component (PECOTA) + BQML Boosted Tree
- GCP: BigQuery (13 raw tables + BQML models) + Cloud Run (FastAPI) + Grafana dashboard
- Pipeline: GitHub Actions cron (weekly) → pybaseball → 5-model retrain → W&B + BigQuery → Cloud Run → Streamlit
Baseball Skeleton Analysis — 3D skeleton visualization from Driveline OpenBiomechanics C3D data
Figures: Pitching Skeleton (3D C3D) and Hitting Skeleton (3D C3D).
Trunk rotation range vs. pitch speed: r = 0.425 (the strongest correlation found). Contributed bug fix PR #384 to ezc3d. Article (JP) / Article (EN)
6 analyses covering Japanese MLB pitchers and Ohtani batting data.
All analyses (6)
| Analysis | Key Finding | Article |
|---|---|---|
| Kikuchi Slider Revolution (2019-2025) | SL 17%→37% after Astros trade | Zenn / DEV.to / Kaggle |
| Senga Ghost Fork (2023-2025) | FO whiff rate 58%→39%, decline pre-injury | Zenn / DEV.to / Kaggle |
| Imanaga 2nd Year (2024-2025) | 3-pitch concentration (97%), 1st TTO xwOBA .505 | Zenn / DEV.to / Kaggle |
| Darvish Evolution (2021-2025) | SL/ST halved, CU became putaway pitch | Zenn / DEV.to / Kaggle |
| Ohtani Spray Chart | spraychart() one-liner vs matplotlib manual | Zenn |
| Ohtani Heatmap | Stadium drawing + hit density heatmap | Zenn |
55 PRs (28 merged) across 22 repositories. See [oss-contributions](https://github.com/yasumorishima/oss-contributions) for full details.
PR highlights
| Repository | PR | Description |
|---|---|---|
| dfinity/icp-js-core | #1270 | Improve Candid decode error messages |
| dfinity/icp-js-core | #1277 | Deduplicate parallel fetchSubnetKeys |
| dfinity/pic-js | #235 | Add fetchCanisterLogs() method |
| line/line-bot-mcp-server | #369 | Add get_follower_ids tool |
| pyomeca/ezc3d | #384 | Fix __eq__ early return bug |
| optuna/optuna | — | Hyperparameter optimization framework |
| pandas-dev/pandas | — | Data analysis library |
| jldbc/pybaseball | #498-504 | Bug fixes & documentation |
team-mirai, Civic Tech OSS: 21 PRs (11 Merged / 2 Open / 8 Closed)
Contributing to open-source civic tech projects that promote political transparency and citizen participation in Japan.
| # | Repository | PR | Status | Description |
|---|---|---|---|---|
| 21 | marumie | #1141 | Open | Display total amount when category filter is applied |
| 20 | action-board | #1969 | Merged | Add 48 unit tests for pure functions |
| 19 | action-board | #1918 | Merged | Disable Supabase Image Transformation |
| 18 | action-board | #1914 | Merged | Block shape deletion with XP |
| 17 | post-checker | #34 | Open | Fix timezone-dependent date parsing |
| 16 | action-board | #1906 | Merged | Refactor achieveMissionAction |
| 15 | action-board | #1869 | Merged | Supabase RPC function tests for develop |
| 14 | action-board | #1868 | Merged | Posting count display: times to sheets |
| 13 | action-board | #1867 | Merged | Error toast for poster mission failure |
| 12 | action-board | #1859 | Merged | Supabase RPC function tests |
| 11 | fact-checker | #88 | Closed | Slack same-thread reply |
| 10 | fact-checker | #87 | Closed | Deduplicate tweets using start_time filter |
| 9 | fact-checker | #86 | Closed | Unit tests for Note markdown utilities |
| 8 | action-board | #1856 | Merged | Update video mission description |
| 7 | action-board | #1855 | Closed | Street speech map link |
| 6 | fact-checker | #85 | Closed | Slack button env-based branching |
| 5 | action-board | #1849 | Merged | Breadcrumb navigation |
| 4 | action-board | #1845 | Merged | Fix prefecture cache invalidation |
| 3 | fact-checker | #84 | Closed | Disable Twitter posting in staging |
| 2 | fact-checker | #83 | Closed | Client-side engagement filtering |
| 1 | fact-checker | #69 | Done | X API investigation |
Tech Stack: Next.js, TypeScript, Supabase, shadcn/ui, Biome, Bun, Vitest
Notebooks Expert | 🥉 14 Bronze Notebook Medals
Active: S6E3 Churn (LB 0.914) / Deep Past (Akkadian→English) / RNA 3D Folding 2
Bronze Medal Notebooks (14)
| Notebook | Topic |
|---|---|
| savant-extras Defense & Pitching Quality | Defense metrics & pitching quality analysis (savant-extras) |
| MLB Statcast Spray Charts for WBC 2026 Players | WBC 2026 player spray charts + pitch zone charts (baseball-field-viz) |
| March Machine Learning Mania 2026 Baseline | NCAA basketball tournament prediction (LightGBM + Logistic Regression) |
| CAFA 6 Baseline with Regularization | Protein function prediction (PyTorch MLP) |
| Bat Tracking: Japanese MLB Batters (2024-2025) | MLB bat speed & swing metrics analysis |
| Senga Ghost Fork Analysis | MLB Statcast pitching analysis |
| Kikuchi Slider Revolution | MLB Statcast pitching analysis |
| NFL Geometric Rules Baseline | Physics-based rules, No ML, RMSE 2.921 |
| PhysioNet ECG Baseline | ECG submission format guide |
| Diabetes EDA & Baseline | LightGBM 5-fold CV, AUC 0.727 |
| Diabetes Rank-Based Ensemble | Rank averaging for AUC optimization |
| Deep Past Cloud Workflow + TF-IDF Baseline | Akkadian→English TF-IDF baseline |
| Titanic Japanese Optuna Test | Titanic survival prediction with Optuna tuning |
| Matplotlib & Seaborn 日本語化テンプレート (Japanese Localization Template) | Template that fixes garbled Japanese font rendering in the Kaggle environment |
| Dataset | Description |
|---|---|
| 🥈 MLB Bat Tracking Leaderboard (2024-2025) | 452 batters, 19 swing metrics |
| 🥈 WBC 2026 Scouting | 306 players, 20 countries |
Other datasets (4)
| Dataset | Description |
|---|---|
| Baseball Savant Leaderboards (2024-2025) | 15 leaderboards, 2 seasons combined |
| Japanese MLB Players Statcast (2015-2025) | 34 Japanese MLB players, 174k pitches+hits |
| MLB Pitcher Arsenal Evolution (2020-2025) | 4,253 pitcher-seasons, 111 metrics |
| MLB Statcast + Bat Tracking (2024-2025) | Combined Statcast + bat tracking data |
DrivenData Competitions — Automated pipeline: GitHub Actions + GPU training + GPU→CPU fallback. Currently competing in On Top of Pasketti (Children's ASR, $120K prize, Wav2Vec2 CTC).
| App | Description | Link |
|---|---|---|
| MLB Bat Tracking Dashboard | Leaderboard, Player Comparison, Team Lineup Builder. Powered by savant-extras | Live |
| WBC 2026 Scouting Dashboard | 30 Statcast apps across 19 countries. Zone heatmaps, spray charts, pitch movement | Live |
| Daily Diary | Flutter mobile app, 5 languages, offline-first, AdMob | Google Play |
WBC 2026 Scouting Dashboard details (30 apps)
WBC 2026 Scouting Dashboard — Statcast-based scouting dashboards for all WBC 2026 teams, deployed on Streamlit Community Cloud.
- 30 apps across 19 countries — batters (17 countries) + pitchers (13 countries)
- Features: Zone heatmaps, spray charts, pitch movement, count-by-count performance, LHP/RHP splits
- Data: Baseball Savant Statcast via pybaseball, auto-fetched by GitHub Actions
| Example | Link |
|---|---|
| USA Batters | wbc-usa-batters.streamlit.app |
| Japan Pitchers | wbc-japan-pitchers.streamlit.app |
| All 30 apps | GitHub README |
6 packages
| Package | Description |
|---|---|
| savant-extras | 17 Baseball Savant leaderboards + date range support. Complements pybaseball |
| baseball-field-viz | Statcast coordinate transform + field drawing + spray charts + pitch zone charts |
| kaggle-notebook-deploy | Deploy Kaggle Notebooks via git push + GitHub Actions |
| kaggle-wandb-sync | Sync W&B offline runs from Kaggle to W&B cloud |
| signate-deploy | SIGNATE competition workflow via GitHub Actions |
| signate-wandb-sync | Record SIGNATE scores to W&B runs |
| Project | Description |
|---|---|
| ICP Learning Project | Persistent counter dApp on Internet Computer (Motoko, dfx CLI) |
| OpenClaw Twitter Bot | Raspberry Pi 5 + OpenClaw + Gemini API auto-tweet bot — Article (JP) |
Past Projects
| Project | Description |
|---|---|
| GAS Calendar Tool | Batch calendar event registration with senior-friendly mobile UI |
| Dune Analytics | On-chain data analysis — JPYC Stablecoin Dashboard |
| Archived Projects | Selenium automation, business workflow tools, etc. |
| Category | Technologies |
|---|---|
| Data Analysis & ML | Python, pandas, scikit-learn, LightGBM, XGBoost, CatBoost, PyTorch, matplotlib, seaborn, DuckDB, W&B |
| Data Platform | Google BigQuery (8 datasets, 96+ tables — baseball, geohazard, ship tracking), BigQuery ML, Cloud Run, Grafana |
| Data Sources | Baseball Savant (Statcast), pybaseball, USGS, NASA Earthdata, AIS |
| Web & Dashboards | Streamlit, Next.js, TypeScript, Supabase, shadcn/ui |
| Mobile App | Flutter, Dart, Hive, Google AdMob |
| Automation & DevOps | GitHub Actions, Google Apps Script, VBA, Power Query |
| Tools | Claude Code, Kaggle, Google Colab, Excel, Looker Studio |
| Manufacturing | Statistical Quality Control, Process Engineering |
- 2024 - Present: Quality Management @ Marubun Corporation (丸文株式会社)
- 2020 - 2024: Technical Dept. @ Metaco Corporation (株式会社メタコ)
- 2008 - 2020: Process Engineering in Semiconductor Manufacturing
Stencil mask and manufacturing method thereof (ステンシルマスク及びその製造方法)
- Patent No: 6307851 (特許第6307851号)
- Role: Inventor
- Assignee: Toppan Printing Co., Ltd. (凸版印刷株式会社)
- Link: Google Patents (JP6307851B2)
- Blog: DEV.to (EN) / Zenn (JP) / Quarto Blog (EN)
- Kaggle: https://www.kaggle.com/yasunorim
- Wantedly: https://www.wantedly.com/id/yasunori_morishima_b
- LinkedIn: https://www.linkedin.com/in/morishima-yasunori-b70229241


