diff --git a/FORK_README.md b/FORK_README.md new file mode 100644 index 0000000..bce019b --- /dev/null +++ b/FORK_README.md @@ -0,0 +1,172 @@ +# xword-dl fork with Seattle Times Midi Support + +This is a fork of [thisisparker/xword-dl](https://github.com/thisisparker/xword-dl) with added support for Seattle Times Midi crossword puzzles. + +## What's Added + +### Seattle Times Midi Downloader + +Command: `stm` + +Downloads Seattle Times Midi crossword puzzles - smaller crosswords (9×9 to 11×11) with 30-44 clues, perfect for mobile devices. + +**Features:** +- Latest puzzle: `xword-dl stm --latest` +- Date-based: `xword-dl stm --date "May 1, 2026"` +- Grid sizes: 9×9, 10×10, 11×11 +- Significantly fewer clues than standard 15×15 puzzles (30-44 vs 70-80) + +**Implementation:** +- File: `src/xword_dl/downloader/seattletimesdownloader.py` +- Base class: `AmuseLabsDownloader` (same as LA Times, Newsday) +- Platform: AmuseLabs PuzzleMe +- Puzzle set: `seattletimes-crossword-midi` + +### Branch Information + +- **Branch:** `feature/seattle-times-midi` +- **Status:** Working and tested +- **Integration:** Used by [slmingol/crossword-catastrophe](https://github.com/slmingol/crossword-catastrophe) + +## Installation + +### From this fork + +```bash +pip install git+https://github.com/slmingol/xword-dl.git@feature/seattle-times-midi +``` + +### Local development + +```bash +git clone https://github.com/slmingol/xword-dl.git +cd xword-dl +git checkout feature/seattle-times-midi +python3 -m venv .venv +source .venv/bin/activate +pip install -e . +``` + +## Usage Examples + +```bash +# Download latest Seattle Times Midi puzzle +xword-dl stm --latest + +# Download puzzle from specific date +xword-dl stm --date "yesterday" +xword-dl stm --date "May 1, 2026" + +# Custom output filename +xword-dl stm --latest -o ~/puzzles/seattle-midi-today.puz +``` + +## Testing + +```bash +# Test latest download +xword-dl stm --latest -o /tmp/test.puz + +# Verify puzzle +python3 -c " +import puz +p = puz.read('/tmp/test.puz') +print(f'Title: {p.title}') +print(f'Author: {p.author}') +print(f'Size: {p.width}×{p.height}') +print(f'Clues: {len(p.clues)}') +" +``` + +**Expected output:** +``` +Title: [Puzzle Title] +Author: Phil Fraas +Size: 9×9 (or 10×10, 11×11) +Clues: 30-44 +``` + +## Technical Details + +### API Endpoints + +- **Picker:** `https://seattletimes.amuselabs.com/puzzleme/date-picker?set=seattletimes-crossword-midi` +- **Crossword:** `https://seattletimes.amuselabs.com/puzzleme/crossword?id={puzzle_id}&set=seattletimes-crossword-midi` + +### Puzzle ID Format + +- Sequential IDs: `midi-crossword-111`, `midi-crossword-110`, etc. +- Not date-based (requires lookup via picker API) + +### Date Lookup + +Since puzzles use sequential IDs rather than date-based IDs, the downloader: +1. Fetches the picker page +2. Parses puzzle metadata JSON +3. Matches requested date to publication timestamp +4. Extracts puzzle ID +5. Downloads puzzle + +### Archive Depth + +Limited to ~14 days of recent puzzles based on observed data. + +## Integration with crossword-catastrophe + +This fork is used by the [crossword-catastrophe](https://github.com/slmingol/crossword-catastrophe) project scraper: + +```dockerfile +# packages/scraper/Dockerfile +RUN pip install --no-cache-dir git+https://github.com/slmingol/xword-dl.git@feature/seattle-times-midi +``` + +The scraper automatically downloads puzzles from multiple sources including Seattle Times Midi. + +## Contributing Back to Upstream + +This feature can be contributed back to the main xword-dl project: + +1. Ensure tests pass +2. Create PR: https://github.com/thisisparker/xword-dl/compare/main...slmingol:feature/seattle-times-midi +3. Include documentation and test results + +### Why This Might Be Accepted Upstream + +- Uses existing `AmuseLabsDownloader` infrastructure +- Follows established patterns (similar to LA Times, Newsday) +- Minimal code addition (~70 lines) +- Provides value: smaller puzzles for mobile users +- Fully tested and working + +## Changes from Upstream + +Only additions, no modifications to existing code: + +``` +new file: src/xword_dl/downloader/seattletimesdownloader.py +``` + +The downloader is automatically discovered via `get_plugins()` in `downloader/__init__.py`. + +## Maintenance + +To update this fork with upstream changes: + +```bash +cd xword-dl +git remote add upstream https://github.com/thisisparker/xword-dl.git +git fetch upstream +git checkout feature/seattle-times-midi +git rebase upstream/main +git push slmingol feature/seattle-times-midi --force-with-lease +``` + +## License + +Same as upstream: MIT License + +## Credits + +- Original xword-dl: [Parker Higgins](https://github.com/thisisparker) +- Seattle Times Midi support: Added for [crossword-catastrophe](https://github.com/slmingol/crossword-catastrophe) project +- Inspiration: LA Times and Newsday downloaders diff --git a/SEATTLE_TIMES_README.md b/SEATTLE_TIMES_README.md new file mode 100644 index 0000000..3238356 --- /dev/null +++ b/SEATTLE_TIMES_README.md @@ -0,0 +1,84 @@ +# Seattle Times Midi Crossword Support + +This branch adds support for downloading Seattle Times Midi crossword puzzles. + +## Status: URL Discovery Needed + +The implementation is complete structurally, but the actual API endpoint URLs need to be verified via browser inspection. + +## How to Complete the Implementation + +### Step 1: Discover the API URLs + +1. Open https://www.seattletimes.com/games-crossword-midi/ in Chrome/Firefox +2. Open Developer Tools (F12) +3. Go to the **Network** tab +4. Clear the network log +5. Reload the page and let the puzzle load +6. Look for requests to `amuselabs.com` domains +7. Find requests containing: + - `date-picker` - this is the picker URL + - `crossword` with query params - this is the puzzle URL +8. Note the exact URL patterns + +### Step 2: Update the Code + +Edit `src/xword_dl/downloader/seattletimesdownloader.py`: + +```python +# Update these URLs with the discovered patterns: +self.picker_url = "https://DISCOVERED_URL/date-picker?set=seattletimes-crossword-midi" +self.url_from_id = "https://DISCOVERED_URL/crossword?id={puzzle_id}&set=seattletimes-crossword-midi" + +# Update puzzle ID format based on what you see in the network traffic: +self.id = f"DISCOVERED_FORMAT-{url_formatted_date}" +``` + +### Step 3: Test + +```bash +# Install from this branch +uv tool install --force git+https://github.com/YOUR_USERNAME/xword-dl.git@feature/seattle-times-midi + +# Test latest puzzle +xword-dl stm --latest + +# Test specific date +xword-dl stm --date 5/1/26 +``` + +## Technical Details + +### Infrastructure +- Platform: AmuseLabs PuzzleMe (same as LA Times, Newsday) +- Puzzle Set: `seattletimes-crossword-midi` +- Base Domain: `seattletimes.amuselabs.com` (confirmed from page source) + +### Implementation +- Extends: `AmuseLabsDownloader` base class +- Command: `stm` +- Pattern: Similar to `LATimesDownloader` and `NewsdayDownloader` + +### Why URLs Are Unknown + +The Seattle Times website blocks direct API access (returns 403 Forbidden or 404 Not Found for automated requests). The puzzle loads via JavaScript in the browser, which makes the actual API calls. We need to observe those calls in a real browser to get the correct URL patterns. + +Common patterns attempted (all returned 404): +- `https://seattletimes.amuselabs.com/st/date-picker` +- `https://cdn3.amuselabs.com/st/date-picker` +- `https://seattletimes.amuselabs.com/puzzles/date-picker` + +## Once Working + +This will enable downloading: +- Seattle Times Midi crosswords (12x12 or 13x13 grids) +- Smaller puzzles suitable for mobile devices +- Historical puzzle archive +- Daily automated scraping + +## Contributing Back to Upstream + +Once the URLs are discovered and tested: +1. Commit the working changes +2. Add tests (if upstream requires them) +3. Submit PR to https://github.com/thisisparker/xword-dl diff --git a/scripts/push-to-github.sh b/scripts/push-to-github.sh new file mode 100755 index 0000000..d220302 --- /dev/null +++ b/scripts/push-to-github.sh @@ -0,0 +1,72 @@ +#!/bin/bash +# Quick setup script to push xword-dl fork to GitHub + +set -e + +echo "===============================================================" +echo "Seattle Times Midi - xword-dl Fork Setup" +echo "===============================================================" +echo "" +echo "This will push your xword-dl fork to GitHub: slmingol/xword-dl" +echo "" + +# Go to xword-dl directory +cd ~/dev/projects/xword-dl + +echo "Current branch:" +git branch | grep '*' +echo "" + +# Check if we're on the right branch +if ! git branch | grep -q '* feature/seattle-times-midi'; then + echo "ERROR: Not on feature/seattle-times-midi branch" + echo "Run: git checkout feature/seattle-times-midi" + exit 1 +fi + +echo "Step 1: Check if GitHub fork exists" +echo "-----------------------------------" +echo "Testing connection to github.com/slmingol/xword-dl..." +echo "" + +if git ls-remote slmingol &>/dev/null; then + echo "✓ Fork exists at github.com/slmingol/xword-dl" +else + echo "✗ Fork doesn't exist yet" + echo "" + echo "Please create the fork first:" + echo "1. Go to: https://github.com/thisisparker/xword-dl" + echo "2. Click 'Fork' button (top right)" + echo "3. Create fork under 'slmingol' account" + echo "" + read -p "Press ENTER once you've created the fork..." + echo "" +fi + +echo "Step 2: Push feature branch" +echo "---------------------------" +git push slmingol feature/seattle-times-midi + +echo "" +echo "===============================================================" +echo "SUCCESS!" +echo "===============================================================" +echo "" +echo "Your fork is now available at:" +echo " https://github.com/slmingol/xword-dl/tree/feature/seattle-times-midi" +echo "" +echo "Next steps:" +echo "" +echo "1. Build scraper with your fork:" +echo " cd ~/dev/projects/crossword-catastrophe" +echo " docker-compose build scraper" +echo "" +echo "2. The Dockerfile installs from your GitHub fork:" +echo " git+https://github.com/slmingol/xword-dl.git@feature/seattle-times-midi" +echo "" +echo "3. Deploy:" +echo " docker-compose up -d" +echo "" +echo "4. [Optional] Create PR to upstream xword-dl:" +echo " https://github.com/thisisparker/xword-dl/compare/main...slmingol:xword-dl:feature/seattle-times-midi" +echo "" diff --git a/src/xword_dl/downloader/seattletimesdownloader.py b/src/xword_dl/downloader/seattletimesdownloader.py new file mode 100644 index 0000000..36d7ae7 --- /dev/null +++ b/src/xword_dl/downloader/seattletimesdownloader.py @@ -0,0 +1,145 @@ +import datetime +import json + +from .amuselabsdownloader import AmuseLabsDownloader +from ..util import XWordDLException + + +class SeattleTimesMidiDownloader(AmuseLabsDownloader): + command = "stm" + outlet = "Seattle Times Midi" + outlet_prefix = "Seattle Times Midi" + + def __init__(self, **kwargs): + super().__init__(**kwargs) + + # Verified URLs from successful API testing + self.picker_url = "https://seattletimes.amuselabs.com/puzzleme/date-picker?set=seattletimes-crossword-midi" + self.url_from_id = ( + "https://seattletimes.amuselabs.com/puzzleme/crossword?id={puzzle_id}&set=seattletimes-crossword-midi" + ) + + def guess_date_from_puzzle_title(self, title): + # Seattle Times Midi puzzles have descriptive titles, not dates + # Date is stored separately in publication metadata + pass + + def find_by_date(self, dt): + """ + Seattle Times Midi puzzles use sequential IDs (midi-crossword-111, etc.) + rather than date-based IDs. We fetch the picker page first to check recent + puzzles in streakInfo, then fall back to ID enumeration for older puzzles. + """ + self.date = dt + + # Fetch the picker page to get the puzzle list + res = self.session.get(self.picker_url) + + # The picker page contains a JSON blob with puzzle metadata + # Extract it from the