Commit eee3ba0

add scraper
1 parent 35b2f18 commit eee3ba0

4 files changed, +625 -0 lines changed
scraper/README.md

+18
@@ -0,0 +1,18 @@
# scraper

A simple tool that scrapes all the data on [dd.meteo.gc.ca](https://dd.meteo.gc.ca).

## How to use

### Requirements

- **Node** for scraping
- **Redis** for queuing and distributing work across multiple workers
- **CouchDB** for indexing all the entries

### Usage

To start scraping `dd.meteo.gc.ca` from its root (`/`), push an entry onto the Redis queue:

`redis-cli -n 2 rpush url-0 /`

Then start the scraper, passing the CouchDB address in `COUCHDB_URL`:

`COUCHDB_URL=http://username:password@localhost:5984 node scraper.js`
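The two commands above could also be driven from a small Node script. The following is only a sketch under stated assumptions: the helper names (`couchdbUrl`, `seedArgs`, `launch`) are hypothetical and not part of this commit, and it assumes `redis-cli` and `scraper.js` are reachable from the current directory. Building the URL with Node's `URL` class has the side benefit of percent-encoding credentials that contain reserved characters such as `@` or `:`.

```javascript
'use strict';
const { execFileSync, spawn } = require('child_process');

// Hypothetical helper: build the value for COUCHDB_URL.
// The URL setters percent-encode reserved characters in the credentials.
function couchdbUrl(username, password, origin = 'http://localhost:5984') {
  const url = new URL(origin);
  url.username = username;
  url.password = password;
  return url.href;
}

// Hypothetical helper: arguments for the redis-cli call from the README
// (database 2, list "url-0", starting path "/").
function seedArgs(path = '/') {
  return ['-n', '2', 'rpush', 'url-0', path];
}

// Hypothetical launcher: seed the queue, then start scraper.js with
// COUCHDB_URL set in its environment. Returns the child process.
function launch(username, password, path = '/') {
  execFileSync('redis-cli', seedArgs(path), { stdio: 'inherit' });
  return spawn('node', ['scraper.js'], {
    env: { ...process.env, COUCHDB_URL: couchdbUrl(username, password) },
    stdio: 'inherit',
  });
}
```

Calling `launch('username', 'password')` would then be roughly equivalent to running the two commands by hand. Nothing runs at load time, so the helpers can be imported without touching Redis.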