Merge pull request #8 from JackTheMico/release
Release to main branch
JackTheMico authored Sep 1, 2022
2 parents 78967f9 + c4087e8 commit c8066e6
Showing 13 changed files with 2,107 additions and 304 deletions.
82 changes: 70 additions & 12 deletions README.md
@@ -7,27 +7,43 @@ A [Ruia](https://github.com/howie6879/ruia) plugin that uses [peewee-async](http

## Installation

Install with [pip](https://pip.pypa.io/en/stable/), [pipenv](https://pipenv.pypa.io/en/latest/), or [poetry](https://python-poetry.org/).

```shell
pip install ruia-peewee-async[aiomysql]
pipenv install ruia-peewee-async[aiomysql]
poetry add ruia-peewee-async[aiomysql]

# or

pip install ruia-peewee-async[aiopg]
pipenv install ruia-peewee-async[aiopg]
poetry add ruia-peewee-async[aiopg]

# or

pip install ruia-peewee-async[all]
pipenv install ruia-peewee-async[all]
poetry add ruia-peewee-async[all]
```
`ruia-peewee-async[all]` installs both aiomysql and aiopg.
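
To confirm which optional driver actually ended up in the environment after one of the commands above, a minimal check might look like this. It is only a sketch and not part of ruia-peewee-async: the helper name `installed_drivers` is hypothetical, and it relies solely on the standard-library `importlib.util.find_spec`.

```python
from importlib.util import find_spec


def installed_drivers():
    """Report which optional async drivers are importable (hypothetical helper)."""
    # find_spec returns None when a top-level module cannot be located.
    return {name: find_spec(name) is not None for name in ("aiomysql", "aiopg")}


if __name__ == "__main__":
    for name, present in installed_drivers().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

If both entries report `missing`, none of the extras (`aiomysql`, `aiopg`, `all`) was installed.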

## Usage


A complete example is shown below.
```python
# -*- coding: utf-8 -*-
from peewee import CharField
from ruia import AttrField, Item, Response, Spider, TextField
from ruia import AttrField, Item, Response, TextField

from ruia_peewee_async import (
    RuiaPeeweeInsert,
    RuiaPeeweeUpdate,
    Spider,
    TargetDB,
    after_start,
)

from ruia_peewee_async import (RuiaPeeweeInsert, RuiaPeeweeUpdate, TargetDB,
after_start)

class DoubanItem(Item):
    target_item = TextField(css_select="tr.item")
@@ -37,6 +53,7 @@ class DoubanItem(Item):
    async def clean_title(self, value):
        return value.strip()


class DoubanSpider(Spider):
    start_urls = ["https://movie.douban.com/chart"]
    # aiohttp_kwargs = {"proxy": "http://127.0.0.1:7890"}
@@ -47,6 +64,7 @@ class DoubanSpider(Spider):
# yield RuiaPeeweeInsert(item.results, database=TargetDB.POSTGRES) # save to Postgresql
# yield RuiaPeeweeInsert(item.results, database=TargetDB.BOTH) # save to both MySQL and Postgresql


class DoubanUpdateSpider(Spider):
    start_urls = ["https://movie.douban.com/chart"]

@@ -63,10 +81,12 @@ class DoubanUpdateSpider(Spider):

# Args for RuiaPeeweeUpdate
# data: A dict that's going to be updated in the database.
# query: A peewee query or a dict to search for the target data in database.
# query: A peewee query or a dict used to look up the target record in the database.
# database: The target database type.
# create_when_not_exists: If True, will create a record when data not exists. Default is True.
# only: A list or tuple of fields that should be updated.
# create_when_not_exists: Default is True. If True, a record is created when the query finds no match.
# not_update_when_exists: Default is True. If True and the record exists, the record is left unchanged.
# only: A list or tuple naming the only fields that should be updated.


mysql = {
    "host": "127.0.0.1",
@@ -100,18 +120,56 @@ if __name__ == "__main__":
    # DoubanUpdateSpider.start(after_start=after_start(mysql=mysql))
```
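
The way the `create_when_not_exists`, `not_update_when_exists`, and `only` flags described in the comments above interact can be illustrated with a small in-memory sketch. This is one reading of those comments, not the plugin's actual code: `apply_update`, the `store` dict, and the returned status strings are all hypothetical, and a plain dictionary key stands in for the `query` argument.

```python
def apply_update(store, key, data, create_when_not_exists=True,
                 not_update_when_exists=True, only=None):
    """Illustrate the flag semantics on a dict of records (assumed behavior)."""
    record = store.get(key)
    if record is None:
        # No match: create the record only when the flag allows it.
        if create_when_not_exists:
            store[key] = dict(data)
            return "created"
        return "skipped"
    if not_update_when_exists:
        # Match found, but existing records are to be left alone.
        return "skipped"
    # Restrict the update to the fields named in `only`, if given.
    fields = only if only else list(data)
    for field in fields:
        if field in data:
            record[field] = data[field]
    return "updated"


if __name__ == "__main__":
    store = {}
    print(apply_update(store, "a", {"title": "x", "score": 1}))  # created
    print(apply_update(store, "a", {"title": "y"}))              # skipped
    print(apply_update(store, "a", {"title": "y", "score": 2},
                       not_update_when_exists=False, only=["title"]))  # updated
    print(store)  # {'a': {'title': 'y', 'score': 1}}
```

Under this reading, the defaults never overwrite an existing record, so an actual update requires passing `not_update_when_exists=False`.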

The `create_model` helper creates a Peewee model from the database configuration.
```python
from ruia_peewee_async import create_model

model = create_model(mysql=mysql) # or postgres=postgres or both
# create the table at the same time
model = create_model(postgres=postgres, create_table=True)
rows = model.select().count()
print(rows)
```

The `Spider` class from `ruia_peewee_async` also exposes the following database-related attributes.
```python
from peewee import Model
from typing import Dict, Union
from peewee_async import Manager, MySQLDatabase, PostgresqlDatabase
from ruia import Spider as RuiaSpider

class Spider(RuiaSpider):
    mysql_model: Union[Model, Dict]  # Becomes a Model instance once the spider has started.
    mysql_manager: Manager
    postgres_model: Union[Model, Dict]  # Same as above.
    postgres_manager: Manager
    mysql_db: MySQLDatabase
    postgres_db: PostgresqlDatabase
```
For more information, check out [peewee's documentation](http://docs.peewee-orm.com/en/latest/) and [peewee-async's documentation](https://peewee-async.readthedocs.io/en/latest/).

## Development
Use `pyenv` to install the Python version you need.
For example:
```shell
pyenv install 3.7.15
pyenv install 3.7.9
```
Then go to the root of the project and run:
```shell
poetry install
poetry install && poetry install -E aiomysql -E aiopg
```
to install all dependencies.

Use `poetry shell` to enter the virtual environment, or open your favorite editor and select the virtual environment to start coding.

Use `pytest` to run the unit tests in the `tests` folder.
- Use `poetry shell` to enter the virtual environment,
  or open your favorite editor and select the virtual environment to start coding.
- Use `pytest` to run the unit tests in the `tests` folder.
- Use `pytest --cov .` to run all tests and print a coverage report in the terminal.

## Thanks
- [ruia](https://github.com/howie6879/ruia)
- [peewee](https://github.com/coleifer/peewee)
- [peewee-async](https://github.com/05bit/peewee-async)
- [aiomysql](https://github.com/aio-libs/aiomysql)
- [aiopg](https://github.com/aio-libs/aiopg)
- [schema](https://github.com/keleshev/schema)
- [pytest and its awesome plugins](https://github.com/pytest-dev/pytest)
17 changes: 12 additions & 5 deletions examples/douban.py
@@ -1,8 +1,14 @@
# -*- coding: utf-8 -*-
from peewee import CharField
from ruia import AttrField, Item, Response, Spider, TextField
from ruia import AttrField, Item, Response, TextField

from ruia_peewee_async import RuiaPeeweeInsert, RuiaPeeweeUpdate, TargetDB, after_start
from ruia_peewee_async import (
    RuiaPeeweeInsert,
    RuiaPeeweeUpdate,
    Spider,
    TargetDB,
    after_start,
)


class DoubanItem(Item):
@@ -41,10 +47,11 @@ async def parse(self, response: Response):

# Args for RuiaPeeweeUpdate
# data: A dict that's going to be updated in the database.
# query: A peewee query or a dict to search for the target data in database.
# query: A peewee query or a dict used to look up the target record in the database.
# database: The target database type.
# create_when_not_exists: If True, will create a record when data not exists. Default is True.
# only: A list or tuple of fields that should be updated.
# create_when_not_exists: Default is True. If True, a record is created when the query finds no match.
# not_update_when_exists: Default is True. If True and the record exists, the record is left unchanged.
# only: A list or tuple naming the only fields that should be updated.


mysql = {
29 changes: 28 additions & 1 deletion poetry.lock

1 change: 1 addition & 0 deletions pyproject.toml
@@ -27,6 +27,7 @@ ruia = "^0.8.4"
peewee-async = "^0.8.0"
aiomysql = {version = "^0.1.1", optional = true}
aiopg = {version = "^1.3.4", optional = true}
schema = "^0.7.5"

[tool.poetry.dev-dependencies]
pytest = "^7.1.2"