@nrbrd nrbrd commented Jul 27, 2016

PR Overview

This PR adds the selector message. It depends on #5, so the older commits can be ignored when reviewing this PR (I'll cherry-pick the new ones later).

Example

I implemented an example to show the advantages of using selector_response. You can read it here: https://gist.github.com/aron-bordin/97ca4233b5a304cd1466c5322b358cc6
It's the same spider developed here: http://gsoc2016.readthedocs.io/en/latest/quickstart.html#source-code but it doesn't require any HTML processor.

You pass the CSS/XPath filter in the request: https://gist.github.com/aron-bordin/97ca4233b5a304cd1466c5322b358cc6#file-dmoz-py-L21 and then use the extracted data directly from the response: https://gist.github.com/aron-bordin/97ca4233b5a304cd1466c5322b358cc6#file-dmoz-py-L34
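
To illustrate the flow without opening the gist, here is a rough sketch of the idea: the spider sends a JSON message carrying the selector, and the response already contains the extracted strings. Every field name and message type below (`selector_request`, `selector_type`, `response_selector`, and so on), as well as the URL, is a placeholder I made up for illustration; the actual names are the ones in the protocol docs linked below.

```python
import json


def build_selector_request(request_id, url, selector, selector_type="css"):
    """Build a request message that carries a selector.

    NOTE: all field names are hypothetical placeholders, not the real protocol.
    """
    if selector_type not in ("css", "xpath"):
        raise ValueError("selector_type must be 'css' or 'xpath'")
    return {
        "type": "selector_request",      # hypothetical message type
        "id": request_id,
        "url": url,
        "selector": selector,            # the CSS/XPath filter
        "selector_type": selector_type,  # hypothetical field
    }


def extract_items(response_line):
    """Parse one JSON response line and return the pre-extracted strings."""
    message = json.loads(response_line)
    # Hypothetical: the extracted data arrives as a list under "selector".
    return message.get("selector", [])


# The spider would write this line to stdout, one message per line.
request = build_selector_request("links", "http://example.com/", "a::attr(href)")
line = json.dumps(request)

# Simulated response, standing in for what would be read from stdin.
fake_response = json.dumps(
    {"type": "response_selector", "id": "links", "selector": ["/a", "/b"]}
)
items = extract_items(fake_response)
```

The point of the sketch is that the spider never touches the HTML: it only builds the request and iterates over the returned list.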

Docs:

You can read about the request here: http://gsoc2016.readthedocs.io/en/latest/protocol.html#selector-request
The response is documented here: http://gsoc2016.readthedocs.io/en/latest/protocol.html#response-selector

WIP

This PR still needs tests, and the feature still needs to be implemented in the helper packages. Please let me know if this is a good feature. If so, I'll write the tests and add it to the helper packages.

cc: @eLRuLL , @rolando
