In this example, we use a CocoIndex Custom Source to define a source that fetches recent HackerNews content by calling the HackerNews API. We build an index for HackerNews threads and their comments, and provide a lightweight query handler to search by keyword.
We'd appreciate a star ⭐ at CocoIndex GitHub if this is helpful.
- We define a custom source connector, HackerNews, to get recent HackerNews threads by calling the HackerNews API.
- We build an index for HackerNews threads and their comments.
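To make the "calling the HackerNews API" step concrete, here is a minimal sketch of fetching and parsing recent threads. It assumes the public Algolia-hosted HackerNews search endpoint and its `hits` / `objectID` response fields; the connector in this example may use a different endpoint or schema. The `Thread` class and the function names are hypothetical and only serve the illustration.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Algolia-hosted HackerNews search API. The example's actual
# connector may call a different endpoint.
SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"


@dataclass
class Thread:
    """Hypothetical row shape for one HackerNews thread."""
    thread_id: str
    title: str
    author: str
    created_at: str


def build_search_url(max_results: int = 20) -> str:
    """Build the URL that lists the most recent stories."""
    query = urlencode({"tags": "story", "hitsPerPage": max_results})
    return f"{SEARCH_URL}?{query}"


def parse_threads(payload: dict) -> list[Thread]:
    """Turn a decoded search response into Thread rows."""
    threads = []
    for hit in payload.get("hits", []):
        threads.append(
            Thread(
                thread_id=str(hit["objectID"]),
                title=hit.get("title") or "",
                author=hit.get("author") or "",
                created_at=hit.get("created_at") or "",
            )
        )
    return threads


def fetch_recent_threads(max_results: int = 20) -> list[Thread]:
    """Call the API over HTTP and parse the result (requires network access)."""
    with urlopen(build_search_url(max_results), timeout=10) as resp:
        return parse_threads(json.load(resp))
```

Keeping the parsing separate from the HTTP call means the row mapping can be checked against a saved response without hitting the network.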
Install Postgres if you don't have one.
Install dependencies:
    pip install -e .

Update the target:
    cocoindex update main

Each time you run the update command, CocoIndex re-processes only the threads that have changed and keeps the target in sync with the 500 most recent threads from HackerNews.
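The idea of re-processing only changed threads can be sketched in a few lines. This is not CocoIndex's internal mechanism, just an illustration of the general technique under the assumption that each thread is compared by a content fingerprint; `fingerprint` and `plan_update` are hypothetical names.

```python
import hashlib
import json


def fingerprint(row: dict) -> str:
    """Stable hash of a row's content; key order does not affect the result."""
    canonical = json.dumps(row, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_update(source_rows: dict, seen: dict) -> tuple:
    """Compare current source rows against fingerprints from the last run.

    source_rows maps thread id -> row content.
    seen maps thread id -> fingerprint recorded on the previous run.
    Returns (ids to process, ids to delete from the target), both sorted.
    """
    to_process = [
        key for key, row in source_rows.items()
        if seen.get(key) != fingerprint(row)
    ]
    to_delete = [key for key in seen if key not in source_rows]
    return sorted(to_process), sorted(to_delete)
```

New and modified threads land in the first list, while threads that dropped out of the recent window land in the second, so the target stays aligned with the source without a full rebuild.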
You can also run the update command in live mode, which keeps the target continuously in sync with the source:
    cocoindex update -L main

I used CocoInsight (free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline. It connects to your local CocoIndex server with zero pipeline data retention. Run the following command to start CocoInsight:
    cocoindex server -ci main

Then open the CocoInsight UI at https://cocoindex.io/cocoinsight.