# API Examples
The repository already includes a few API pipeline manifest definitions that showcase how to use the `rag-api-pipeline` for generating knowledge bases from REST APIs.
Each example demonstrates how to define a YAML manifest for extracting data from target API endpoints using different Authentication/Pagination strategies.
For a more in-depth review of how to build a manifest for creating a RAG pipeline for your own API, visit the [Defining the API Pipeline Manifest](/manifest-definition) section.
## Boardroom Governance API
[Boardroom](https://boardroom.io/) offers its `Boardrooms Governance API` to provide comprehensive data on 350+ DAOs across chains. It offers endpoints that fetch information about proposals, delegates, discussions, and much more. You can find the complete API documentation at this [link](https://docs.boardroom.io/docs/api/cd5e0c8aa2bc1-overview).
## Agora API

The [Agora](https://www.agora.xyz/#Product) OP API provides various endpoints to …
Check the [Agora API](/apis/agora-api) section for details on how to extract data from the API and generate a knowledge base related to RetroPGF projects and proposals within the OP collective.
## Working with Other APIs
If you are interested in working with any other API, visit the [Other API Sources](/apis/other-api-sources) section to get started.
# Agora API
This repository contains the [OpenAPI specification](https://github.com/raid-guild/gaianet-rag-api-pipeline/blob/main/config/agora_openapi.yaml) and [API pipeline manifest](https://github.com/raid-guild/gaianet-rag-api-pipeline/blob/main/config/agora_api_pipeline.yaml) needed to create a RAG pipeline.
This pipeline generates a knowledge base from RetroPGF projects and proposals within the OP collective.
## Prerequisites
To access this API, you'll need an API key. You can request one through [Agora's Discord server](https://www.agora.xyz/#Product). Run the `rag-api-pipeline setup` command to set the REST API key, or store the key directly in the `config/secrets/api-key` file. A less secure option is to provide it via the `--api-key` CLI argument.
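The file-based option above can be sketched as a short shell snippet (the key value is a placeholder, not a real credential — substitute the one you obtained from Discord):

```shell
# Create the secrets folder the pipeline reads by default and store the key.
# "YOUR_AGORA_API_KEY" is a placeholder, not a real credential.
mkdir -p config/secrets
printf '%s' "YOUR_AGORA_API_KEY" > config/secrets/api-key
# Restrict permissions so the secret is readable only by your user
chmod 600 config/secrets/api-key
```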
## Getting the Agora API OpenAPI Spec
TODO:
## Defining the RAG API Pipeline Manifest
This pipeline will extract data related to DAO proposals (`/proposals`) and RetroPGF projects (`/projects`).
Next, you can find an overview of the main sections in the API pipeline manifest.
### Basic Configuration
Since no `api_parameters` are required, this section remains empty.
```yaml [agora_api_pipeline.yaml]
api_name: "optimism_agora_api"

api_parameters:

api_config:
  request_method: "get"
  content_type: "application/json"
  response_entrypoint_field: "data"
```
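To illustrate `response_entrypoint_field`: records are unwrapped from the field named there. A hypothetical API response (shape assumed for illustration, not taken from the real Agora API) would look like:

```json
{
  "meta": { "currentPage": 0, "pageSize": 50 },
  "data": [
    { "id": "1", "title": "Example proposal" }
  ]
}
```

Here the pipeline would treat the array under `data` as the records to extract.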
### Connector Specification
The manifest then defines some metadata and the request parameters needed for making calls to the API. In this case, it only needs an `api_key`.
Below is the `requester_base` definition. The API implements a BearerAuthenticator schema and retrieves the `api_token` from the `config` object:
```yaml [agora_api_pipeline.yaml]
definition:
  requester_base:
    type: HttpRequester
    # ...
    authenticator:
      type: BearerAuthenticator
      api_token: "{{ config['api_key'] }}"
```
### Record Selection and Pagination
The API uses an Offset-based pagination strategy. The `page_size` is set to 50, while `offset` and `limit` parameters are dynamically inserted into the URL as request parameters:
```yaml [agora_api_pipeline.yaml]
definition:
  paginator: # Details at https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/pagination
    type: DefaultPaginator
    # ...
      field_name: "limit"
```
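The elided middle of the paginator typically combines an `OffsetIncrement` strategy with request options, following Airbyte's declarative conventions. This is a hedged sketch only — the exact fields may differ from the actual manifest:

```yaml
# Illustrative sketch only; see the Airbyte pagination docs linked above
paginator:
  type: DefaultPaginator
  pagination_strategy:
    type: OffsetIncrement
    page_size: 50
  page_token_option:
    type: RequestOption
    inject_into: "request_parameter"
    field_name: "offset"
  page_size_option:
    inject_into: "request_parameter"
    field_name: "limit"
```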
### Endpoint Configuration
Below are the target endpoints with their respective schemas:
```yaml [agora_api_pipeline.yaml]
endpoints:
  /proposals:
    id: "proposals"
    primary_key: "id"
    responseSchema: "#/schemas/Proposal"
    textSchema:
      $ref: "#/textSchemas/Proposal"
  /projects:
    id: "projects"
    primary_key: "id"
    responseSchema: "#/schemas/Project"
    textSchema:
      $ref: "#/textSchemas/Project"
```
## Using the RAG Pipeline to generate a Knowledge Base for the OP Collective
### RAG Pipeline CLI
1. Make sure to set up the pipeline's initial settings by running the `rag-api-pipeline setup` command.
2. Execute the following command:
```bash
poetry run rag-api-pipeline run-all config/agora_api_pipeline.yaml --openapi-spec-file config/agora_openapi.yaml --llm-provider ollama
```
After execution, you'll find the processed data and compressed knowledge base snapshot in the `output/optimism_agora_api` folder.
### Import the KB Snapshot into a Gaianet Node
1. Locate the generated snapshot in `output/optimism_agora_api/` (named `optimism_agora_api_collection-xxxxxxxxxxxxxxxx-yyyy-mm-dd-hh-mm-ss.snapshot.tar.gz`).
2. Follow the official [knowledge base selection guide](https://docs.gaianet.ai/node-guide/customize#select-a-knowledge-base).
3. Configure your node using the recommended settings from the [node deployment guide](/cli/node-deployment#recommended-gaianet-node-configuration).
# Boardroom Governance API
The repository already contains the [OpenAPI specification](https://github.com/raid-guild/gaianet-rag-api-pipeline/blob/main/config/boardroom_openapi.yaml) and the [API pipeline manifest](https://github.com/raid-guild/gaianet-rag-api-pipeline/blob/main/config/boardroom_api_pipeline.yaml) needed to create a RAG API pipeline.
This pipeline generates a knowledge base from any DAO/Protocol hosted by the Boardroom Governance API.
## Prerequisites
To use this API, you'll need an API key. Request one from [Boardroom's developer portal](https://boardroom.io/developers/billing). Run the `rag-api-pipeline setup` command to set the REST API key, or store the key directly in the `config/secrets/api-key` file. A less secure option is to provide it via the `--api-key` CLI argument.
## Getting the Boardroom API OpenAPI Spec
TODO:
## Defining the RAG API Pipeline Manifest
This pipeline will extract data related to protocol metadata (`/protocols/aave`), DAO proposals (`/protocols/aave/proposals`) and discussion posts from the Discourse forum site (`discourseTopics`, `discourseCategories` and `discourseTopicPosts`), if any.
### Basic Configuration
The manifest starts by defining the API name, parameters, and request settings. You can visit this [link](https://docs.boardroom.io/docs/api/5b445a81af241-get-all-protocols) to get the list of all DAO protocols in Boardroom. This example focuses on the [Aave Governance DAO](https://boardroom.io/aave/insights):
```yaml [boardroom_api_pipeline.yaml]
api_name: "aave_boardroom_api"

api_parameters:
  # ...

api_config:
  # ...
```
### Connector Specification
The manifest then defines some metadata and the request parameters needed for making calls to the API:
Then, the `requester_base` defines how the connector should make requests to the API. Here, an `ApiKeyAuthenticator` schema is required and gets the `api_token` value from the `config` object:
```yaml [boardroom_api_pipeline.yaml]
definitions:
  requester_base:
    type: HttpRequester
    # ...
    authenticator:
      type: ApiKeyAuthenticator
      api_token: "{{ config['api_key'] }}"
```
### Record Selection and Pagination
Data records returned by the API are always wrapped in the `data` field, while pagination is handled using a Cursor-based approach:
```yaml [boardroom_api_pipeline.yaml]
definitions:
  selector:
    type: RecordSelector
    # ...
```
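The cursor-based pagination mentioned above is not shown in this excerpt; in Airbyte's declarative style it typically looks like the following sketch (the cursor field name is an assumption, not taken from the actual manifest):

```yaml
# Illustrative sketch only — field names assumed for illustration
paginator:
  type: DefaultPaginator
  pagination_strategy:
    type: CursorPagination
    cursor_value: "{{ response.nextCursor }}"
  page_token_option:
    type: RequestOption
    inject_into: "request_parameter"
    field_name: "cursor"
```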
### Endpoint Configuration
Now it's time to define the target endpoints with their respective schemas. Below is an example for the *proposals* endpoint:
```yaml [boardroom_api_pipeline.yaml]
endpoints:
  "/protocols/{cname}/proposals":
    id: "proposals"
    # ...
```
### Schema Definitions
The `responseSchema` reference from above defines the complete *unwrapped* data schema returned by the API endpoint:
```yaml [boardroom_api_pipeline.yaml]
schemas:
  Proposals:
    type: object
    properties:
      # ... (full property list elided)
```
On the other hand, the endpoint's `textSchema` reference specifies the list of fields for text parsing. Note that all properties are also listed in the `responseSchema`.
In this case, `title`, `content`, and `summary` will be parsed as texts, while other fields will be included as metadata properties in a JSON object:
```yaml [boardroom_api_pipeline.yaml]
textSchemas:
  Proposal:
    type: object
    # ... (the title, content and summary fields are listed here)
```
### Chunking Parameters
This section sets the parameters used when applying text chunking to the extracted content:
```yaml [boardroom_api_pipeline.yaml]
chunking_params:
  mode: "elements"
  chunking_strategy: "by_title"
  # ...
  multipage_sections: true
```
## Using the RAG Pipeline to generate a Knowledge Base for Aave
### RAG Pipeline CLI
1. Make sure to set up the pipeline's initial settings by running the `rag-api-pipeline setup` command.
2. Execute the following command:
```bash
poetry run rag-api-pipeline run-all config/boardroom_api_pipeline.yaml --openapi-spec-file config/boardroom_openapi.yaml --llm-provider ollama
```
The processed data and knowledge base snapshot for Aave will be available in the `output/aave_boardroom_api` folder. You can also find a public knowledge base snapshot on [Hugging Face](https://huggingface.co/datasets/uxman/aave_snapshot_boardroom/tree/main).
### Import the KB Snapshot into a Gaianet Node
1. Locate the generated snapshot in `output/aave_boardroom_api/` (named `aave_boardroom_api_collection-xxxxxxxxxxxxxxxx-yyyy-mm-dd-hh-mm-ss.snapshot.tar.gz`) or download it from the HuggingFace link above.
2. Follow the official [knowledge base selection guide](https://docs.gaianet.ai/node-guide/customize#select-a-knowledge-base).
3. Configure your node using the recommended settings from the [node deployment guide](/cli/node-deployment#recommended-gaianet-node-configuration).
### Example user prompts
- Asking what information the RAG bot is able to provide
- Asking for information about the proposal [Enable Metis as Collateral on the Metis Chain](https://boardroom.io/aave/proposal/cHJvcG9zYWw6YWF2ZTpvbmNoYWluLXVwZ3JhZGU6MTUy)
- Asking for information about [Onboarding USDS and sUSDS to Aave v3](https://boardroom.io/aave/discussions/18987)
To generate a knowledge base for a different DAO, you just need to modify the `api_name` and `api_parameters` values in the [boardroom_api_pipeline.yaml](https://github.com/raid-guild/gaianet-rag-api-pipeline/blob/main/config/boardroom_api_pipeline.yaml) manifest file.
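For example, a hypothetical manifest tweak for Uniswap's governance would only change these values (the `cname` value is an assumption — check Boardroom's protocol list first):

```yaml
# Hypothetical tweak of boardroom_api_pipeline.yaml for another DAO;
# "uniswap" is an assumed cname, verify it against Boardroom's protocol list
api_name: "uniswap_boardroom_api"

api_parameters:
  cname: "uniswap"
```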