58 changes: 22 additions & 36 deletions .github/workflows/workflow.yml
@@ -14,23 +14,28 @@ on:
- 'feature/**'
## Any pull request. Yes the syntax looks weird
pull_request:
workflow_dispatch:

jobs:
# Only run one instance of this workflow at a time per branch.
concurrency:
group: ${{ github.ref_name }}-${{ github.workflow }}
cancel-in-progress: true

jobs:

test:
name: Test the loader on ${{matrix.operating-system}}
runs-on: ${{ matrix.operating-system }}
strategy:
matrix:
operating-system: [ubuntu-latest, windows-latest, macOS-latest]
operating-system: [ubuntu-latest, macOS-latest, windows-latest]

steps:
- name: Get the code
uses: actions/checkout@v3
uses: actions/checkout@v4

- name: Set up node
uses: actions/setup-node@v3
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
cache: 'npm'
@@ -48,9 +53,9 @@ jobs:
CI: true

- name: Archive test artifacts
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v4
with:
name: test-results
name: test-results-${{ matrix.operating-system }}
path: coverage


@@ -73,10 +78,10 @@ jobs:
steps:

- name: Get the code
uses: actions/checkout@v3
uses: actions/checkout@v4

- name: Set up node
uses: actions/setup-node@v3
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
cache: 'npm'
@@ -98,25 +103,20 @@ jobs:
{
\"pipeline\": {
\"source\": {
\"module\": \"./lib/sources/github-resource-source\",
\"module\": \"./lib/sources/filesystem-resource-source\",
\"config\": {
\"repoUrl\": \"https://github.com/nciocpl/r4r-content\",
\"branchName\": \"integration-testing\",
\"resourcesPath\": \"/resources\",
\"authentication\": {
\"token\": \"${GITHUB_TOKEN}\"
}
\"resourcesPath\": \"./integration-tests/data/resources\"
}
},
\"transformers\": [
{
\"module\": \"./lib/transformers/netlifymd-resource-transformer\",
\"config\": {
\"mappingUrls\": {
\"docs\": \"https://raw.githubusercontent.com/NCIOCPL/r4r-content/integration-testing/data/docs.json\",
\"researchAreas\": \"https://raw.githubusercontent.com/nciocpl/r4r-content/integration-testing/data/researchAreas.json\",
\"researchTypes\": \"https://raw.githubusercontent.com/nciocpl/r4r-content/integration-testing/data/researchTypes.json\",
\"toolTypes\": \"https://raw.githubusercontent.com/nciocpl/r4r-content/integration-testing/data/toolTypes.json\"
\"mappingFiles\": {
\"docs\": \"./integration-tests/data/data/docs.json\",
\"researchAreas\": \"./integration-tests/data/data/researchAreas.json\",
\"researchTypes\": \"./integration-tests/data/data/researchTypes.json\",
\"toolTypes\": \"./integration-tests/data/data/toolTypes.json\"
}
}
}
@@ -145,24 +145,10 @@ jobs:
## Variable is picked up by karate-config.js
export KARATE_ESHOST="http://localhost:${{ job.services.elasticsearch.ports[9200] }}"
cd integration-tests && ./bin/karate ./features
## Store the exit code off so we can pass this step and
## capture the test output in the next step, but still
## fail the entire job
echo "TEST_EXIT_CODE=$?" >> $GITHUB_ENV
exit 0

- name: Upload Integration test results
uses: actions/upload-artifact@v1
if: always()
uses: actions/upload-artifact@v4
with:
name: integration-test-results
path: integration-tests/target

- name: Fail build on bad tests
run: |
## Check if we had errors on the test step, and if so, fail the job
if [ $TEST_EXIT_CODE -ne 0 ]; then
echo "Tests Failed -- See Run Integration Test step or integration-test-results artifact for more information"
exit $TEST_EXIT_CODE
else
echo "Tests passed"
fi
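The hand-escaped JSON in the workflow step above is hard to read and easy to break. Assuming it is handed to the loader through the `NODE_CONFIG` environment variable, as `integration-tests/bin/load-integration-data.sh` does in this same change, a quoted heredoc is one way to write the same override without backslashes. This is only a sketch, not part of the change, and it shows just the `source` section; the transformer and loader sections would follow the same pattern.

```shell
# Sketch only: build NODE_CONFIG with a quoted heredoc so the JSON needs
# no backslash escaping. The quoted delimiter ('JSON') disables variable
# expansion, so the text is passed through verbatim.
NODE_CONFIG="$(cat <<'JSON'
{
  "pipeline": {
    "source": {
      "module": "./lib/sources/filesystem-resource-source",
      "config": {
        "resourcesPath": "./integration-tests/data/resources"
      }
    }
  }
}
JSON
)"
export NODE_CONFIG
```

If a value such as the Elasticsearch host has to be interpolated, the delimiter would need to be unquoted (`<<JSON`), at which point `$` characters in the JSON are expanded by the shell.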
2 changes: 1 addition & 1 deletion .nvmrc
@@ -1 +1 @@
lts/gallium
lts/iron
46 changes: 24 additions & 22 deletions README.md
@@ -2,7 +2,7 @@
Resources for Researchers Prototype Importer

## Requirements
* Node 8
* Node 20

## Running loader
1. Clone this repo
@@ -12,25 +12,21 @@ Resources for Researchers Prototype Importer
## Configuration Information
The configuration file is based on the https://github.com/NCIOCPL/loader-pipeline library. For the R4R Loader we have implemented the following pipeline steps:
### Source
* **GithubResourceSource** - This class pulls the content from a Github Repository
* **FileSystemResourceSource** - This class pulls the content from the local filesystem.
* Input: N/A
* Output: This returns an array of the *fetched* documents in git.
* Output: This returns an array of the raw documents.
* Configuration:
* `repoUrl` : (required) The git repo where the content resides.
* `resourcesPath`: (required) The path within the repo to the resources.
* `branchName` : (default: master) The branch to use.
* `authentication` : (optional) Git authentication configuration. We use token authentication (See https://octokit.github.io/rest.js/v18#authentication).
If no authentication is defined, then the source will use the public API, which has rate-limits based on IP source address.
* `resourcesPath`: (required) The path on the filesystem to the resource files.
### Transformers:
* **NetlifyMDResourceTransformer** - Transforms documents in Markdown with YML Front-matter format conforming to the r4r-content schema (https://github.com/NCIOCPL/r4r-content/blob/master/admin/config.yml) the
* Input: An array of documents that follow the r4r-content schema
* Output: A single record in a format expected by the ElasticResourceLoader
* Configuration:
* `mappingUrls` - (required) An object that contains the following properties:
* `docs` - (required) The URL to the docs taxonomy
* `researchAreas` - (required) The URL to the research areas taxonomy
* `researchTypes` - (required) The URL to the research types taxonomy
* `toolTypes` - (required) The URL to the tool types taxonomy
* `docs` - (required) The path to the docs taxonomy file.
* `researchAreas` - (required) The path to the research areas taxonomy file.
* `researchTypes` - (required) The path to the research types taxonomy file.
* `toolTypes` - (required) The path to the tool types taxonomy file.
### Loader:
* **ElasticResourceLoader** - Loads all records into an Elasticsearch index matching the format of \<aliasName\>\_YYYYMMDD\_HHMMSS. Upon successful completion
* Input: Documents in a format matching the elasticsearch mapping
@@ -54,7 +50,7 @@ The configuration file is based on the https://github.com/NCIOCPL/loader-pipelin
1. Install Coverage Gutters extension
2. Install Jest extension
4. Setup a local configuration
1. create a local.json file in the <importer_root>/config directory
1. create a local.json file in the `<importer_root>/config` directory
2. This file is used to override the default.json options and should look something like:
```
{
@@ -63,22 +59,28 @@ The configuration file is based on the https://github.com/NCIOCPL/loader-pipelin
},
"pipeline": {
"source": {
"module": "./lib/sources/github-resource-source",
"module": "./lib/sources/filesystem-resource-source",
"config": {
"repoUrl": "https://github.com/nciocpl/r4rcontent",
"resourcesPath": "/resources",
"branchName": "<YOUR_BRANCH>",
"authentication": {
"type": "token",
"token": "<YOUR_AUTH_TOKEN>"
}
"resourcesPath": "<PATH TO RESOURCE FILES>",
}
},
"transformers": [
{
"module": "./lib/transformers/netlifymd-resource-transformer",
"config": {
"mappingFiles": {
"docs": "<PATH TO DOCS MAPPING FILE>",
"researchAreas": "<PATH TO researchAreas MAPPING FILE>",
"researchTypes": "<PATH TO researchTypes MAPPING FILE>",
"toolTypes": "<PATH TO toolTypes MAPPING FILE>"
}
}
}
],
"loader": {
"module": "lib/loaders/elastic-resource-loader",
"config": {
"eshosts": [ "<THE REAL DEV SERVER>" ],
//"eshosts": [ "http://localhost:9200" ],
"daysToKeep": 10,
"aliasName": "r4r_v1",
"mappingPath": "es-mappings/mappings.json",
18 changes: 7 additions & 11 deletions config/default.json
@@ -4,24 +4,20 @@
},
"pipeline": {
"source": {
"module": "./lib/sources/github-resource-source",
"module": "./lib/sources/filesystem-resource-source",
"config": {
"repoUrl": "https://github.com/nciocpl/r4r-content",
"resourcesPath": "/resources",
//"authentication": {
// "token": "SECRET"
//}
"resourcesPath": "../resources"
}
},
"transformers": [
{
"module": "./lib/transformers/netlifymd-resource-transformer",
"config": {
"mappingUrls": {
"docs": "https://raw.githubusercontent.com/nciocpl/r4r-content/master/data/docs.json",
"researchAreas": "https://raw.githubusercontent.com/nciocpl/r4r-content/master/data/researchAreas.json",
"researchTypes": "https://raw.githubusercontent.com/nciocpl/r4r-content/master/data/researchTypes.json",
"toolTypes": "https://raw.githubusercontent.com/nciocpl/r4r-content/master/data/toolTypes.json"
"mappingFiles": {
"docs": "../mapping/docs.json",
"researchAreas": "../mapping/researchAreas.json",
"researchTypes": "../mapping/researchTypes.json",
"toolTypes": "../mapping/toolTypes.json"
}
}
}
45 changes: 0 additions & 45 deletions config/default.test

This file was deleted.

82 changes: 82 additions & 0 deletions integration-tests/bin/load-integration-data.sh
@@ -0,0 +1,82 @@
#!/bin/bash

set -e

## Use the Environment var or the default
if [[ -z "${ELASTIC_SEARCH_HOST}" ]]; then
ELASTIC_HOST="http://localhost:9200"
else
ELASTIC_HOST=${ELASTIC_SEARCH_HOST}
fi

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

## Wait until docker is up.
echo "Waiting for ES Service at ${ELASTIC_HOST} to Start"
until $(curl --output /dev/null --silent --head --fail "${ELASTIC_HOST}"); do
printf '.'
sleep 1
done
echo "ES Service is up"

# First wait for ES to start...
response=$(curl --write-out %{http_code} --silent --output /dev/null "${ELASTIC_HOST}")

until [ "$response" = "200" ]; do
response=$(curl --write-out %{http_code} --silent --output /dev/null "${ELASTIC_HOST}")
>&2 echo "Elastic Search is unavailable - sleeping"
sleep 1
done

# next wait for ES status to turn to Green
health_check="curl -fsSL ${ELASTIC_HOST}/_cat/health?h=status"
health=$(eval $health_check)
echo "Waiting for ES status to be ready"
until [[ "$health" = 'green' ]]; do
>&2 echo "Elastic Search is unavailable - sleeping"
sleep 10
health=$(eval $health_check)
done
echo "ES status is green"

pushd "$(dirname "$(dirname "$DIR")")"
echo "Load the index mapping and data"


export NODE_CONFIG=" \
{ \
\"pipeline\": { \
\"source\": { \
\"module\": \"./lib/sources/filesystem-resource-source\", \
\"config\": { \
\"resourcesPath\": \"./integration-tests/data/resources\" \
} \
}, \
\"transformers\": [ \
{ \
\"module\": \"./lib/transformers/netlifymd-resource-transformer\", \
\"config\": { \
\"mappingFiles\": { \
\"docs\": \"./integration-tests/data/data/docs.json\", \
\"researchAreas\": \"./integration-tests/data/data/researchAreas.json\", \
\"researchTypes\": \"./integration-tests/data/data/researchTypes.json\", \
\"toolTypes\": \"./integration-tests/data/data/toolTypes.json\" \
} \
} \
} \
], \
\"loader\": { \
\"module\": \"./lib/loaders/elastic-resource-loader\", \
\"config\": { \
\"eshosts\": [ \"${ELASTIC_HOST}\" ], \
\"daysToKeep\": 10, \
\"aliasName\": \"r4r_v1\", \
\"mappingPath\": \"es-mappings/mappings.json\", \
\"settingsPath\": \"es-mappings/settings.json\" \
} \
} \
} \
} \
"
node index.js
popd
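The polling loops in this script retry until Elasticsearch answers, so a service that never comes up keeps the job waiting until the runner times out. A bounded variant could look like the sketch below. The helper names `wait_for` and `es_is_green` are hypothetical and not part of this change; the health check reuses the `_cat/health?h=status` endpoint the script already calls.

```shell
#!/bin/bash
# Hypothetical helper (not part of this change): retry a command a bounded
# number of times instead of looping forever.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical check: succeeds only when the cluster reports green health.
es_is_green() {
  local host="$1"
  [ "$(curl -fsS "${host}/_cat/health?h=status" 2>/dev/null | tr -d '[:space:]')" = "green" ]
}

# Example call, commented out so the sketch has no side effects:
# wait_for 60 5 es_is_green "${ELASTIC_SEARCH_HOST:-http://localhost:9200}" || exit 1
```

With `set -e` in effect, as in the script above, calling `wait_for` with `|| exit 1` keeps the failure explicit rather than relying on the implicit exit.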
1 change: 1 addition & 0 deletions integration-tests/data/README.md
@@ -0,0 +1 @@
# r4rcontent
17 changes: 17 additions & 0 deletions integration-tests/data/_prose.yml
@@ -0,0 +1,17 @@
prose:
rooturl: '_resources'
#siteurl: 'http://example.org/r4r'
ignore:
- README.md
- _prose.yml
metadata:
_resources:
- name: 'id'
field:
element: number
type: number
- name: 'title'
field:
element: "text"
help: "Enter the title of the resource"
type: "text"