How to Join the Finder

This guide explains how to connect your biobank to the Finder, a federated search platform operated by BBMRI-ERIC. By following these steps, you will install a small local software component called Bunny on your own server. Bunny securely answers queries from the Finder based on data in your local OMOP database.

Who is this guide for?
This guide is written for biobank staff who may not be deep technical experts but are comfortable running commands on a server and editing configuration files. If you have a system administrator or IT contact, feel free to share this guide with them.


What You Will Install

The setup consists of three parts, all running inside Docker containers:

| Component | Purpose |
| --- | --- |
| Bunny | The bridge between your database and the Finder. It fetches queries from the Finder, runs them against your OMOP database, and returns privacy-protected results. |
| PostgreSQL | The database that holds your OMOP data. |
| OMOP-Lite | A helper service that can load OMOP vocabularies into PostgreSQL (optional but recommended during setup). |

Bunny does not send raw patient data to the Finder. It only returns aggregated, anonymised counts, with small numbers suppressed and values rounded to protect privacy.


Before You Start

1. Contact BBMRI-ERIC

Send an email to federated-platform@helpdesk.bbmri-eric.eu with your request to join the Finder. BBMRI-ERIC will:

  • Create an account for your biobank.
  • Provide you with credentials for the Finder task API.
  • Give you a Collection ID that identifies your biobank in the Finder.

Keep these details safe. You will need them in Step 2, when you create the .env file.

2. Check Your Server Requirements

You will need a Linux server (virtual or physical) with:

  • Docker installed (version 20.10 or newer recommended).
  • Docker Compose installed (the Compose plugin for Docker, or the standalone docker-compose command).
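  • Outbound network access to the Finder endpoint over HTTPS (https://finder-dev.bbmri-eric.eu). Bunny fetches queries by polling the Finder, so the connection is always opened from your server.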
  • Access to your OMOP PostgreSQL database, either:
    • already running elsewhere, or
    • running in the same Docker setup as shown below.

If your OMOP database is already managed by another team, you only need to run Bunny and point it at the existing database. In that case, you can remove the postgres and omop-lite services from the examples below.
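
The requirements above can be checked before you install anything. The following is a minimal sketch, assuming GNU `sort -V` is available (it is on most Linux distributions). The function names `version_ge`, `check_docker`, and `check_compose` are made up for this illustration and are not part of Docker or Bunny.

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on GNU "sort -V" (version sort), an assumption about your system.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether Docker is installed and at least version 20.10.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not installed"
    return 1
  fi
  docker_version="$(docker version --format '{{.Server.Version}}' 2>/dev/null)"
  if [ -z "$docker_version" ]; then
    echo "docker: installed, but the daemon did not answer"
    return 1
  fi
  if version_ge "$docker_version" "20.10"; then
    echo "docker: $docker_version (ok)"
  else
    echo "docker: $docker_version (older than the recommended 20.10)"
  fi
}

# Report which variant of Docker Compose is available.
check_compose() {
  if docker compose version >/dev/null 2>&1; then
    echo "compose: plugin (docker compose)"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "compose: standalone (docker-compose)"
  else
    echo "compose: not found"
    return 1
  fi
}

# Example: check_docker && check_compose
```

If `check_docker` reports that the daemon did not answer, your user may lack permission to talk to Docker; your system administrator can confirm this.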


Step-by-Step Installation

Step 1: Create a Project Folder

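On your server, create a folder to hold the configuration files. Creating a folder under /opt usually requires administrator rights, so prefix the first command with sudo if it fails with a permission error: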

mkdir /opt/bunny
cd /opt/bunny

All commands in this guide assume you are working from this folder.

Step 2: Create the Environment File

Create a file named .env in the project folder. This file keeps sensitive settings and version numbers separate from the main configuration.

nano .env

Copy the following content into .env and replace every value marked with REPLACE_WITH_... with the real values you received from BBMRI-ERIC or chose for your local database.

# Software versions
BUNNY_VERSION=1.7.0
POSTGRES_VERSION=16-alpine

# OMOP database settings
OMOP_DB_NAME=omop
OMOP_DB_USERNAME=postgres
# ⚠️ Change this to a strong password for production use.
OMOP_DB_PASSWORD=REPLACE_WITH_A_STRONG_PASSWORD
OMOP_DB_SCHEMA=public
POSTGRES_HOST_PORT=5432

# Finder connection settings (provided by BBMRI-ERIC)
TASK_API_BASE_URL=https://finder-dev.bbmri-eric.eu/api/api/v1
TASK_API_USERNAME=REPLACE_WITH_BBMRI_ERIC_USERNAME
TASK_API_PASSWORD=REPLACE_WITH_BBMRI_ERIC_PASSWORD

# Your biobank's identifier in the Finder (provided by BBMRI-ERIC)
COLLECTION_ID=REPLACE_WITH_BBMRI_ERIC_COLLECTION_ID

# How often Bunny checks for new queries, in seconds
POLLING_INTERVAL=5

# Privacy protection settings
LOW_NUMBER_SUPPRESSION_THRESHOLD=5
ROUNDING_TARGET=10

What these settings mean

| Setting | Explanation |
| --- | --- |
| BUNNY_VERSION | The version of Bunny to download. Only change this when upgrading. |
| POSTGRES_VERSION | The version of PostgreSQL to use for the local OMOP database. |
| OMOP_DB_NAME | The name of the OMOP database inside PostgreSQL. |
| OMOP_DB_USERNAME / OMOP_DB_PASSWORD | The username and password Bunny uses to connect to PostgreSQL. Choose a strong password if this is a production database. |
| OMOP_DB_SCHEMA | The database schema that contains your OMOP tables. The default is public. |
| POSTGRES_HOST_PORT | The port on your server that maps to PostgreSQL inside the container. 5432 is the standard PostgreSQL port. |
| TASK_API_BASE_URL | The Finder address Bunny contacts to fetch queries and return results. Use the URL provided by BBMRI-ERIC. |
| TASK_API_USERNAME / TASK_API_PASSWORD | Your Finder API credentials. Treat these like a password. |
| COLLECTION_ID | The unique identifier for your biobank/collection in the Finder. |
| POLLING_INTERVAL | How often Bunny asks the Finder for new tasks. 5 seconds is a sensible default. |
| LOW_NUMBER_SUPPRESSION_THRESHOLD | Counts smaller than this number are hidden in results to prevent re-identification. Default: 5. |
| ROUNDING_TARGET | Counts are rounded to the nearest multiple of this number before being returned. Default: 10. |
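
To get a feel for the two privacy settings, here is a small worked sketch of suppression followed by rounding. It is not Bunny's actual code: the function name `protect_count` is hypothetical, and it assumes that a suppressed count is reported as 0, that suppression is applied before rounding, and that exact halves round up. Bunny may differ on any of these points.

```shell
# protect_count COUNT THRESHOLD TARGET
# Illustrative only, NOT Bunny's actual code. Assumptions made here:
#   - a suppressed count is reported as 0
#   - suppression is applied before rounding
#   - exact halves round up
protect_count() {
  count="$1"
  threshold="$2"
  target="$3"

  # Low-number suppression: hide counts below the threshold.
  if [ "$count" -lt "$threshold" ]; then
    echo 0
    return 0
  fi

  # A target of 0 or 1 leaves the count unchanged (and avoids dividing by zero).
  if [ "$target" -le 1 ]; then
    echo "$count"
    return 0
  fi

  # Rounding: move to the nearest multiple of the target.
  echo $(( (count + target / 2) / target * target ))
}

protect_count 3 5 10     # prints 0   (below the threshold of 5)
protect_count 14 5 10    # prints 10
protect_count 127 5 10   # prints 130
```

With the defaults, a researcher therefore never sees an exact count, only a multiple of 10.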

Save the file and exit the editor (in nano: press Ctrl+X, then Y, then Enter).
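
A forgotten placeholder is an easy mistake to make at this point. The sketch below checks that no REPLACE_WITH_ value is left in the file; the helper name `env_is_complete` is made up for this guide. Restricting the file permissions afterwards is a suggestion, not something Bunny requires.

```shell
# env_is_complete FILE
# Succeeds only when FILE is readable and no REPLACE_WITH_ placeholder remains.
# Lines that still contain a placeholder are printed with their line numbers.
env_is_complete() {
  env_file="$1"
  if [ ! -r "$env_file" ]; then
    echo "cannot read $env_file" >&2
    return 2
  fi
  if grep -n 'REPLACE_WITH_' "$env_file"; then
    return 1
  fi
  return 0
}

# Typical use from the project folder: check the file, then make it
# readable and writable by its owner only.
#   env_is_complete .env && chmod 600 .env
```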

Step 3: Create the Docker Compose File

Create a file named docker-compose.yml in the same folder:

nano docker-compose.yml

Copy the following configuration into it:

name: bunny

services:
  bunny:
    image: ghcr.io/health-informatics-uon/hutch/bunny:${BUNNY_VERSION}
    container_name: bunny
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DATASOURCE_DB_USERNAME: ${OMOP_DB_USERNAME}
      DATASOURCE_DB_PASSWORD: ${OMOP_DB_PASSWORD}
      DATASOURCE_DB_DATABASE: ${OMOP_DB_NAME}
      DATASOURCE_DB_DRIVERNAME: postgresql
      DATASOURCE_DB_SCHEMA: ${OMOP_DB_SCHEMA}
      DATASOURCE_DB_PORT: 5432
      DATASOURCE_DB_HOST: postgres

      TASK_API_BASE_URL: ${TASK_API_BASE_URL}
      TASK_API_USERNAME: ${TASK_API_USERNAME}
      TASK_API_PASSWORD: ${TASK_API_PASSWORD}
      TASK_API_TYPE: a
      COLLECTION_ID: ${COLLECTION_ID}
      POLLING_INTERVAL: ${POLLING_INTERVAL}

      LOW_NUMBER_SUPPRESSION_THRESHOLD: ${LOW_NUMBER_SUPPRESSION_THRESHOLD}
      ROUNDING_TARGET: ${ROUNDING_TARGET}
      OMOP_SPECIMEN_ENABLED: "true"
    networks:
      - bunny-network
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"

  bunny-2:
    image: ghcr.io/health-informatics-uon/hutch/bunny:${BUNNY_VERSION}
    container_name: bunny-2
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DATASOURCE_DB_USERNAME: ${OMOP_DB_USERNAME}
      DATASOURCE_DB_PASSWORD: ${OMOP_DB_PASSWORD}
      DATASOURCE_DB_DATABASE: ${OMOP_DB_NAME}
      DATASOURCE_DB_DRIVERNAME: postgresql
      DATASOURCE_DB_SCHEMA: ${OMOP_DB_SCHEMA}
      DATASOURCE_DB_PORT: 5432
      DATASOURCE_DB_HOST: postgres

      TASK_API_BASE_URL: ${TASK_API_BASE_URL}
      TASK_API_USERNAME: ${TASK_API_USERNAME}
      TASK_API_PASSWORD: ${TASK_API_PASSWORD}
      TASK_API_TYPE: b
      COLLECTION_ID: ${COLLECTION_ID}
      POLLING_INTERVAL: ${POLLING_INTERVAL}

      OMOP_SPECIMEN_ENABLED: "true"
      LOW_NUMBER_SUPPRESSION_THRESHOLD: ${LOW_NUMBER_SUPPRESSION_THRESHOLD}
      ROUNDING_TARGET: ${ROUNDING_TARGET}
    networks:
      - bunny-network
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"

  omop-lite:
    image: ghcr.io/health-informatics-uon/omop-lite
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DB_HOST: postgres
      DB_USER: ${OMOP_DB_USERNAME}
      DB_PASSWORD: ${OMOP_DB_PASSWORD}
      DB_NAME: ${OMOP_DB_NAME}
    networks:
      - bunny-network
    volumes:
      - ./omop_vocab:/data

  postgres:
    image: postgres:${POSTGRES_VERSION}
    container_name: omop-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${OMOP_DB_NAME}
      POSTGRES_USER: ${OMOP_DB_USERNAME}
      POSTGRES_PASSWORD: ${OMOP_DB_PASSWORD}
    volumes:
      - omop-postgres-data:/var/lib/postgresql/data
    ports:
      - "${POSTGRES_HOST_PORT}:5432"
    networks:
      - bunny-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${OMOP_DB_USERNAME} -d ${OMOP_DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 30s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"

volumes:
  omop-postgres-data:

networks:
  bunny-network:
    driver: bridge

Save and close the file.
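
Before starting anything, you can ask Docker Compose to parse the file together with .env. This is a minimal sketch using the `docker compose config` command; the wrapper name `validate_stack` is hypothetical. It catches indentation mistakes and variables that are missing from .env, but it does not test whether the credentials are correct.

```shell
# validate_stack: parse docker-compose.yml together with .env and report problems.
validate_stack() {
  # --quiet validates without printing the resolved file,
  # which would otherwise show your passwords on screen.
  if ! docker compose config --quiet; then
    echo "docker-compose.yml or .env has a problem (see the message above)"
    return 1
  fi
  echo "Configuration is valid. Services defined:"
  docker compose config --services
}

# Example, run from the project folder: validate_stack
```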

Why are there two Bunny services?

  • bunny handles type A tasks.
  • bunny-2 handles type B tasks.

The Finder sends different kinds of queries through different channels. Running two Bunny instances lets your biobank respond to all query types in parallel. Both connect to the same OMOP database.

Step 4: Start the Services

Run the following command to download the container images and start everything:

docker compose up -d

If your system uses the older standalone Docker Compose, run:

docker-compose up -d

The first startup may take a few minutes because Docker needs to download the images.

Step 5: Check That Everything Is Running

Run:

docker compose ps

You should see all services listed as running or healthy. The bunny and bunny-2 containers wait for PostgreSQL to become healthy before they start, so they may show starting for a short time.

To view the logs for Bunny:

docker logs -f bunny

Press Ctrl+C to stop following the logs. If you see repeated connection errors, double-check the values in your .env file.
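
If you prefer to script this check, the sketch below waits until PostgreSQL reports healthy and then follows the logs of both Bunny instances. The helper name `wait_healthy` is made up for this guide. It only works for containers that define a health check, which in the configuration above is omop-db alone.

```shell
# wait_healthy CONTAINER [TIMEOUT_SECONDS]
# Polls the container's health status until it is "healthy" or the timeout expires.
wait_healthy() {
  container="$1"
  timeout="${2:-120}"
  waited=0
  status=""
  while [ "$waited" -lt "$timeout" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "$container is still '${status:-unknown}' after ${timeout}s" >&2
  return 1
}

# Example: wait for PostgreSQL, then follow both Bunny instances at once.
#   wait_healthy omop-db && docker compose logs -f bunny bunny-2
```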

Step 6: Load OMOP Vocabularies (If Needed)

If you are using the included omop-lite service and need to load OMOP vocabularies, place your vocabulary CSV files in the omop_vocab folder next to docker-compose.yml:

mkdir -p omop_vocab
# Copy your vocabulary CSV files into omop_vocab/

Then restart the omop-lite container:

docker compose restart omop-lite

If your OMOP database is managed elsewhere and already contains the vocabularies and patient data, you can skip this step.
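
A quick way to avoid restarting the loader with an empty folder is to count the CSV files first. This is only a sketch: the helper name `count_csv_files` is hypothetical, and it does not check which file names OMOP-Lite expects, only that some CSV files are present.

```shell
# count_csv_files DIR: print how many .csv files sit directly inside DIR.
# The match is case-insensitive, so CONCEPT.csv and concept.CSV both count.
count_csv_files() {
  find "$1" -maxdepth 1 -type f -iname '*.csv' | wc -l | tr -d ' '
}

# Example, run from the project folder:
#   if [ "$(count_csv_files omop_vocab)" -gt 0 ]; then
#     docker compose restart omop-lite
#   else
#     echo "omop_vocab contains no CSV files yet"
#   fi
```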


After Installation

Once Bunny is running and connected:

  1. BBMRI-ERIC will verify the connection from the Finder side.
  2. Your biobank will become searchable in the Finder.
  3. Researchers can send queries, and Bunny will automatically process them and return privacy-protected counts.

You do not need to manually approve each query. Bunny handles them automatically based on the data in your OMOP database.


Updating Bunny

When BBMRI-ERIC notifies you of a new Bunny version:

  1. Open .env and change BUNNY_VERSION to the new version number.

  2. Pull the new image and restart the services:

    docker compose pull
    docker compose up -d

  3. Check the logs to confirm the new version started correctly:

    docker logs -f bunny
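
The first step of the update can also be done without opening an editor. The sketch below rewrites only the BUNNY_VERSION line and keeps the previous file as a backup; the function name `set_bunny_version` is made up for this guide, and X.Y.Z stands for whatever version BBMRI-ERIC announces.

```shell
# set_bunny_version FILE VERSION
# Rewrites the BUNNY_VERSION line in FILE and keeps the old file as FILE.bak.
set_bunny_version() {
  env_file="$1"
  new_version="$2"
  if ! grep -q '^BUNNY_VERSION=' "$env_file"; then
    echo "no BUNNY_VERSION line in $env_file" >&2
    return 1
  fi
  # -p keeps the permissions of the original on the backup copy.
  cp -p "$env_file" "$env_file.bak"
  sed "s/^BUNNY_VERSION=.*/BUNNY_VERSION=$new_version/" "$env_file.bak" > "$env_file"
}

# Example, run from the project folder:
#   set_bunny_version .env X.Y.Z && docker compose pull && docker compose up -d
```

The backup makes it easy to return to the previous version if the new one does not start. It contains the same passwords as .env, so protect or delete it once the update is confirmed.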

Security Notes

  • Keep your .env file secure. It contains passwords and API credentials. Do not commit it to version control (for example, do not upload it to GitHub).

  • Restrict access to the server so that only authorised people can read the project folder.

  • The example above publishes PostgreSQL port 5432 on every network interface of the host. If other machines must reach the database, use a firewall to limit access to trusted IP addresses. Otherwise, bind the port to 127.0.0.1 only:

    ports:
      - "127.0.0.1:${POSTGRES_HOST_PORT}:5432"

  • Bunny returns only aggregated, anonymised results. Individual patient records never leave your server.
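
To confirm how the PostgreSQL port is actually published on a running system, you can inspect the output of `docker port`. The sketch below assumes that the command prints one address:port pair per line when it is given the container port; the helper name `loopback_only` is hypothetical.

```shell
# loopback_only: reads "docker port" output on stdin and fails as soon as it
# sees an address that is not a loopback address. Empty input also succeeds,
# because it means no port is published at all.
loopback_only() {
  while IFS= read -r binding; do
    [ -n "$binding" ] || continue
    case "$binding" in
      127.*|"[::1]:"*) ;;   # loopback address: reachable from this server only
      *) echo "exposed on $binding"; return 1 ;;
    esac
  done
  return 0
}

# Example:
#   docker port omop-db 5432 | loopback_only && echo "PostgreSQL is local only"
```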


Troubleshooting

Bunny keeps restarting

Check the logs:

docker logs -f bunny

Common causes:

  • The database credentials in .env are incorrect.
  • PostgreSQL has not finished starting. Wait a minute and check again.
  • The Finder API credentials are wrong or expired. Contact BBMRI-ERIC.
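
When a container restarts in a loop, following the live log can be confusing. The sketch below prints the restart count and the last few log lines for both Bunny instances in one go; the function name `restart_report` is made up for this guide.

```shell
# restart_report: show how often each Bunny container has restarted,
# followed by its 20 most recent log lines.
restart_report() {
  for container in bunny bunny-2; do
    restarts="$(docker inspect --format '{{.RestartCount}}' "$container" 2>/dev/null)"
    echo "== $container: ${restarts:-unknown} restart(s) =="
    docker logs --tail 20 "$container" 2>&1
  done
}

# Example: restart_report
```

A restart count that keeps growing between two runs points to a configuration problem rather than a slow start.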

PostgreSQL is not healthy

Check the PostgreSQL logs:

docker logs -f omop-db

Make sure that port 5432 (or the value you set for POSTGRES_HOST_PORT) is not already in use by another PostgreSQL instance or other service on the server.
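
One way to see what is listening on a port is the `ss` tool that ships with most Linux distributions. The sketch below filters its output for a given port; the helper name `port_listeners` is hypothetical, and it assumes the usual `ss -ltnH` column layout in which the fourth column is the local address and port.

```shell
# port_listeners PORT: reads "ss -ltnH" output on stdin, prints the lines whose
# local address ends in :PORT, and fails when there are none.
port_listeners() {
  awk -v port="$1" '
    { n = split($4, parts, ":") }
    parts[n] == port { print; found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example: stop the stack first so that Docker itself does not hold the port.
#   docker compose down
#   ss -ltnH | port_listeners 5432 && echo "port 5432 is taken by another service"
```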

Cannot reach the Finder API

From your server, test the connection:

curl -I https://finder-dev.bbmri-eric.eu

If this fails, check your firewall, proxy settings, and DNS configuration.
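
To narrow down which of these is at fault, it helps to test name resolution and the HTTPS request separately. This is a minimal sketch, assuming `getent` and `curl` are installed; the function names `url_host` and `diagnose_finder` are made up for this guide. On servers that reach the internet only through a proxy, the DNS step may fail even though the request itself would work.

```shell
# url_host URL: print the host name part of an http(s) URL.
url_host() {
  printf '%s\n' "$1" | sed -E 's#^[A-Za-z]+://([^/:]+).*#\1#'
}

# diagnose_finder URL: check DNS resolution first, then the HTTPS request.
diagnose_finder() {
  host="$(url_host "$1")"
  if ! getent hosts "$host" >/dev/null; then
    echo "DNS: cannot resolve $host"
    return 1
  fi
  # -w prints only the HTTP status code; --max-time stops curl from
  # hanging when a firewall silently drops the connection.
  code="$(curl -sS -o /dev/null --max-time 10 -w '%{http_code}' "$1")" || {
    echo "HTTPS: request to $1 failed (firewall, proxy, or TLS problem)"
    return 1
  }
  echo "HTTPS: $1 answered with status $code"
}

# Example: diagnose_finder https://finder-dev.bbmri-eric.eu
```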


Need Help?

If you get stuck at any point, contact BBMRI-ERIC:

Email: federated-platform@helpdesk.bbmri-eric.eu

Please include:

  • A description of the problem.
  • The output of docker compose ps.
  • Relevant log excerpts (remove any passwords before sharing).
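
Collecting this information can be scripted. The sketch below masks the value of any setting whose name contains PASSWORD before the output is saved; the helper name `redact` and the file name support-report.txt are made up for this guide. It does not catch every possible secret (for example, a password embedded in a URL), so read the report once before sending it.

```shell
# redact: mask the value of any NAME=value or NAME: value pair
# whose name contains PASSWORD.
redact() {
  sed -E 's/([A-Za-z_]*PASSWORD[A-Za-z_]*[=:][[:space:]]*)[^[:space:]]+/\1***REDACTED***/g'
}

# Example: collect the service list and recent Bunny logs into one file.
#   { docker compose ps; docker logs --tail 100 bunny 2>&1; } | redact > support-report.txt
```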
