4TUResearchData · leilaicruz · Sep 15, 2025 · Sep 15, 2025
diff --git a/README.md b/README.md
@@ -0,0 +1,174 @@
+# FAIR Data Fund — Developer Overview
+
+This repository contains a small Python web application that powers the FAIR Data Fund application and review workflow. It’s a classic WSGI app built on **Werkzeug** and **Jinja2**, with data stored in and read from a **SPARQL 1.1 endpoint** via **RDFLib**.
+
+## Tech stack (at a glance)
+
+- **Language**: Python 3.8+
+- **Web**: Werkzeug (WSGI), Jinja2 templates, static assets (HTML/CSS/JS, jQuery, Quill, Dropzone)
+- **Data**: RDF and SPARQL via `rdflib` (uses `SPARQLStore`); defaults to a Virtuoso-style endpoint at `http://127.0.0.1:8890/sparql`
+- **Auth**: Optional SAML settings are supported via XML config (no external SAML lib; parsed & validated with `defusedxml`)
+- **Email**: SMTP (configurable via XML)
+- **Build tooling**: GNU Autotools (`configure.ac`, `Makefile.am`) is used to generate `pyproject.toml` and Makefiles; packaging via `setuptools`
+- **CLI entry point**: `fair-data-fund` → `fair_data_fund.ui:main`
+- **Key deps** (from `requirements.txt`): Jinja2, rdflib, requests, urllib3, Werkzeug, defusedxml
+
+## Project structure
+
+```
+fair-data-fund/
+├─ configure.ac                # Autotools: defines Python version & outputs Makefiles/pyproject
+├─ Makefile.am                 # Autotools: top-level build/dist instructions
+├─ pyproject.toml.in           # Template → becomes pyproject.toml after ./configure
+├─ requirements.txt            # Runtime dependencies
+└─ src/
+   ├─ Makefile.am              # Autotools: package layout & installables
+   └─ fair_data_fund/
+      ├─ *.py                  # Application modules (wsgi, ui, database, email, validator, rdf, ...)
+      └─ resources/
+         ├─ html_templates/    # Jinja2 templates (home, application form, review UI, etc.)
+         ├─ static/            # CSS, JS (jquery, quill, dropzone), fonts, images
+         └─ sparql_templates/  # Jinja2-templated SPARQL queries
+```
+
+### Notable modules
+
+- `fair_data_fund.wsgi`: WSGI app; routing via `werkzeug.routing.Map/Rule`; static files via `SharedDataMiddleware`.
+- `fair_data_fund.ui`: CLI entry; starts the dev server with `werkzeug.serving`. Handles `--config-file` and `--initialize`.
+- `fair_data_fund.database`: SPARQL client built on `rdflib.plugins.stores.sparqlstore.SPARQLStore`. Uses Jinja2-rendered SPARQL templates. Default endpoint: `http://127.0.0.1:8890/sparql`. SPARQL Update can be configured separately.
+- `fair_data_fund.validator`, `formatter`, `email_handler`, `rdf`, `cache`, `convenience`: support logic.
+
+## Prerequisites
+
+- Python **3.8+**
+- A **SPARQL 1.1 endpoint** with update enabled (e.g. Virtuoso at `:8890` or Jena Fuseki). You can run one locally using Docker; set the endpoint URLs in the XML config below.
+- (Optional) SMTP account for outgoing email notifications.
+- (Optional) SAML Identity Provider metadata if you want SSO in dev.
+
+## Local setup (fast path for development)
+
+You can run the app directly from source without a full Autotools build:
+
+```bash
+# 1) Create and activate a virtual environment
+python3 -m venv .venv
+source .venv/bin/activate
+
+# 2) Install Python dependencies
+pip install --upgrade pip
+pip install -r requirements.txt
+
+# 3) Create a config file (see example below), then start the server
+PYTHONPATH=src python -m fair_data_fund.ui --config-file ./dev-config.xml --initialize
+# or, after installation via setuptools (see "Autotools build" below):
+# fair-data-fund --config-file ./dev-config.xml --initialize
+```
+
+This starts a dev server (defaults to `127.0.0.1:8080`). The `--initialize` flag will populate the RDF store with default triples (institutions, etc.).
+
+> **Tip:** The package lives under `src/`, so `python -m fair_data_fund.ui` only works from the repo root when `src/` is importable. If you hit `ModuleNotFoundError`, either set `PYTHONPATH=src` or run `pip install -e .` after generating `pyproject.toml` (see below).
+
+## Minimal config (XML)
+
+Create a file `dev-config.xml` in the repo root with something like:
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<fair-data-fund>
+  <!-- Web server -->
+  <address>127.0.0.1</address>
+  <port>8080</port>
+  <base-url>http://127.0.0.1:8080</base-url>
+  <debug-mode>1</debug-mode>
+  <live-reload>1</live-reload>
+
+  <!-- Caching (optional) -->
+  <cache-root clear-on-start="1">.cache</cache-root>
+
+  <!-- Storage & RDF/SPARQL -->
+  <storage-root>.fdf_storage</storage-root>
+  <rdf-store>
+    <state-graph>default://graph</state-graph>
+    <sparql-uri>http://127.0.0.1:8890/sparql</sparql-uri>
+    <sparql-update-uri>http://127.0.0.1:8890/sparql</sparql-update-uri>
+  </rdf-store>
+
+  <!-- Email (optional) -->
+  <email>
+    <server>smtp.example.com</server>
+    <port>587</port>
+    <username>user</username>
+    <password>pass</password>
+    <from>[email protected]</from>
+    <subject-prefix>[FDF]</subject-prefix>
+    <starttls>1</starttls>
+  </email>
+
+  <!-- SAML SSO (optional) -->
+  <saml>
+    <entity-id>https://fdf.local/</entity-id>
+    <certificate-file>/path/to/sp.pem</certificate-file>
+    <private-key-file>/path/to/sp.key</private-key-file>
+
+    <identity-provider>
+      <entity-id>https://idp.example.org/</entity-id>
+      <x509-certificate>...PEM...</x509-certificate>
+      <single-signon-service>
+        <url>https://idp.example.org/sso</url>
+        <binding>urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect</binding>
+      </single-signon-service>
+    </identity-provider>
+  </saml>
+</fair-data-fund>
+```
+
+Only the SPARQL settings are required for a functional dev run; email & SAML are optional.
+
+## Running a local SPARQL endpoint (example)
+
+**Virtuoso (quick start):**
+```bash
+docker run -it --name virtuoso -p 8890:8890 -e DBA_PASSWORD=dba openlink/virtuoso-opensource-7:latest
+# After it starts, enable SPARQL Update (for example, in isql: GRANT SPARQL_UPDATE TO "SPARQL";) and use the endpoints in dev-config.xml.
+```
+
+**Jena Fuseki (alternative):**
+```bash
+docker run -it --name fuseki -p 3030:3030 stain/jena-fuseki
+# Create a dataset in the UI, then set:
+#   <sparql-uri>http://127.0.0.1:3030/your-dataset/sparql</sparql-uri>
+#   <sparql-update-uri>http://127.0.0.1:3030/your-dataset/update</sparql-update-uri>
+```
+
+## Autotools build (optional, for packaging)
+
+This repo uses GNU Autotools to generate the packaging metadata (`pyproject.toml`) and Makefiles. If you want to build & install the package the “classic” way, do:
+
+```bash
+# Install autotools if needed (autoconf, automake, libtool)
+autoreconf -i         # generates ./configure and friends
+./configure           # writes Makefile(s) and pyproject.toml from pyproject.toml.in
+make                  # builds dist artifacts (and assembles extra resources)
+pip install .         # installs the package (exposes the 'fair-data-fund' CLI)
+# (Optional) make distcheck / make install
+```
+
+Once installed:
+```bash
+fair-data-fund --config-file ./dev-config.xml --initialize
+```
+
+## Common issues
+
+- **SPARQL endpoint not reachable**: The app will log failures if it cannot connect. Check `rdf-store` URIs, port mappings, and that SPARQL Update is enabled.
+- **Missing fonts/static**: Static files are served by the WSGI app via `SharedDataMiddleware`; run from an installed layout or from the source tree so that resource paths resolve.
+- **SAML misconfiguration**: Start without SAML in dev; add it later with correct IdP metadata.
+- **Email sending fails**: Verify SMTP credentials; set `<starttls>1</starttls>` only if your SMTP server supports it.
+
+## License
+
+AGPL-3.0-or-later (see `LICENSE` if included in your dist).
+
+---
+
+_This README was generated from the uploaded repository contents to give you a working starting point._
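Note on the README above: the "Minimal config (XML)" and "Common issues" sections suggest a quick pre-flight check before starting the app. The following is a minimal sketch, not the application's own loader. It assumes the element names shown in the README's example config; `read_rdf_store`, `ask_url` and `endpoint_reachable` are hypothetical helper names, and the standard-library parser is used here only to keep the example dependency-free (the README states that the app itself relies on `defusedxml`).

```python
import urllib.error
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET  # stand-in; the README says the app uses defusedxml

# Trimmed copy of the README's example config; only <rdf-store> matters here.
EXAMPLE_CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<fair-data-fund>
  <rdf-store>
    <state-graph>default://graph</state-graph>
    <sparql-uri>http://127.0.0.1:8890/sparql</sparql-uri>
    <sparql-update-uri>http://127.0.0.1:8890/sparql</sparql-update-uri>
  </rdf-store>
</fair-data-fund>
"""

PROBE_QUERY = "ASK { ?s ?p ?o }"  # cheap query; any valid answer means "reachable"


def read_rdf_store(xml_text):
    """Return the <rdf-store> settings as a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    store = root.find("rdf-store")
    if store is None:
        raise ValueError("config has no <rdf-store> element")
    settings = {}
    for tag in ("state-graph", "sparql-uri", "sparql-update-uri"):
        settings[tag] = (store.findtext(tag) or "").strip()
    if not settings["sparql-uri"]:
        raise ValueError("<sparql-uri> is required")
    return settings


def ask_url(sparql_uri):
    """Build a GET URL that sends the probe query to the endpoint."""
    return sparql_uri + "?" + urllib.parse.urlencode({"query": PROBE_QUERY})


def endpoint_reachable(sparql_uri, timeout=3.0):
    """Return True if the endpoint answers the probe query with HTTP 200."""
    request = urllib.request.Request(
        ask_url(sparql_uri),
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Calling `endpoint_reachable(read_rdf_store(EXAMPLE_CONFIG)["sparql-uri"])` returns `False` when nothing is listening on the configured port, which points at a port-mapping or URI problem rather than an application bug. The probe only exercises the query endpoint; it does not confirm that SPARQL Update is enabled.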
diff --git a/src/fair_data_fund/resources/html_templates/application-form.html b/src/fair_data_fund/resources/html_templates/application-form.html
@@ -50,7 +50,14 @@ <h2>Basic information</h2>
 <label for="datatype">Data types handled in the project</label><div class="fas fa-question-circle help-icon"><span class="help-text">For example: Quantitative data, qualitative data, research software.</span></div>
 <input type="text" id="datatype" name="datatype" value="{{application.datatype}}" />
 
-<label for="size">What is the volume (in gigabytes) of the total dataset?</label>&nbsp;<span class="required-field">&#8727;</span>
+<label for="size">
+  What is the volume (in gigabytes) of the total dataset? 
+  <br>
+  <small style="font-weight: normal; color: #555;">
+    💡 Every user is initially assigned 10 GB of storage space.
+    To upload a larger dataset, you can request more storage space via the repository website.
+  </small>
+</label>&nbsp;<span class="required-field">&#8727;</span>
 <input type="text" id="size" name="size" value="{{application.size}}" />
 
 <label>Briefly describe the content of the data and file formats before obtaining funding</label><div class="fas fa-question-circle help-icon"><span class="help-text">Include a description of the data (raw, processed, analyzed) and research software as these are currently available. In later questions, you can elaborate on how the FAIR principles will be implemented. </span></div>
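Note on the hunk above: the new hint mentions the 10 GB initial allocation, but `size` is still a free-text input, so the server side has to interpret whatever the applicant types. The sketch below shows one way that could look. It is an illustration only: `parse_size_gb` and `needs_extra_storage` are hypothetical names, not functions from `fair_data_fund.validator`, and the 10 GB threshold is taken from the hint text.

```python
from decimal import Decimal, InvalidOperation

DEFAULT_QUOTA_GB = Decimal("10")  # figure quoted in the form hint


def parse_size_gb(raw):
    """Parse the free-text size field as a non-negative number of gigabytes.

    Returns a Decimal, or None when the text is not a usable number.
    """
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return None
    # Reject NaN, infinities, and negative volumes.
    if not value.is_finite() or value < 0:
        return None
    return value


def needs_extra_storage(raw):
    """True when the stated volume exceeds the initial allocation."""
    size = parse_size_gb(raw)
    return size is not None and size > DEFAULT_QUOTA_GB
```

A handler could use `needs_extra_storage(form_value)` to decide whether to remind the applicant to request more storage; empty or non-numeric input yields `False` here and would be reported by the required-field check instead.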