Small Express.js API in TypeScript that scrapes the public UA canteen CMS and exposes normalized menu data.
- Fetches https://cms.ua.pt/ementas/ementas
- Parses canteen tables into a typed API
- Normalizes lunch/dinner rows, weekend split rows, empty rows, and `Encerrado` entries
- Repairs some malformed CMS headers, including broken years when they can be inferred
- Caches the full scrape in memory for 10 minutes by default
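The caching behaviour can be pictured as a small TTL wrapper. The sketch below is illustrative only; `TtlCache` and its methods are hypothetical names, not the project's actual module:

```typescript
// Minimal in-memory TTL cache sketch (illustrative; the real module may differ).
type Entry<T> = { value: T; storedAt: number };

class TtlCache<T> {
  private entry: Entry<T> | null = null;

  // Defaults to 10 minutes, matching CACHE_TTL_MS=600000.
  constructor(private ttlMs: number = 600_000) {}

  set(value: T): void {
    this.entry = { value, storedAt: Date.now() };
  }

  // Returns the cached value while it is fresher than ttlMs, else null.
  get(): T | null {
    if (!this.entry) return null;
    return Date.now() - this.entry.storedAt < this.ttlMs ? this.entry.value : null;
  }
}
```

A fresh `get()` after `set()` returns the scrape result; once the TTL elapses, `get()` returns `null` and the caller re-scrapes.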
- Node.js >= 22
- npm
```
npm install
cp .env.example .env
npm run dev
```

The API starts on http://localhost:3000 by default.
Build the image:

```
docker build -t ementas-cms .
```

Run the container:

```
docker run --rm -p 3000:3000 ementas-cms
```

With custom env values:

```
docker run --rm -p 3000:3000 \
  -e LOG_LEVEL=debug \
  -e CACHE_TTL_MS=300000 \
  -e STALE_CACHE_MAX_AGE_MS=21600000 \
  ementas-cms
```

Use Docker Compose with a local build:

```
docker-compose up --build
```

Run it in the background:

```
docker-compose up --build -d
```

Stop it:

```
docker-compose down
```

The repository includes a GitHub Actions workflow at
`/.github/workflows/publish-docker-image.yml` that publishes the Docker image
to GHCR on pushes to `main`, version tags like `v1.0.0`, and manual runs.
Published image name: `ghcr.io/guilhermevieiradev/ementas-cms-ua`
Notes:

- The workflow uses the repository `GITHUB_TOKEN`, following GitHub's recommended GHCR flow.
- The first published package may need its visibility changed in GitHub if you want it public.
```
PORT=3000
LOG_LEVEL=info
CACHE_TTL_MS=600000
STALE_CACHE_MAX_AGE_MS=21600000
HTTP_PROXY=
HTTPS_PROXY=
NO_PROXY=
```

If your environment needs an outbound proxy to reach the UA CMS, set
`HTTP_PROXY` or `HTTPS_PROXY`. The app configures the global fetch dispatcher
from those variables, and `docker-compose.yml` passes them through to both the
image build and the running container.
```
npm run dev
npm run build
npm run start
npm run typecheck
npm run lint
```

Returns server health plus current cache state.
Returns stable canteen identifiers:
```
{
  "canteens": [
    { "id": "crasto", "name": "Crasto" },
    { "id": "grelhados", "name": "Grelhados" },
    { "id": "estga", "name": "ESTGA" },
    { "id": "restaurante-vegetariano", "name": "Restaurante Vegetariano" },
    { "id": "tresde", "name": "TrêsDê" }
  ]
}
```

Query params:

- `from=YYYY-MM-DD`
- `to=YYYY-MM-DD`
- `canteens=crasto,estga`
- `includeAnomalies=true`
If no dates are sent, the API defaults to today in Europe/Lisbon.
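That default needs no date library: `Intl.DateTimeFormat` can render "now" in the Europe/Lisbon timezone. A minimal sketch, with a hypothetical helper name:

```typescript
// Compute "today" in Europe/Lisbon as YYYY-MM-DD (illustrative sketch of the default-date logic).
function todayInLisbon(now: Date = new Date()): string {
  // The en-CA locale formats dates as YYYY-MM-DD, matching the API's query format.
  return new Intl.DateTimeFormat('en-CA', {
    timeZone: 'Europe/Lisbon',
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).format(now);
}
```

Formatting in the target timezone (rather than using the server's local date) keeps the default stable regardless of where the container runs.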
Example:
```
curl "http://localhost:3000/api/v1/menus?from=2026-04-07&to=2026-04-10&canteens=crasto,estga"
```

Example response shape:

```
{
  "meta": {
    "sourceUrl": "https://cms.ua.pt/ementas/ementas",
    "fetchedAt": "2026-04-07T12:00:00.000Z",
    "requestedRange": {
      "from": "2026-04-07",
      "to": "2026-04-10"
    },
    "availableRange": {
      "from": "2026-04-07",
      "to": "2026-05-11"
    },
    "timezone": "Europe/Lisbon",
    "cached": true,
    "stale": false,
    "anomalyCount": 3
  },
  "canteens": [],
  "anomalies": []
}
```

Key normalized enums:
- `MealService`: `lunch | dinner | unknown`
- `MealStatus`: `available | closed | empty`
- `MenuItemCategory`: `soup | meat | fish | diet | vegetarian | other`

Each menu item keeps:

- `category`
- `sourceLabel`
- `text`
The API intentionally does not split items into a name and description because the CMS content is inconsistent.
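In TypeScript, these shapes can be sketched as follows (illustrative declarations; the real source may organize them differently):

```typescript
// Normalized enums described above.
type MealService = 'lunch' | 'dinner' | 'unknown';
type MealStatus = 'available' | 'closed' | 'empty';
type MenuItemCategory = 'soup' | 'meat' | 'fish' | 'diet' | 'vegetarian' | 'other';

// Shape of a single menu item as the API exposes it.
interface MenuItem {
  category: MenuItemCategory;
  sourceLabel: string; // raw label as it appears in the CMS
  text: string;        // full item text, deliberately not split into name/description
}
```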
The current scraper relies on:

- `div.view-content table.tabela`
- `thead.views-table`
- `caption`
- `td.views-field-title`
- `td.views-field-body`
Known CMS issues handled by the parser:

- incorrect weekday labels
- malformed years such as `08/04/206`
- double slashes such as `05/05//2026`
- weekend rows with both lunch and dinner in one body
- empty body rows
- `Encerrado` rows
- logical line breaks encoded with `<br>` inside a paragraph
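The date-header repairs (doubled slashes, truncated years) can be sketched as below. `repairCmsDate` and its subsequence heuristic are hypothetical illustrations, not the parser's actual code:

```typescript
// Illustrative sketch: repair a malformed dd/mm/yyyy CMS header, or return null.
function repairCmsDate(raw: string, expectedYear: number): string | null {
  // Collapse accidental double slashes: "05/05//2026" -> "05/05/2026".
  const cleaned = raw.replace(/\/{2,}/g, '/').trim();
  const match = cleaned.match(/^(\d{2})\/(\d{2})\/(\d{1,4})$/);
  if (!match) return null;
  const [, day, month, year] = match;
  // A truncated year like "206" is accepted when its digits appear, in order,
  // inside the year we expect from context (e.g. a neighbouring header).
  const fixedYear =
    year.length === 4 ? year :
    isSubsequence(year, String(expectedYear)) ? String(expectedYear) :
    null;
  if (!fixedYear) return null;
  return `${fixedYear}-${month}-${day}`; // ISO order, as the API exposes dates
}

// True when every character of needle occurs in haystack in the same order.
function isSubsequence(needle: string, haystack: string): boolean {
  let i = 0;
  for (const ch of haystack) if (ch === needle[i]) i++;
  return i === needle.length;
}
```

Under this sketch, `08/04/206` with an expected year of 2026 repairs to `2026-04-08`, while inputs that cannot be reconciled are rejected rather than guessed.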