Description
Dataset Information:
A Dataset for Image Information Retrieval in European Portuguese. The data is sourced from the Portuguese Presidency website. It contains 4,678 articles, 42,333 images, and 80 queries related to the Portuguese Presidency. Over 5,000 images were annotated by three annotators.
Links to Resources:
Presidency website: https://www.presidencia.pt/
GitHub containing dataset files: https://github.com/LIAAD/pt-image-ir-dataset
Dataset ID(s) & supported entities:
- pt-image-ir-dataset
Checklist
Mark each task once completed. All should be checked prior to merging a new dataset.
- Dataset definition (in ir_datasets/datasets/[topid].py)
- Tests (in tests/integration/[topid].py)
- Metadata generated (using ir_datasets generate_metadata command, should appear in ir_datasets/etc/metadata.json)
- Documentation (in ir_datasets/etc/[topid].yaml)
  - Documentation generated in https://github.com/seanmacavaney/ir-datasets.com/
- Downloadable content (in ir_datasets/etc/downloads.json)
  - Download verification action (in .github/workflows/verify_downloads.yml). Only one needed per topid.
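As a rough aid for working through the checklist, the file locations it names could be checked with a small script like the one below. This is only a sketch: the helper names `checklist_paths` and `missing_items` are hypothetical and not part of ir_datasets, and the topid `pt_image_ir` is an assumed placeholder, since the issue does not fix one. A file being present does not mean the task is complete (for example, metadata.json and downloads.json already exist and only need new entries).

```python
from pathlib import Path


def checklist_paths(topid: str) -> dict[str, Path]:
    """Map each checklist item to the repository-relative file it names."""
    etc = Path("ir_datasets/etc")
    return {
        "dataset definition": Path("ir_datasets/datasets") / f"{topid}.py",
        "tests": Path("tests/integration") / f"{topid}.py",
        "metadata": etc / "metadata.json",
        "documentation": etc / f"{topid}.yaml",
        "downloadable content": etc / "downloads.json",
        "download verification": Path(".github/workflows/verify_downloads.yml"),
    }


def missing_items(topid: str, repo_root: Path) -> list[str]:
    """Return the checklist items whose file is absent under repo_root."""
    return [
        name
        for name, rel_path in checklist_paths(topid).items()
        if not (repo_root / rel_path).is_file()
    ]


if __name__ == "__main__":
    # "pt_image_ir" is a hypothetical topid used only for illustration.
    for item in missing_items("pt_image_ir", Path(".")):
        print(f"missing: {item}")
```

Run from the root of an ir_datasets checkout, this prints one line per checklist file that has not been created yet.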
Additional comments/concerns/ideas/etc.