LongEval 2025

This repository provides instructions for the French (original-language) collection of the LongEval 2025 Information Retrieval Challenge.
The dataset supports training and evaluation for Information Retrieval (IR) tasks.

🔗 Official Challenge Website: LongEval 2025

📂 Folder Structure

The dataset is structured as follows:

release_2025/
│-- French/
│   │-- LongEval Train Collection/   # Training data (first release)
│   │   │-- Json/                    # JSON-formatted document collection
│   │   │-- qrels/                   # Relevance judgments (Qrels), divided by month
│   │   │   │-- 2022-06_fr/          # Qrels for June 2022
│   │   │   │   │-- qrels_processed.txt  # Processed Qrels file
│   │   │   │-- 2022-07_fr/          # Qrels for July 2022
│   │   │   │-- ...                  # Other months
│   │   │-- Trec/                    # TREC-formatted document collection, divided by month
│   │   │   │-- 2022-06_fr/          # Documents for June 2022
│   │   │   │-- 2022-07_fr/          # Documents for July 2022
│   │   │   │-- ...                  # Other months
│   │-- LongEval Test Collection/    # Test data (to be released later)
│   │-- collection_db.db             # Database mapping document IDs to URLs (Train + Test data)
│   │-- queries_db.db                # Database mapping query IDs to query texts
│   │-- queries.txt                  # Plain text file with queries and IDs
│   │-- inspect_db.py                # Script to inspect the document database
│   │-- statistics_collection.py     # Script to generate dataset statistics

📖 Dataset Description

1️⃣ LongEval Train Collection

Json/: Document collection in JSON format.
Trec/: Document collection in TREC format.
qrels/: Relevance judgments mapping queries to relevant documents, organized by month.
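
As an illustration of working with the monthly qrels folders, the sketch below loads every `qrels_processed.txt` into a dictionary keyed by month. It assumes the files use the standard four-column TREC qrels layout (`query_id iteration doc_id relevance`, whitespace-separated), which this README does not confirm; the function names `parse_qrels_lines` and `load_monthly_qrels` are hypothetical helpers, not part of the repository.

```python
from collections import defaultdict
from pathlib import Path


def parse_qrels_lines(lines):
    """Parse qrels lines into {query_id: {doc_id: relevance}}.

    Assumption: each line has four whitespace-separated columns,
    query_id, iteration, doc_id, relevance (standard TREC qrels).
    """
    qrels = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        query_id, _iteration, doc_id, relevance = parts
        qrels[query_id][doc_id] = int(relevance)
    return dict(qrels)


def load_monthly_qrels(qrels_root):
    """Load qrels_processed.txt from every month folder under qrels_root."""
    by_month = {}
    for month_dir in sorted(Path(qrels_root).iterdir()):
        qrels_file = month_dir / "qrels_processed.txt"
        if qrels_file.is_file():
            with qrels_file.open(encoding="utf-8") as handle:
                by_month[month_dir.name] = parse_qrels_lines(handle)
    return by_month
```

Calling `load_monthly_qrels` on the `qrels/` folder would then return one entry per month folder, such as `2022-06_fr`. If the files turn out to use a different column layout, only `parse_qrels_lines` needs to change.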

2️⃣ LongEval Test Collection

The test data will be released at a later stage.

3️⃣ Database Files

collection_db.db: SQLite database mapping document IDs to their URLs (Train + Test data).
queries_db.db: SQLite database linking query IDs to query texts.
queries.txt: A plain text file listing all queries.
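
Because the table layout of queries_db.db is not described here, it may help to list the tables and columns of a database before querying it. The sketch below does this with Python's built-in `sqlite3` module; `describe_database` is a hypothetical helper name, and nothing is assumed about the tables beyond the file being a valid SQLite database.

```python
import sqlite3


def describe_database(db_path):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA arguments cannot be bound as parameters, so quote the name.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            # Each row is (cid, name, type, notnull, default_value, pk).
            schema[table] = [(col[1], col[2]) for col in columns]
        return schema
    finally:
        conn.close()
```

For example, `describe_database("queries_db.db")` would return a dictionary of table names and their columns, which can then guide the actual queries.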

🛠 How to Use `inspect_db.py`

The inspect_db.py script allows participants to inspect the document collection database (collection_db.db).
It can retrieve information such as:

The URL of a document.
The last update timestamp of a document.
The month(s) in which the document appears.

▶️ Usage Example

To check the details of a document by its ID, run:

python inspect_db.py collection_db.db --id <DOCUMENT_ID>

Example output:

Tables in the database: [('mapping',), ('sqlite_sequence',)]

Structure of table 'mapping':
(0, 'id', 'INTEGER', 0, None, 1)
(1, 'url', 'TEXT', 0, None, 0)
(2, 'last_updated_at', 'TEXT', 0, None, 0)
(3, 'date', 'TEXT', 0, None, 0)

Record with ID 1:
{
    "id": 1,
    "url": "https://www.blogduvoyage.fr/roadtrip-usa-conseils/",
    "last_updated_at": [1640160479, 1640160479],
    "date": ["2022-06", "2022-06"]
}
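
The same lookup can also be done directly from Python. The sketch below reads one row from the `mapping` table, using the column names shown in the example output above. It assumes that the `last_updated_at` and `date` TEXT columns hold JSON-encoded lists, which matches the printed record but is not confirmed here; `fetch_document` is a hypothetical helper and not part of inspect_db.py.

```python
import json
import sqlite3


def fetch_document(db_path, doc_id):
    """Return the mapping row for doc_id as a dict, or None if absent."""
    conn = sqlite3.connect(db_path)
    try:
        conn.row_factory = sqlite3.Row
        row = conn.execute(
            "SELECT id, url, last_updated_at, date FROM mapping WHERE id = ?",
            (doc_id,),
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        return None
    record = dict(row)
    # Assumption: these TEXT columns store JSON lists; keep raw text otherwise.
    for column in ("last_updated_at", "date"):
        try:
            record[column] = json.loads(record[column])
        except (TypeError, ValueError):
            pass
    return record
```

A call such as `fetch_document("collection_db.db", 1)` would return a dictionary with the same four keys as the record printed above.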

📌 About the LongEval Challenge

This dataset is part of the LongEval 2025 challenge, which evaluates how information retrieval systems perform over time.
Participants work on time-based IR tasks, analyzing evolving document collections and queries.

📢 More details: LongEval 2025 Challenge
