Copilot AI commented Jan 8, 2026

Implements an autonomous agent that loads RDF files into the knowledge graph as nanopublications. Resources typed as whyis:RDFFile are processed automatically with provenance tracking, and the loaded graphs are retired when the type is removed.

Implementation

Agent (whyis/autonomic/rdf_file_loader.py)

  • Inherits from UpdateChangeService for nanopub-based provenance
  • Processes whyis:RDFFile → whyis:LoadedRDFFile with whyis:RDFFileLoadingActivity
  • Format detection via file extensions and content-type headers (Turtle, RDF/XML, JSON-LD, N-Triples, N3, TriG, N-Quads)
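
A minimal sketch of how format detection from content-type headers and file extensions could work. The function name `guess_rdf_format`, the two lookup tables, and the choice to check the content type before the extension are assumptions for illustration, not the agent's actual code; the returned strings are rdflib parser format names.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup tables; values are rdflib parser format names.
EXTENSION_FORMATS = {
    ".ttl": "turtle",
    ".rdf": "xml",
    ".owl": "xml",
    ".jsonld": "json-ld",
    ".nt": "nt",
    ".n3": "n3",
    ".trig": "trig",
    ".nq": "nquads",
}

CONTENT_TYPE_FORMATS = {
    "text/turtle": "turtle",
    "application/rdf+xml": "xml",
    "application/ld+json": "json-ld",
    "application/n-triples": "nt",
    "text/n3": "n3",
    "application/trig": "trig",
    "application/n-quads": "nquads",
}


def guess_rdf_format(location: str, content_type: Optional[str] = None) -> Optional[str]:
    """Return an rdflib format name, or None if nothing matches."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in CONTENT_TYPE_FORMATS:
            return CONTENT_TYPE_FORMATS[media_type]
    # Fall back to the extension of the URL path (query strings are ignored).
    path = urlparse(location).path
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_FORMATS.get(extension)
```

Returning None rather than a default leaves the decision about unrecognized files to the caller, which may prefer to skip the resource or report an error.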

Source Support

  1. File depot: Uses whyis:hasFileID property following SETLr pattern
  2. HTTP/HTTPS: Direct URL loading with content negotiation
  3. S3: boto3 is imported on demand, and a missing boto3 is handled gracefully via ImportError
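
The three source types above could be dispatched roughly as follows. This is a sketch under stated assumptions: `classify_source`, `split_s3_uri`, and `import_boto3` are hypothetical helper names, and the rule that a file depot ID takes priority over the URI scheme is inferred from the original prompt rather than taken from the agent's code.

```python
import importlib
from typing import Optional, Tuple
from urllib.parse import urlparse


def classify_source(uri: str, file_id: Optional[str] = None) -> str:
    """Pick a loading strategy: 'depot', 'http', or 's3'."""
    if file_id is not None:
        # Assumption: a whyis:hasFileID value means a local depot copy exists.
        return "depot"
    scheme = urlparse(uri).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme == "s3":
        return "s3"
    raise ValueError(f"Unsupported URI scheme for {uri!r}")


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split s3://bucket/key into (bucket, key)."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


def import_boto3():
    """Import boto3 only when an s3:// source is actually encountered."""
    try:
        return importlib.import_module("boto3")
    except ImportError as exc:
        # Re-raise with a clearer message so the failure is easy to diagnose.
        raise ImportError("boto3 is required to load s3:// URIs") from exc
```

Deferring the import keeps boto3 an optional dependency: installations that only use the file depot or HTTP sources never touch it.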

Vocabulary (whyis/default_vocab.ttl)

  • whyis:RDFFile - Input class
  • whyis:LoadedRDFFile - Output class (subclass of RDFFile)
  • whyis:RDFFileLoadingActivity - Provenance activity

Usage

# Configuration (assumes RDFFileLoader is exported from whyis.autonomic)
from whyis import autonomic

INFERENCERS = {
    'RDFFileLoader': autonomic.RDFFileLoader(),
}

# Mark resources for loading
<http://example.com/local-file> a whyis:RDFFile ;
    whyis:hasFileID "depot_id" .

<http://example.com/data.ttl> a whyis:RDFFile .

<s3://bucket/data.rdf> a whyis:RDFFile .

Testing

26 unit tests with mocked HTTP, S3, and file depot access covering all source types, format detection, and error handling.
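
As an illustration of the mocking approach, the sketch below tests a lazily importing S3 reader without boto3 installed. `read_s3_object` is a hypothetical stand-in, not the agent's real method, and the PR's own tests may be structured differently. The fake module is injected through `sys.modules`, and a `None` entry simulates a missing boto3.

```python
import sys
import unittest
from unittest import mock


def read_s3_object(bucket: str, key: str) -> bytes:
    """Hypothetical helper: fetch an object, importing boto3 lazily."""
    import boto3  # resolved at call time so tests can substitute a fake module
    client = boto3.client("s3")
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()


class ReadS3ObjectTest(unittest.TestCase):
    def test_reads_body_from_mocked_client(self):
        fake_boto3 = mock.MagicMock()
        body = fake_boto3.client.return_value.get_object.return_value["Body"]
        body.read.return_value = b"<a> <b> <c> ."
        with mock.patch.dict(sys.modules, {"boto3": fake_boto3}):
            data = read_s3_object("bucket", "data.nt")
        self.assertEqual(data, b"<a> <b> <c> .")
        fake_boto3.client.assert_called_once_with("s3")
        fake_boto3.client.return_value.get_object.assert_called_once_with(
            Bucket="bucket", Key="data.nt"
        )

    def test_missing_boto3_raises_import_error(self):
        # A None entry in sys.modules makes "import boto3" raise ImportError.
        with mock.patch.dict(sys.modules, {"boto3": None}):
            with self.assertRaises(ImportError):
                read_s3_object("bucket", "data.nt")
```

Because `mock.patch.dict` restores `sys.modules` on exit, the tests do not leak the fake module into other test cases.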

Original prompt

Create an agent that looks for files of type whyis:RDFFile and loads them into the graph via the nanopublication_manager and attaches appropriate provenance to them so that if the type designation is removed, the resulting graph(s) are also retired. This agent needs to support the following scenarios:

  1. check to see if the file is a locally uploaded resource in the file depot. Use that copy if so. Be sure to use channels that ensure that the importer infrastructure is properly triggered. There's a handler in the SETLr agent that illustrates this.

  2. If the file is remote, support access via http/https as well as s3 URI schemes, using boto3 on-demand, including built-in credential usage. Be sure to fail safely if boto3 isn't installed.

Provide unit tests for all scenarios, using boto3 mocks as needed.


Copilot AI and others added 4 commits January 8, 2026 23:12
Co-authored-by: jpmccu <602385+jpmccu@users.noreply.github.com>
…s passing

Copilot AI changed the title from "[WIP] Add agent to load RDF files into graph with provenance handling" to "Add RDF file loader agent with file depot, HTTP/HTTPS, and S3 support" Jan 8, 2026
Copilot AI requested a review from jpmccu January 8, 2026 23:19