The JvmRunner executes processors that run on the Java Virtual Machine (JVM).
It allows you to integrate custom Java (or Kotlin, Scala, etc.) processors into an RDF-Connect streaming pipeline by providing a JAR and a class name.
- Runner type: rdfc:JvmRunner (imported automatically).
- Processor definition: Each processor must declare a JAR and a fully qualified Java class via rdfc:javaImplementationOf.
- Implementation requirement: All processors extend the abstract class io.github.rdfc.Processor<T> and provide lifecycle methods.
- Packaging requirement: Processors must include a descriptor RDF file (e.g., index.ttl) inside their JAR.
- Distribution option: You can publish your processor JAR with JitPack so it can be included as a dependency in pipelines.
To use your JVM processor in a pipeline:
- Import the JvmRunner:
<> owl:imports <https://javadoc.jitpack.io/com/github/rdf-connect/jvm-runner/runner/master-SNAPSHOT/runner-master-SNAPSHOT-index.jar>.
- Link your processor to the runner:
@prefix rdfc: <https://w3id.org/rdf-connect#>.
<> a rdfc:Pipeline;
    rdfc:consistsOf [
        rdfc:instantiates rdfc:JvmRunner;
        rdfc:processor <myProcessor>;
    ].
<myProcessor> a rdfc:TestProcessor.
Make sure you also import the processor. Check the documentation of your processor on how to install it.
Processors must:
- Extend the abstract class io.github.rdfc.Processor<T>, where T is an Args class containing configuration fields.
- Implement the lifecycle methods. Each method receives a callback that must be invoked to signal that the method has finished.
  - init(Consumer<Void> callback): initialization.
  - transform(Consumer<Void> callback): processing of inputs from readers, called for each processor before produce.
  - produce(Consumer<Void> callback): producing data, useful for processors such as a file reader.
- Define an Args class with fields matching the RDF properties defined in the SHACL shape.
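The requirements above can be sketched in Java as follows. This is a minimal, self-contained illustration and not the library's actual API: SketchProcessor is a hypothetical stand-in for io.github.rdfc.Processor<T> (whose real constructor and signatures may differ), plain lists stand in for the reader and writer channels, and the names AppendArgs, AppendProcessor and LifecycleDemo are invented for this example. The call order in run (init, then transform, then produce) follows the description above, but how the runner actually drives a processor is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for io.github.rdfc.Processor<T>. The real class ships
// with the jvm-runner types library and may look different.
abstract class SketchProcessor<T> {
    protected final T args;

    SketchProcessor(T args) {
        this.args = args;
    }

    abstract void init(Consumer<Void> callback);
    abstract void transform(Consumer<Void> callback);
    abstract void produce(Consumer<Void> callback);
}

// Args class: field names mirror the properties of the SHACL shape
// (reader, writer, additionalText). Lists are stand-ins for real channels.
class AppendArgs {
    final List<String> reader;
    final List<String> writer;
    final String additionalText;

    AppendArgs(List<String> reader, List<String> writer, String additionalText) {
        this.reader = reader;
        this.writer = writer;
        this.additionalText = additionalText;
    }
}

// Illustrative processor: appends additionalText to every incoming message.
class AppendProcessor extends SketchProcessor<AppendArgs> {
    AppendProcessor(AppendArgs args) {
        super(args);
    }

    @Override
    void init(Consumer<Void> callback) {
        // Nothing to set up here; signal completion immediately.
        callback.accept(null);
    }

    @Override
    void transform(Consumer<Void> callback) {
        for (String message : args.reader) {
            args.writer.add(message + args.additionalText);
        }
        callback.accept(null);
    }

    @Override
    void produce(Consumer<Void> callback) {
        // This processor does not originate data itself.
        callback.accept(null);
    }
}

class LifecycleDemo {
    // Assumed call order: each step starts only after the previous callback fired.
    static List<String> run(List<String> input, String suffix) {
        List<String> output = new ArrayList<>();
        AppendProcessor processor = new AppendProcessor(new AppendArgs(input, output, suffix));
        processor.init(v1 -> processor.transform(v2 -> processor.produce(v3 -> { })));
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "b"), "!"));
    }
}
```

Chaining the steps through the callbacks, rather than calling them one after another, keeps the sketch valid even if a step finishes asynchronously.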
The processor should be accompanied by a description file, often called index.ttl.
It requires the following fields:
- rdfc:javaImplementationOf with value rdfc:Processor, indicating that this processor is a JavaProcessor
- rdfc:jar pointing to the resulting JAR, often <> to refer to the current JAR
- rdfc:class giving the fully qualified class name of the processor
- A SHACL shape defining the required arguments
For example, the following description file declares a processor with arguments { reader: Reader, writer: Writer, additionalText: string }.
A matching implementation can be found on GitHub.
@prefix rdfc: <https://w3id.org/rdf-connect#>.
@prefix sh: <http://www.w3.org/ns/shacl#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdfc:TestProcessor rdfc:javaImplementationOf rdfc:Processor;
    rdfc:class "org.example.Library";
    rdfc:jar <file:./build/libs/my-processor-all.jar>.
[] a sh:NodeShape;
    sh:targetClass rdfc:TestProcessor;
    sh:property [
        sh:path rdfc:reader;
        sh:name "reader";
        sh:minCount 1;
        sh:maxCount 1;
        sh:class rdfc:Reader;
    ], [
        sh:path rdfc:writer;
        sh:name "writer";
        sh:minCount 1;
        sh:maxCount 1;
        sh:class rdfc:Writer;
    ], [
        sh:path rdfc:additionalText;
        sh:name "additionalText";
        sh:minCount 1;
        sh:maxCount 1;
        sh:datatype xsd:string;
    ].
To build a JVM processor for use with the JvmRunner:
Your build.gradle should include:
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '8.1.1'
    id 'maven-publish'
}
repositories {
    mavenCentral()
    maven { url = 'https://jitpack.io' }
}
dependencies {
    implementation 'com.google.protobuf:protobuf-java:4.28.2'
    implementation 'com.github.rdf-connect.jvm-runner:types:master-SNAPSHOT'
}
Use the Shadow plugin to produce a fat JAR that includes your processor and its descriptor:
tasks.named("shadowJar", Jar) {
    // add your processor descriptor (e.g., index.ttl) to the root of the jar
    from("index.ttl") {
        into("")
    }
}
The fat JAR is built with gradle shadowJar.
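To confirm that the descriptor actually ended up at the root of the fat JAR, a small check along these lines may help. It is only a sketch using the standard java.util.jar API; the class name DescriptorCheck is invented for this example, and the default path assumes the build/libs/my-processor-all.jar location used in the descriptor example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.JarFile;

class DescriptorCheck {
    // Returns true when the JAR contains an entry with exactly this name,
    // i.e. the descriptor sits at the root of the archive.
    static boolean hasDescriptor(Path jarPath, String descriptorName) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getJarEntry(descriptorName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another path as the first argument.
        Path jar = Path.of(args.length > 0 ? args[0] : "build/libs/my-processor-all.jar");
        System.out.println(hasDescriptor(jar, "index.ttl") ? "index.ttl found" : "index.ttl missing");
    }
}
```

Saved as DescriptorCheck.java, it can be run after the build with java DescriptorCheck.java build/libs/my-processor-all.jar on Java 11 or later.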
To make your processor available as a dependency from GitHub via JitPack, add the following to your build.gradle.
publishing {
    publications {
        maven(MavenPublication) {
            // publish the fat JAR
            artifact(tasks.shadowJar)
        }
    }
}
Then:
- Push your code to GitHub
- Users can then include your processor as a dependency like this:
repositories {
    mavenCentral()
    maven { url = 'https://jitpack.io' }
}
dependencies {
    implementation 'com.github.<your-github-user>:<your-repo>:master-SNAPSHOT' // or a git hash or release
}
- Args class fields must align with RDF properties defined in the SHACL shape.
- Descriptor file (e.g., index.ttl) must be packaged in the JAR.
- Fat JAR packaging ensures no dependency issues when running.
- Publishing with JitPack allows others to use your processor directly via GitHub.