
Operator Extraction

Tobias John edited this page Oct 29, 2025 · 2 revisions

To help users come up with mutation operators, RDFMutate can extract them from reference KGs from the domain of interest. For this, RDFMutate uses RDFRules to first mine association rules from the reference KGs and then extracts mutation operators from these rules. The mutation operators are saved as an RDF graph using the SWRL representation and can then be loaded to generate mutations (see Specification of Mutation Configuration).

Idea

We propose a two-stage process to extract mutation operators from reference KGs. First, we mine association rules from the KGs. Second, we extract mutation operators from the mined association rules.

The goal of rule mining is to mine Horn rules that describe the structure of the RDF triples. A Horn rule has the form B1 ∧ ... ∧ Bn → H, where B1 ∧ ... ∧ Bn is the body of the rule and H is the head of the rule. All atoms B1, ..., Bn, and H are triples, which might contain variables. The length of such a rule is n+1. The intuition is that if a set of triples in the KG matches the triples in the rule body, the KG should also contain a triple matching the head triple.
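The following Python snippet is a minimal sketch of this definition, not RDFMutate's or RDFRules' actual implementation. It assumes a KG is a set of string triples and that variables are written with a leading "?"; the `ex:` names, the `HornRule` class, and the helper functions are all hypothetical and exist only for illustration. The example head uses only variables that also occur in the body.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    # Convention of this sketch only: variables start with "?".
    return term.startswith("?")


def unify(atom: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `atom` equals `triple`; return None on a clash."""
    result = dict(binding)
    for a, t in zip(atom, triple):
        if is_var(a):
            if result.setdefault(a, t) != t:
                return None
        elif a != t:
            return None
    return result


def match_body(body: Sequence[Triple], kg: Set[Triple],
               binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every variable binding under which all body atoms occur in the KG."""
    binding = binding or {}
    if not body:
        yield binding
        return
    for triple in kg:
        extended = unify(body[0], triple, binding)
        if extended is not None:
            yield from match_body(body[1:], kg, extended)


def substitute(atom: Triple, binding: Binding) -> Triple:
    """Replace bound variables in `atom` by their values."""
    s, p, o = (binding.get(term, term) for term in atom)
    return (s, p, o)


@dataclass(frozen=True)
class HornRule:
    body: Tuple[Triple, ...]
    head: Triple

    @property
    def length(self) -> int:
        return len(self.body) + 1  # n body atoms plus the head


# Hypothetical rule: ?x a Student ∧ ?x enrolledAt ?u → ?u a University
rule = HornRule(
    body=(("?x", "rdf:type", "ex:Student"), ("?x", "ex:enrolledAt", "?u")),
    head=("?u", "rdf:type", "ex:University"),
)

# Hypothetical KG: the head triple is present for ex:uniA but missing for ex:uniB.
kg = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:enrolledAt", "ex:uniA"),
    ("ex:uniA", "rdf:type", "ex:University"),
    ("ex:bob", "rdf:type", "ex:Student"),
    ("ex:bob", "ex:enrolledAt", "ex:uniB"),
}

body_matches = list(match_body(rule.body, kg))
satisfied = [b for b in body_matches if substitute(rule.head, b) in kg]

print(rule.length)        # 3
print(len(body_matches))  # 2
print(len(satisfied))     # 1
```

In this toy KG the body matches twice but the head is present for only one of the matches, so the rule holds in half of the cases where its body applies.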

Each obtained rule describes one pattern that occurs frequently in the reference KGs. We derive mutation operators that change KGs while preserving these patterns. For each rule, we derive two types of mutation operators: operators that add this pattern to the KG and operators that remove this pattern.

Usage

Run the application as follows:

java -jar rdfmutate.jar --operator-extraction --config=<configuration-file>

All information about how to run the extraction is specified in a YAML configuration file. The following example shows all elements of the configuration file:

jar_location: rules/build/libs/rules-1.0-all.jar

kg_files:
  - src/test/resources/ruleExtraction/ore_ont_155.owl

output:
  file: src/test/resources/ruleExtraction/temp.tt
  type: rdf
  overwrite: true

parameters:
  min_rule_match: 50
  min_head_match: 20
  min_confidence: 0.8
  max_rule_length: 3
  timeout: 60

  • jar_location: a string specifying where to find the jar for the subproject rules. This jar is included in the releases or, if you build RDFMutate locally, can be found in the folder rules/build/libs/. The project is split this way to resolve dependency conflicts between RDFMutate and RDFRules.
  • kg_files: a list of strings representing paths to the files containing the KGs from which the operators should be extracted.
  • output: specifies how to save the generated mutation operators to a file. file is the location of the file. type can be rdf or owl, depending on whether Turtle syntax or OWL functional syntax is used (rdf is the preferred option when the operators are later imported to generate mutants). If overwrite is set to true, an existing file at that location is replaced.
  • parameters: a list of parameters for the association rule (Horn rule) mining. The association rules are mined separately for each reference KG, i.e., occurrences are counted per reference graph and not across all graphs:
    • min_rule_match: minimum number of times the association rule must be satisfied in the reference KG
    • min_head_match: minimum number of times the head of the association rule must occur in the reference KG
    • min_confidence: minimum share of rule-body matches for which the head also matches. For example, 0.8 means that in at least 80% of the matches of the rule body, the rule is correct and there is a matching head.
    • max_rule_length: maximum length of the association rule
    • timeout: timeout for mining the association rules (in seconds)
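To make the interplay of these thresholds concrete, the Python sketch below checks a candidate rule against the parameter values from the example configuration. It is an illustration based only on the descriptions above, not RDFRules' actual filtering code. The `RuleStats` and `Thresholds` classes, the field names, and the counts are hypothetical, and treating each threshold as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Counts for one candidate rule in ONE reference KG (hypothetical structure)."""
    rule_length: int   # number of body atoms plus one for the head
    body_matches: int  # matches of the rule body
    rule_matches: int  # body matches for which the head also matches
    head_matches: int  # triples matching the head


@dataclass
class Thresholds:
    # Defaults mirror the values of the example configuration.
    min_rule_match: int = 50
    min_head_match: int = 20
    min_confidence: float = 0.8
    max_rule_length: int = 3


def confidence(stats: RuleStats) -> float:
    """Share of body matches where the head matches as well."""
    if stats.body_matches == 0:
        return 0.0
    return stats.rule_matches / stats.body_matches


def passes(stats: RuleStats, t: Thresholds) -> bool:
    # Assumption: every bound is inclusive.
    return (
        stats.rule_matches >= t.min_rule_match
        and stats.head_matches >= t.min_head_match
        and confidence(stats) >= t.min_confidence
        and stats.rule_length <= t.max_rule_length
    )


thresholds = Thresholds()

# 90 of 100 body matches also have a matching head: confidence 0.9.
frequent = RuleStats(rule_length=3, body_matches=100, rule_matches=90, head_matches=120)

# Only 60 of 100 body matches have a matching head: confidence 0.6.
unreliable = RuleStats(rule_length=3, body_matches=100, rule_matches=60, head_matches=120)

print(passes(frequent, thresholds))    # True
print(passes(unreliable, thresholds))  # False
```

Because rules are mined separately for each reference KG, such a check would apply to the counts from a single graph rather than to counts summed over all files in kg_files.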
