This repository was archived by the owner on Apr 2, 2026. It is now read-only.

geometalab/hop-plugin-ogr2ogr


Apache Hop-Plugin OGR Vector Import/Export

Archived Repository

This repository has been archived as it is superseded by Stefan Ziegler's hop-gdal-plugin, which provides a more integrated and native GDAL/OGR solution for Apache Hop using Java 22's Foreign Function & Memory API. The new solution eliminates the need for separate GDAL/OGR installation and provides superior streaming capabilities and geometry support. See Stefan's blog posts Let's Hop #1 and Let's Hop #5 for more details.

Logo of the plugin

Description

This plugin allows you to utilise the ogr2ogr CLI tool of the GDAL library inside Apache Hop as both a Transform and an Action. This plugin was developed and tested for Apache Hop version 2.16.

Dialog window of the plugin


Installation

Prerequisite: Make sure GDAL, including ogr2ogr, is installed on the machine that runs your Hop instance (see the GDAL installation instructions).
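One way to verify the prerequisite is sketched below. The helper name check_tool is an illustrative assumption; ogr2ogr --version and ogr2ogr --formats are standard GDAL options that print the installed release and the available vector drivers.

```shell
#!/bin/sh
# Report whether a required command-line tool is available on PATH.
# check_tool is a hypothetical helper, not part of the plugin.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# ogr2ogr ships with GDAL; only query it if it is actually installed.
if check_tool ogr2ogr; then
  # Print the installed GDAL release.
  ogr2ogr --version
  # List the vector drivers this installation supports.
  ogr2ogr --formats
fi
```

If the script reports ogr2ogr as missing, install GDAL before adding the plugin to Hop.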

Download the two ZIP files from the latest release, extract them, and then move each extracted folder, along with all its contents, into your hop/plugins/transforms or hop/plugins/actions directory as appropriate.

release zip

For the Transform:

hop
└── plugins
    └── transforms
        └── ogr2ogr
            ├── hop-transform-ogr2ogr-version.jar
            └── version.xml

For the Action:

hop
└── plugins
    └── actions
        └── ogr2ogr
            ├── hop-action-ogr2ogr-version.jar
            └── version.xml
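The installation steps above could be scripted roughly as follows. This is a sketch under assumptions: the ZIP file names and the /opt/hop path are hypothetical, and each ZIP is assumed to contain a top-level ogr2ogr folder, so extracting it into the parent directory produces the layout shown above.

```shell
#!/bin/sh
# Map a plugin kind (transform or action) to its parent folder in the Hop installation.
plugin_parent_dir() {
  hop_home=$1
  case "$2" in
    transform) echo "$hop_home/plugins/transforms" ;;
    action)    echo "$hop_home/plugins/actions" ;;
    *)         echo "unknown plugin kind: $2" >&2; return 1 ;;
  esac
}

# Extract a release ZIP into the matching parent folder.
# Assumes the ZIP contains a top-level ogr2ogr/ directory.
install_plugin() {
  zip_file=$1; hop_home=$2; kind=$3
  target=$(plugin_parent_dir "$hop_home" "$kind") || return 1
  mkdir -p "$target"
  # -o overwrites existing files, which also covers updating to a new version.
  unzip -o "$zip_file" -d "$target"
}

# Hypothetical file names and Hop location; adjust before running.
# install_plugin ./hop-transform-ogr2ogr.zip /opt/hop transform
# install_plugin ./hop-action-ogr2ogr.zip /opt/hop action
```

Restart Hop afterwards so that the new Transform and Action are picked up.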

Manual Building

Pre-requisites for building the project:

  • Maven, version 3+
  • Java JDK 11
  • (IntelliJ)

Clone the source code and change into the repository folder.

git clone https://gitlab.ost.ch/apache-hop-plugin-sa/apache-hop-plugin-ogr2ogr.git
cd apache-hop-plugin-ogr2ogr

Configure the plugin installation path:

In both the action pom.xml and the transform pom.xml, set the <hop.plugins.dir> property to the directory inside your Hop installation where the plugin should be installed. For example:

  • Action pom.xml: <hop.plugins.dir>C:\Users\user\Program Files\hop\plugins\actions\ogr2ogr</hop.plugins.dir>
  • Transform pom.xml: <hop.plugins.dir>C:\Users\user\Program Files\hop\plugins\transforms\ogr2ogr</hop.plugins.dir>

With this property set, each build automatically updates the plugin in Hop.

Build the plugin:

Running mvn clean package in the root directory of the repository cleans, builds, and tests the project, and installs or updates the plugin.

To build only a specific module of the plugin, set the Run Configuration for each module to -pl {module} -am package.

Run Configuration for modules
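On the command line, the two build variants look roughly like this. The helper only prints the command rather than running it, and the module name transform is a hypothetical placeholder for the actual module directory. The flags -pl (select projects) and -am (also build the modules the selection depends on) are standard Maven options.

```shell
#!/bin/sh
# Print the Maven command for a full build or for a single module.
# Without an argument: full build. With a module name: that module plus
# everything it depends on.
maven_build_command() {
  if [ -z "${1:-}" ]; then
    echo "mvn clean package"
  else
    echo "mvn -pl $1 -am package"
  fi
}

maven_build_command            # prints: mvn clean package
maven_build_command transform  # prints: mvn -pl transform -am package
```

Run the printed command from the root directory of the repository.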

Updating

To update the plugin, simply repeat the steps of the installation with the ZIP file from the new version. Make sure to overwrite existing files.

Usage

| Setting | Description |
|---|---|
| Input Type | Defines whether the input is a fixed file or read from a field in the incoming data stream. |
| Output Type | Defines whether the output is written to a file or the file path is written to a field. |
| File Input | Uses one explicitly configured input file for the conversion. |
| Field Input | Reads the input file path from a field produced by a previous transform. |
| File Output | Writes the result to a single output file without passing the path to the pipeline. |
| Field Output | Writes the absolute output file path to a specified output field. |
| Format | Specifies the ogr2ogr output driver based on locally installed GDAL drivers. |
| File Extension | Defines the output file extension, required when using field output. |
| Behaviour | Controls how existing output files are handled (e.g., overwrite or append). |
| Additional Options | Allows passing custom command-line options directly to ogr2ogr. |

As the plugin is based on the ogr2ogr CLI, performance may become memory-bound when processing large files.
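To give a feel for how these settings relate to the underlying CLI, the sketch below assembles an ogr2ogr call from a format, a behaviour, and an output and input path. The mapping is an assumption for illustration and not taken from the plugin's source. The flags -f, -overwrite and -append are standard ogr2ogr options, and ogr2ogr expects the destination before the source.

```shell
#!/bin/sh
# Assemble an ogr2ogr command line (printed, not executed).
# Behaviour "overwrite" adds -overwrite, "append" adds -append;
# any other value adds no flag. Paths containing spaces are not handled here.
build_ogr2ogr_command() {
  format=$1; behaviour=$2; output=$3; input=$4
  case "$behaviour" in
    overwrite) flag=" -overwrite" ;;
    append)    flag=" -append" ;;
    *)         flag="" ;;
  esac
  echo "ogr2ogr -f $format$flag $output $input"
}

# Hypothetical file names, for illustration only.
build_ogr2ogr_command GPKG overwrite out.gpkg in.shp
# prints: ogr2ogr -f GPKG -overwrite out.gpkg in.shp
```

Values entered under Additional Options would be passed to the command as further options in the same way.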

Example Pipeline

Sample Pipeline

Example Action

Sample Workflow

Note that the Behaviour is set to 'Overwrite'; otherwise ogr2ogr would fail when the workflow is run more than once.

Code Formatting

The code follows the official Google Java Style Guide, which is enforced by the provided IntelliJ formatting plugin and the Maven Spotless plugin. Formatting compliance can be verified using mvn spotless:check, and formatting can be applied automatically using mvn spotless:apply.

Dependencies and Plugins

The following dependencies and Plugins were used throughout this project:

CI/CD Pipeline

A basic CI/CD pipeline is defined in the GitLab CI YAML file, with the following stages and jobs:

build

  • check_formatting:
    Checks formatting via the mvn spotless:check command.
  • build:
    Builds the project via the mvn clean package command and moves all resulting artefacts into their respective targets/ directory.

test

  • code_quality:
    Runs the code quality template job, performing a basic code review, as a prerequisite for merge requests.
  • integration_test_transform:
    Runs the Hop Pipeline inside integration_test/ that performs a basic OGR conversion to test the integrity of the Transform.
    If the type name of the Transform or the metadata changes, the Pipeline has to be adjusted manually.
  • integration_test_action:
    Runs the Hop Workflow inside integration_test/ that performs a basic OGR conversion to test the integrity of the Action.
    If the type name of the Action or the metadata changes, the Workflow has to be adjusted manually.
  • semgrep-sast:
    Runs the static application security testing (SAST) template job, performing a security scan on the repository.
  • speed_test_transform:
    Runs a speed test Hop Pipeline inside speed_test/, whose performance metrics are logged to a JSON file via log_pipeline.hpl.
  • speed_test_action:
    Runs a speed test Hop Workflow inside speed_test/, whose performance metrics are logged to a JSON file via log_pipeline.hpl.
  • evaluate_performance_transform:
    Extracts the duration of the speed_test Pipeline from that JSON file and evaluates the resulting performance via a simple bash script.
  • evaluate_performance_action:
    Extracts the duration of the speed_test Workflow from that JSON file and evaluates the resulting performance via a simple bash script.
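The evaluate_performance jobs can be pictured along the following lines. This is not the repository's script. The JSON field name duration, the sample log content, and the threshold of 5000 are all hypothetical, chosen only to show the extract-then-compare idea.

```shell
#!/bin/sh
# Read JSON from stdin and print the first integer "duration" value found.
# The field name "duration" is an assumption about the log format.
extract_duration() {
  sed -n 's/.*"duration"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed if the duration stays within the allowed maximum (same unit for both).
evaluate_duration() {
  duration=$1; max=$2
  if [ "$duration" -le "$max" ]; then
    echo "OK: $duration <= $max"
  else
    echo "TOO SLOW: $duration > $max"
    return 1
  fi
}

# Hypothetical log content and threshold, for illustration only.
duration=$(echo '{"pipeline": "speed_test", "duration": 4200}' | extract_duration)
evaluate_duration "$duration" 5000
# prints: OK: 4200 <= 5000
```

A non-zero return code from evaluate_duration is what would make a CI job fail when the speed test becomes too slow.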

release

  • prepare_release_transform:
    This job only runs once a new tag is created. It creates a ZIP archive containing all necessary files for the Transform plugin. This ZIP file is made available in a release.
  • prepare_release_action:
    This job only runs once a new tag is created. It creates a ZIP archive containing all necessary files for the Action plugin. This ZIP file is made available in a release.

License

This project is licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
