Commit 8644767

Update README.md
1 parent 66065c8 commit 8644767

1 file changed, +34 -1 lines changed

README.md

+34 -1
@@ -1,7 +1,7 @@
 # xml-to-postgres
 A fast tool to convert XML files with repeating element sets into PostgreSQL dump format.
 
-To use this tool you need to create a simple YAML configuration file that describes how to turn repeating element sets in an XML document into row-based data for importing into PostgreSQL. For efficiency, the data is output in PostgreSQL dump format, suitable for importing with the COPY command. This tool processes one row at a time and does not need to keep the whole XML DOM in memory, so it has a very low memory footprint and can be used to convert datasets much larger than the available RAM. The tool can split out further repeating fields into extra tables with a one-to-many relationship (with foreign key) to the main table.
+To use this tool you need to create a simple YAML configuration file that describes how to turn repeating element sets in an XML document into row-based data for importing into PostgreSQL. For efficiency, the data is output in PostgreSQL dump format, suitable for importing with the COPY command. This tool processes one row at a time and does not need to keep the whole XML DOM in memory, so it has a very low memory footprint and can be used to convert datasets larger than the available RAM. The tool can split out further repeating fields into extra tables with a one-to-many relationship (with foreign key) to the main table.
 
 ## Features
 
@@ -11,3 +11,36 @@ To use this tool you need to create a simple YAML configuration file that descri
 * Apply search-and-replace on values
 * Filter the output with regex
 * Write extra tables with a foreign key to the main table
+* Operate in a pipeline to avoid on-disk intermediary steps
+
+## Compiling
+
+This project uses the Rust 2021 Edition, which means you need at a minimum to have the Rust 1.56 toolchain installed. The project uses only stable features and will only add dependencies that can compile on stable. It's a normal Rust project managed by Cargo, so you can compile with this simple command:
+
+cargo build --release
+
+The debug build really hurts performance, so unless you're doing a deep dive in the code it is recommended to compile for release.
+
+## Running
+
+Basic usage:
+
+xml-to-postgres <config.yml> [data.xml]
+
+So the YAML configuration file is a required argument. The XML input file can be passed in as the second argument or can be sent to stdin if omitted.
+
+Example invocation:
+
+xml-to-postgres config.yml data.xml > data.dump
+
+Within a pipeline:
+
+unzip -p xml.zip | xml-to-postgres config.yml | psql <database> -c '\copy <table> from stdin'
+
+Within a database transaction:
+
+xml-to-postgres config.yml data.xml | psql <database> -c 'BEGIN' -c 'TRUNCATE <table>' -c '\copy <table> from stdin' -c 'COMMIT'
+
+## Configuration
+
+See [documentation for the configuration file](https://github.com/bartnv/xml-to-postgres/wiki/Configuration-options) in the wiki.
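
The Running section added in this commit says the XML input is taken from the second argument, or read from stdin when that argument is omitted. A minimal Rust sketch of that kind of fallback, using only the standard library, might look like the following. The names `open_input` and `count_lines` are hypothetical and chosen for illustration; this is not the project's actual code, and the line counting merely stands in for real XML processing.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Hypothetical helper (not taken from the project): pick the XML source.
// With a path, read that file; without one, fall back to stdin.
fn open_input(path: Option<&str>) -> io::Result<Box<dyn BufRead>> {
    let reader: Box<dyn BufRead> = match path {
        Some(p) => Box::new(BufReader::new(File::open(p)?)),
        None => Box::new(BufReader::new(io::stdin())),
    };
    Ok(reader)
}

// Stand-in for real processing: consume the input one line at a time,
// so the whole input never has to be held in memory.
fn count_lines(input: impl BufRead) -> io::Result<usize> {
    let mut n = 0;
    for line in input.lines() {
        line?; // propagate read errors instead of counting them
        n += 1;
    }
    Ok(n)
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() < 2 {
        // The configuration file is required; the data file is optional.
        eprintln!("usage: xml-to-postgres <config.yml> [data.xml]");
        return Ok(());
    }
    let input = open_input(args.get(2).map(String::as_str))?;
    eprintln!("read {} lines", count_lines(input)?);
    Ok(())
}
```

Returning a boxed `BufRead` lets the rest of the program treat a file and a pipe identically, which is what makes invocations such as `unzip -p xml.zip | xml-to-postgres config.yml` possible without an intermediate file.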
