Commit f35df3d

Improve the page on the incremental import (#1294) (#1324)
Co-authored-by: Reneta Popova <[email protected]>
1 parent f4aa0f0 commit f35df3d

1 file changed

+43
-25
lines changed

modules/ROOT/pages/tools/neo4j-admin/neo4j-admin-import.adoc

Lines changed: 43 additions & 25 deletions
Original file line number / Diff line number / Diff line change
@@ -408,10 +408,28 @@ If importing to a database that has not explicitly been created prior to the imp
408408

409409
label:enterprise-only[]
410410

411-
When the initial data load cannot be completed in a single full import, incremental import allows the operation to be completed as a series of smaller imports.
411+
Incremental import allows you to incorporate large amounts of data in batches into the graph.
412+
You can run this operation as part of the initial data load when it cannot be completed in a single full import.
413+
You can also update your graph by importing data incrementally, which is more performant than inserting the same data transactionally.
412414

413-
Incremental import requires the use of `--force` because it must only be run on databases in precisely the expected state and as part of an initial load.
414-
When an incremental import fails, it can leave the data corrupted.
415+
Incremental import requires the use of `--force` and can be run on an existing database only.
416+
417+
You must stop your database if you want to perform the incremental import in a single command.
418+
419+
If you cannot afford a full downtime of your database, split the operation into several stages:
420+
421+
* _prepare_ stage (offline)
422+
* _build_ stage (offline or read-only)
423+
* _merge_ stage (offline)
424+
425+
The database must be stopped for the `prepare` and `merge` stages.
426+
During the `build` stage, the database can be left online but put into read-only mode.
427+
For a detailed example, see <<incremental-import-stages>>.
428+
429+
[WARNING]
430+
====
431+
It is highly recommended to back up your database before running the incremental import. If the _merge_ stage fails, is aborted, or crashes, it may corrupt the database.
432+
====
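
The staged workflow described above can be sketched as a dry-run script that only prints, for each stage, the database state change followed by the import command. It is an illustration under stated assumptions, not the documented procedure: the database name `db1` and the CSV path are the example values used on this page, `--force` is added to every stage because the text says incremental import requires it (the example commands in this commit do not show it), and the `ALTER DATABASE ... SET ACCESS READ ONLY` statement is standard Cypher that is not visible in this diff. The function names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the staged workflow instead of executing anything.
# DB and CSV are the example values from this page; adjust them for your setup.
DB="db1"
CSV="../../raw-data/incremental-import/c.csv"

# Print the neo4j-admin command for one stage ($1).
# Assumption: --force is passed on every stage, as the prose says it is required.
import_cmd() {
  if [ "$1" = "merge" ]; then
    # The merge stage does not need --nodes or --relationships.
    echo "bin/neo4j-admin database import incremental --force --stage=merge $DB"
  else
    echo "bin/neo4j-admin database import incremental --force --stage=$1 --nodes=N1=$CSV $DB"
  fi
}

# Print the Cypher statement that puts the database in the state a stage ($1) needs.
db_state_cmd() {
  case "$1" in
    # prepare and merge require a stopped database; WAIT ensures a checkpoint.
    prepare|merge) echo "STOP DATABASE $DB WAIT;" ;;
    # build may also run offline; read-only mode is the option that keeps reads available.
    build)         echo "ALTER DATABASE $DB SET ACCESS READ ONLY;" ;;
    *)             echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Print the full plan in the order prepare, build, merge.
print_plan() {
  for s in prepare build merge; do
    echo "# $s stage"
    db_state_cmd "$s"
    import_cmd "$s"
  done
}

print_plan
```

Printing the plan rather than running it makes it easy to review the order of operations, and to take a backup, before anything touches the database.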
415433

416434
[[import-tool-incremental-syntax]]
417435
=== Syntax
@@ -446,6 +464,7 @@ The incremental import command can be used to add:
446464
[WARNING]
447465
====
448466
Note that you must have node property uniqueness constraints in place for the property key and label combinations that form the primary key, or the uniquely identifiable nodes.
467+
Otherwise, the command will throw an error and exit.
449468
For more information, see <<import-tool-header-format>>.
450469
====
451470
* New relationships between existing or new nodes.
@@ -692,30 +711,13 @@ performance, this value should not be greater than the number of available proce
692711
[[import-tool-incremental-examples]]
693712
=== Examples
694713

695-
There are two ways of importing data incrementally:
696-
697-
* If downtime is not a concern, you can run a single command with the option `--stage=all`.
698-
This option requires the database to be stopped.
699-
* If you cannot afford a full downtime of your database, you can run the import in three stages:
714+
There are two ways of importing data incrementally.
700715

701-
** _prepare_ stage:
702-
+
703-
During this stage, the import tool analyzes the CSV headers and copies the relevant data over to the new increment database path.
704-
The import command is run with the option `--stage=prepare` and the database must be stopped.
716+
==== Incremental import in a single command
705717

706-
** _build_ stage:
707-
+
708-
During this stage, the import tool imports the data into the database.
709-
This is the longest stage and you can put the database in read-only mode to allow read access.
710-
The import command is run with the option `--stage=build`.
711-
712-
** _merge_ stage:
713-
+
714-
During this stage, the import tool merges the new with the existing data in the database.
715-
It also updates the affected indexes and upholds the affected property uniqueness constraints and property existence constraints.
716-
The import command is run with the option `--stage=merge` and the database must be stopped.
718+
If downtime is not a concern, you can run a single command with the option `--stage=all`.
719+
This option requires the database to be stopped.
717720

718-
.Incremental import in a single command
719721
====
720722
[source, shell, role=noplay]
721723
----
@@ -725,9 +727,17 @@ $ bin/neo4j-admin database import incremental --stage=all --nodes=N1=../../raw-d
725727
----
726728
====
727729

728-
.Incremental import in stages
730+
[[incremental-import-stages]]
731+
==== Incremental import in stages
732+
733+
If you cannot afford a full downtime of your database, you can run the import in three stages.
734+
729735
====
730736
. `prepare` stage:
737+
+
738+
During this stage, the import tool analyzes the CSV headers and copies the relevant data over to the new increment database path.
739+
The import command is run with the option `--stage=prepare` and the database must be stopped.
740+
+
731741
.. Stop the database with the `WAIT` option to ensure a checkpoint happens before you run the incremental import command.
732742
The database must be stopped to run `--stage=prepare`.
733743
+
@@ -742,6 +752,11 @@ neo4j@system> STOP DATABASE db1 WAIT;
742752
$ bin/neo4j-admin database import incremental --stage=prepare --nodes=N1=../../raw-data/incremental-import/c.csv db1
743753
----
744754
. `build` stage:
755+
+
756+
During this stage, the import tool imports the data, deduplicates it, and validates it in the new increment database path.
757+
This is the longest stage and you can put the database in read-only mode to allow read access.
758+
The import command is run with the option `--stage=build`.
759+
+
745760
.. Put the database in read-only mode:
746761
+
747762
[source, shell, role=noplay]
@@ -756,6 +771,9 @@ $ bin/neo4j-admin database import incremental --stage=build --nodes=N1=../../raw
756771
----
757772
. `merge` stage:
758773
+
774+
During this stage, the import tool merges the new data with the existing data in the database.
775+
It also updates the affected indexes and upholds the affected property uniqueness constraints and property existence constraints.
776+
The import command is run with the option `--stage=merge` and the database must be stopped.
759777
It is not necessary to include the `--nodes` or `--relationships` options when using `--stage=merge`.
760778
+
761779
.. Stop the database with the `WAIT` option to ensure a checkpoint happens before you run the incremental import command.
