These tables automatically get updated to reflect the changes as replicants
join or leave the system and thus are not designed to be manually modified
under normal circumstances. In order to keep the load evenly spread, these tables
are consulted to ensure a certain fanout `physrep_fanout` is maintained across all
-the nodes. The LSN information in `comdb2_physreps` table is used by all the
+the nodes. The LSN (file:offset) information in `comdb2_physreps` table is used by all the
nodes to pause log-deletion.

## Algorithm

On start, a physical replicant executes `sys.physrep.register_replicant()` against
-the `physrep_metadb`, which in turn, responds with a list of potential nodes that
+the `physrep_metadb`, which in turn responds with a list of potential nodes
+(found by a graph traversal over nodes (`comdb2_physreps`) and edges (`comdb2_physrep_connections`),
+starting at the source as the root node/tier 0; see `lua/lib/physrep_register_replicant.lua`) that
can be used as the source of physical logs. The replicant then picks up a node from
the list and tries to connect to it. On successful connection, the replicant executes
`sys.physrep.update_registry()` against the `physrep_metadb`, confirming that the
@@ -112,6 +115,21 @@ the hosts listed in that cluster, as represented by the bbcpu.lst.
NOTE: In cross-tier replication, the replication metadata tables must be hosted by a
separate database running in a lower (development) tier.

+### Alternate Metadbs
+
+Physrep setup supports configuring multiple alternate metadbs in addition to the primary
+`physrep_metadb`. The idea is to set up an alternate metadb in a separate tier/class (say, beta) so that
+the production tier/class doesn't have to interact directly with lower-level tiers (this is an update to the
+cross-tier replication model discussed above).
+
+Key gotchas:
+* A physical replicant registers (`register_replicant`) only against the primary metadb (never an alternate).
+* The metadb does not provide transaction logs; it returns candidate source nodes to replicate from (based on fanout and tree traversal; refer to [algorithm](#algorithm)).
+* Alternate metadbs are primarily used by the source (physrep-parent) side to try to establish a reverse connection based on the `comdb2_physrep_sources` table.
+* The source cluster writes replication metadata (entries in `comdb2_physreps` and `comdb2_physrep_connections`) to the primary `physrep_metadb`.
+* If a source is itself a physrep (tiered chain), it still uses only its primary metadb for its own registration, while reverse-connecting outwards based on the configured alternate metadbs.
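
The candidate selection mentioned in the list above (fanout plus tree traversal) can be pictured with a small Python sketch. This is not the logic in `lua/lib/physrep_register_replicant.lua`, which may differ in ordering and filtering; it only assumes that the metadb walks the replication tree breadth-first from the source (tier 0) and offers nodes that currently serve fewer than `physrep_fanout` replicants. The function name `candidate_sources` and the in-memory `edges` mapping are hypothetical stand-ins for the `comdb2_physreps` and `comdb2_physrep_connections` tables.

```python
from collections import deque


def candidate_sources(root, edges, fanout):
    """Return nodes with spare fanout, in breadth-first order from the root.

    root   -- the source node (tier 0); hypothetical name
    edges  -- dict mapping a node to the nodes currently replicating from it
              (an assumed in-memory view of comdb2_physrep_connections)
    fanout -- maximum number of replicants a node should serve
    """
    candidates = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        children = edges.get(node, [])
        # A node is a candidate only while it serves fewer than `fanout` replicants.
        if len(children) < fanout:
            candidates.append(node)
        # Continue the walk into the next tier.
        for child in children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return candidates


# Hypothetical topology: the source already serves two replicants.
edges = {"src": ["r1", "r2"], "r1": ["r3"]}
print(candidate_sources("src", edges, fanout=2))
```

With a fanout of 2 the source itself is full, so a new replicant would be pointed at nodes in lower tiers; nodes closer to the source are listed first.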
* replicate_from dbname @host/tier: This line sets the source host/cluster. It is required for all physical replicants.
* replicate_wait <sec>: Tells the physical replicant to wait for this many seconds before applying the log records.
* physrep_metadb: If set, all the nodes will connect to this database (as opposed to the source host/cluster specified via `replicate_from`) for replication metadata tables
+* alternate_metadb <dbname> <host>: If set, the parent node will try to establish a reverse connection based on the `comdb2_physrep_sources` table.
* physrep_fanout_override <dbname> <fanout>: This is set on the metadb, and allows per-database overrides of the 'physrep_fanout' tunable. The 'physrep_fanout_override' message-trap allows this to be set dynamically. The 'physrep_fanout_dump' message-trap prints the current overrides.
* physrep_ignore <tables>: All the log records that belong to any of these tables are ignored by physical replicants
* nonames: This configuration forces system database file names to not carry the database name. This setting is required for physical-log based replication to work properly.
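
As an illustration of how the replicant-side options above fit together, here is a minimal Python sketch that renders a configuration fragment from them. The helper name `render_replicant_config` and the database, host, and table names are hypothetical; the option spellings follow the list above, and options whose argument format is not spelled out there (such as `physrep_metadb`) are deliberately left out.

```python
def render_replicant_config(source_db, source_host, wait_sec=None, ignore_table=None):
    """Build replicant configuration lines from the options described above.

    All argument values are placeholders; only the option names come from the list.
    """
    # replicate_from is required for every physical replicant.
    lines = [f"replicate_from {source_db} @{source_host}"]
    if wait_sec is not None:
        # Delay (in seconds) before applying log records.
        lines.append(f"replicate_wait {wait_sec}")
    if ignore_table is not None:
        # Log records belonging to this table are skipped by the replicant.
        lines.append(f"physrep_ignore {ignore_table}")
    # Required for physical-log based replication to work properly.
    lines.append("nonames")
    return "\n".join(lines) + "\n"


print(render_replicant_config("srcdb", "host1", wait_sec=5, ignore_table="t1"))
```

`alternate_metadb` is not included because, per the list above, it is read on the parent (source) side rather than on the replicant.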