[Bug]: Unable to replicate data to target milvus ("the collection is not in the watch list") #172

Open
ckad-gr opened this issue Jan 16, 2025 · 4 comments


ckad-gr commented Jan 16, 2025

Current Behavior

No replication takes place between source and target milvus.

Expected Behavior

Any partition or collection created in the source milvus, and any data added to it, should be replicated to the target milvus.

Steps To Reproduce

  1. Start target milvus with common.ttMsgEnabled: false
  2. Start CDC with the following config:

cdc-config.txt

  3. Create a task in CDC like so:

curl -X POST http://localhost:8444/cdc -H "Content-Type: application/json" -d '{ "request_type": "create", "request_data": { "milvus_connect_param": { "host": "redacted-project.redacted-project-2", "port": 19530, "enable_tls": true, "username": "admin", "password": "PASSWORD", "connect_timeout": 10, "dial_config": { "server_name": "redacted-project.redacted-project-2", "server_pem_path": "/tmp/cert.pem" } }, "collection_infos": [ { "name": "*" } ] } }'

  4. Create a new partition in the source milvus. Insert data into the new partition in the source milvus via attu. ("cdc_database" > "cdc_collection" > "cdc_partition")

  5. Try pausing and resuming the CDC task just to check if that fixes anything.

Environment

Milvus: v2.4.17
Milvus-cdc: v2.0.0-rc4
compiled from git project, running inside docker image: docker.io/milvusdb/milvus-cdc@sha256:593da341ba7282583a9fed3f92f95d6a42e275231f7b6762f9524470e29f0568
With pulsar and etcd.

CDC is connected to the etcd and pulsar of source milvus.
CDC is using the etcd of source milvus under a different rootPath.

Anything else?

Logs:
logs.txt

The log line "the collection is not in the watch list" seems interesting.
I wonder if I'm misconfiguring things such that CDC is skipping all collections.

Note: the "test" and "actual_data" collection/db were deleted before starting this test.

Author

ckad-gr commented Jan 16, 2025

I've also tried using

    "collection_infos": [
      {
        "name": "cdc_collection"
      }
    ]

apart from just "name": "*"

Collaborator

SimFG commented Jan 17, 2025

This happens because, by default, "collection_infos" only synchronizes collections under the default db. You can try creating a collection under the default db.

Author

ckad-gr commented Jan 18, 2025

@SimFG thanks for the reply. Is there a way to synchronise collections under other databases yet?

Collaborator

SimFG commented Jan 20, 2025

It is currently supported, but the documentation has not been updated yet. You can try the following request first:

{
    "request_type": "create",
    "request_data": {
        "milvus_connect_param": {
            "host": "127.0.0.1",
            "port": 19500
        },
        "db_collections": {
            "cdc_database": [
                {
                    "name": "cdc_collection"
                }
            ]
        }
    }
}

This will synchronize the cdc_collection under cdc_database.
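
For anyone scripting this, here is a minimal Python sketch of the same request. It builds the "db_collections" body shown above and can POST it to the CDC endpoint used in the curl command earlier in this issue (http://localhost:8444/cdc). The helper names build_create_request and post_request are hypothetical, and the sketch assumes the server replies with JSON; adjust host, port, and TLS settings to your own setup.

```python
import json
import urllib.request

# Endpoint taken from the curl command earlier in this issue.
CDC_URL = "http://localhost:8444/cdc"


def build_create_request(host, port, db_collections):
    """Build a CDC "create" request body that selects collections per database.

    db_collections maps a database name to a list of collection names,
    e.g. {"cdc_database": ["cdc_collection"]}.
    """
    return {
        "request_type": "create",
        "request_data": {
            "milvus_connect_param": {"host": host, "port": port},
            "db_collections": {
                db: [{"name": name} for name in names]
                for db, names in db_collections.items()
            },
        },
    }


def post_request(body, url=CDC_URL):
    """POST the JSON body to the CDC server (assumes a JSON response)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_create_request(
        "127.0.0.1", 19500, {"cdc_database": ["cdc_collection"]}
    )
    print(json.dumps(body, indent=4))
    # Uncomment to send it to a running CDC server:
    # print(post_request(body))
```

Printing the body first makes it easy to compare against the JSON above before sending anything to a live CDC server.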
