@fantiq
Contributor

@fantiq fantiq commented Aug 19, 2025

What is the purpose of the change?

Resolves #15623: support the check=false option for the registry-center, config-center, and metadata-center. This PR also includes the content of PR #15594. The specific changes are as follows:

config-center

  1. When the registry-center address is used as the config-center or metadata-center address, the value of the registry.check option should be assigned to the corresponding config-center or metadata-center config option.

related change files:

  • org.apache.dubbo.config.deploy.DefaultApplicationDeployer
  2. Add method boolean isAvailable() on DynamicConfiguration, to check the connection status.

related change files:

  • org.apache.dubbo.common.config.configcenter.DynamicConfiguration
  • org.apache.dubbo.configcenter.support.nacos.NacosDynamicConfiguration
  • org.apache.dubbo.configcenter.support.zookeeper.ZookeeperDynamicConfiguration
  • org.apache.dubbo.common.config.configcenter.wrapper.CompositeDynamicConfiguration

  3. Cache config item listener registrations while the connection is unavailable.

related change files:

  • org.apache.dubbo.common.config.configcenter.wrapper.FailbackDynamicConfiguration
  • org.apache.dubbo.configcenter.support.nacos.NacosDynamicConfigurationFactory
  • org.apache.dubbo.configcenter.support.zookeeper.ZookeeperDynamicConfigurationFactory

Config-center item listeners can now be added asynchronously.
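
The listener caching described in item 3 could look roughly like the sketch below. It is only an illustration under stated assumptions: DeferredListenerSketch, DeferredRegistrar, and flush() are hypothetical names rather than classes from this PR, and the BooleanSupplier stands in for DynamicConfiguration.isAvailable().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical sketch: names are illustrative, not the PR's actual classes.
class DeferredListenerSketch {

    /** Queues listener registrations while the backend is unreachable. */
    static final class DeferredRegistrar {
        private final BooleanSupplier available;   // stands in for isAvailable()
        private final Consumer<String> doRegister; // the real registration call
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();

        DeferredRegistrar(BooleanSupplier available, Consumer<String> doRegister) {
            this.available = available;
            this.doRegister = doRegister;
        }

        /** Registers immediately when connected, otherwise remembers the key. */
        void addListener(String key) {
            if (available.getAsBoolean()) {
                doRegister.accept(key);
            } else {
                pending.add(key);
            }
        }

        /** Replays queued registrations; intended to be called periodically. */
        int flush() {
            int flushed = 0;
            while (available.getAsBoolean()) {
                String key = pending.poll();
                if (key == null) {
                    break;
                }
                doRegister.accept(key);
                flushed++;
            }
            return flushed;
        }

        int pendingCount() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        boolean[] up = {false};
        List<String> registered = new ArrayList<>();
        DeferredRegistrar registrar = new DeferredRegistrar(() -> up[0], registered::add);

        registrar.addListener("demo.key");            // backend down: queued
        System.out.println(registrar.pendingCount()); // prints 1

        up[0] = true;                                 // connection recovers
        registrar.flush();
        System.out.println(registered);               // prints [demo.key]
    }
}
```

In a real deployment the flush() call would presumably be driven by a scheduled task or a connection-state callback; the sketch leaves that choice open.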

service-name-mapping

  1. Refactor the reporting process of ServiceNameMapping so that it supports failure retries.

related change files:

  • org.apache.dubbo.config.ServiceConfig
  • org.apache.dubbo.metadata.AbstractServiceNameMapping
  • org.apache.dubbo.metadata.ServiceNameMapping
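
As a rough illustration of what "failure retries" can mean for a mapping report, the sketch below retries a boolean operation a bounded number of times with a random wait between attempts. BoundedRetrySketch and its parameters are hypothetical names chosen for this example; the PR's actual retry logic lives in AbstractServiceNameMapping and may differ.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a bounded retry with a randomized wait.
class BoundedRetrySketch {

    /**
     * Runs the attempt once, then up to maxRetries more times.
     * maxWaitMillis must be at least 1.
     */
    static boolean retry(BooleanSupplier attempt, int maxRetries, int maxWaitMillis) {
        for (int retries = 0; retries <= maxRetries; retries++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (retries == maxRetries) {
                break; // no point sleeping after the last attempt
            }
            try {
                // A random wait spreads out competing writers.
                Thread.sleep(ThreadLocalRandom.current().nextInt(1, maxWaitMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Assumed stand-in for a mapping report that succeeds on the third call.
        boolean ok = retry(() -> calls.incrementAndGet() >= 3, 5, 20);
        System.out.println(ok + " after " + calls.get() + " attempts");
    }
}
```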

metadata

  1. Add method boolean isAvailable() on MetadataReport, to check the connection status.

related change files:

  • org.apache.dubbo.metadata.report.MetadataReport
  • org.apache.dubbo.metadata.report.support.AbstractMetadataReport
  • org.apache.dubbo.metadata.store.zookeeper.ZookeeperMetadataReport
  • org.apache.dubbo.metadata.store.nacos.NacosMetadataReport
  2. Support the check=false option.

related change files:

  • org.apache.dubbo.metadata.report.support.AbstractMetadataReportFactory
  • org.apache.dubbo.metadata.store.zookeeper.ZookeeperMetadataReport
  • org.apache.dubbo.metadata.store.nacos.NacosMetadataReport

service-discovery

  1. Add field boolean reported to record whether metadata reporting succeeded, and field boolean registered to record whether service-instance registration succeeded.

related change files:

  • org.apache.dubbo.metadata.MetadataInfo
  • org.apache.dubbo.registry.client.ServiceInstance
  • org.apache.dubbo.registry.client.DefaultServiceInstance
  2. Add failure retry logic for metadata reporting and service-instance registration.

related change files:

  • org.apache.dubbo.registry.client.AbstractServiceDiscovery
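
The two flags and the retry can be pictured with the minimal sketch below. It is not the PR's code: RegistrationStateSketch, InstanceState, and retryOnce are hypothetical names, and the ordering (report metadata first, register the instance only afterwards) is an assumption made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: tracks which steps succeeded and retries only the rest.
class RegistrationStateSketch {

    static final class InstanceState {
        private volatile boolean reported;   // metadata reporting succeeded
        private volatile boolean registered; // service-instance registration succeeded

        boolean isReported() {
            return reported;
        }

        boolean isRegistered() {
            return registered;
        }

        boolean needsRetry() {
            return !reported || !registered;
        }
    }

    /** One retry pass; steps that already succeeded are skipped. */
    static void retryOnce(InstanceState state, BooleanSupplier reportMetadata, BooleanSupplier registerInstance) {
        if (!state.reported) {
            state.reported = safely(reportMetadata);
        }
        // Assumption: the instance is registered only after its metadata is reported.
        if (state.reported && !state.registered) {
            state.registered = safely(registerInstance);
        }
    }

    /** Treats a runtime failure as "not done yet" so a later pass can retry. */
    private static boolean safely(BooleanSupplier step) {
        try {
            return step.getAsBoolean();
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        InstanceState state = new InstanceState();
        AtomicInteger reportCalls = new AtomicInteger();
        BooleanSupplier report = () -> reportCalls.incrementAndGet() > 1; // fails once
        BooleanSupplier register = () -> true;

        retryOnce(state, report, register);
        System.out.println(state.needsRetry()); // prints true
        retryOnce(state, report, register);
        System.out.println(state.needsRetry()); // prints false
    }
}
```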

consumer

Support the consumer check option: if it is false, a failed subscription is retried by FailbackRegistry.

related change files:

  • org.apache.dubbo.registry.client.ServiceDiscoveryRegistry

Checklist

  • Make sure there is a GitHub_issue field for the change.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit-test to verify your logic correction. If the new feature or significant change is committed, please remember to add sample in dubbo samples project.
  • Make sure gitHub actions can pass. Why the workflow is failing and how to fix it?

@codecov-commenter

codecov-commenter commented Aug 19, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 86 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.05%. Comparing base (a7b641f) to head (4042caf).

Files with missing lines Patch % Lines
...ubbo/config/deploy/DefaultApplicationDeployer.java 12.50% 7 Missing and 7 partials ⚠️
...dubbo/metadata/report/MetadataReportRetryTask.java 75.47% 9 Missing and 4 partials ⚠️
...che/dubbo/metadata/AbstractServiceNameMapping.java 81.48% 8 Missing and 2 partials ⚠️
...ubbo/metadata/store/nacos/NacosMetadataReport.java 33.33% 4 Missing and 2 partials ⚠️
...ubbo/registry/client/AbstractServiceDiscovery.java 76.92% 2 Missing and 4 partials ⚠️
...ubbo/registry/client/ServiceDiscoveryRegistry.java 25.00% 6 Missing ⚠️
.../report/support/AbstractMetadataReportFactory.java 25.00% 2 Missing and 1 partial ⚠️
...ubbo/registry/nacos/NacosNamingServiceWrapper.java 76.92% 2 Missing and 1 partial ⚠️
...he/dubbo/registry/nacos/NacosServiceDiscovery.java 50.00% 0 Missing and 3 partials ⚠️
.../apache/dubbo/config/bootstrap/DubboBootstrap.java 0.00% 2 Missing ⚠️
... and 13 more
Additional details and impacted files
@@             Coverage Diff              @@
##                3.3   #15639      +/-   ##
============================================
+ Coverage     61.02%   61.05%   +0.02%     
- Complexity    11685    11715      +30     
============================================
  Files          1923     1924       +1     
  Lines         87081    87223     +142     
  Branches      13115    13136      +21     
============================================
+ Hits          53141    53252     +111     
- Misses        28488    28491       +3     
- Partials       5452     5480      +28     
Flag Coverage Δ
integration-tests-java21 32.95% <43.79%> (-0.04%) ⬇️
integration-tests-java8 32.99% <43.79%> (+0.04%) ⬆️
samples-tests-java21 32.63% <34.88%> (+0.01%) ⬆️
samples-tests-java8 30.35% <32.55%> (-0.01%) ⬇️
unit-tests-java11 58.99% <63.17%> (+<0.01%) ⬆️
unit-tests-java17 58.76% <63.17%> (-0.01%) ⬇️
unit-tests-java21 58.74% <63.17%> (-0.02%) ⬇️
unit-tests-java8 58.99% <63.17%> (-0.02%) ⬇️

@fantiq
Contributor Author

fantiq commented Aug 25, 2025

@RainYuY @zrlw @AlbumenJ PTAL

@zrlw zrlw requested a review from Copilot August 25, 2025 08:10

Copilot AI left a comment

Pull Request Overview

This PR implements support for the check=false option across config-center, metadata-center, and service discovery components in Dubbo, allowing these services to continue operating when their registry connections are unavailable. The changes also add failure retry mechanisms for metadata reporting and service registration.

  • Adds isAvailable() methods to check connection status for DynamicConfiguration and MetadataReport interfaces
  • Implements retry logic for service name mapping and metadata reporting when connections fail
  • Propagates registry check parameter to config-center and metadata-center configurations

Reviewed Changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
dubbo-common/src/main/java/org/apache/dubbo/common/utils/UrlUtils.java Adds utility method to check if connection checking is enabled
dubbo-config/dubbo-config-api/src/main/java/org/apache/dubbo/config/deploy/DefaultApplicationDeployer.java Propagates registry check option to config-center and metadata-center
dubbo-configcenter/dubbo-configcenter-/src/main/java/org/apache/dubbo/configcenter/support//NacosDynamicConfiguration.java Implements connection availability checking for Nacos config center
dubbo-metadata/dubbo-metadata-api/src/main/java/org/apache/dubbo/metadata/AbstractServiceNameMapping.java Adds retry mechanism for service name mapping registration
dubbo-registry/dubbo-registry-api/src/main/java/org/apache/dubbo/registry/client/ServiceDiscoveryRegistry.java Adds failure retry support for consumer subscriptions

Comment on lines 94 to 95
e);
}

Copilot AI Aug 25, 2025

[nitpick] The nested try-catch with retry logic inside the do-while loop creates complex control flow. Consider extracting this retry logic into a separate method for better readability and maintainability.

Suggested change
e);
}
e);
return false;
}
}
/**
 * Helper method to perform mapping with retry logic.
 */
private boolean doMappingWithRetry(MetadataReport metadataReport, URL url) throws InterruptedException {
    boolean succeeded = false;
    int currentRetryTimes = 1;
    do {
        succeeded = super.doMapping(metadataReport, url);
        if (succeeded) {
            logger.info(
                    "[METADATA_REGISTER] [SERVICE_NAME_MAPPING] Successfully registered interface application mapping for service "
                            + url.getServiceKey());
            break;
        } else {
            int waitTime = ThreadLocalRandom.current().nextInt(casRetryWaitTime);
            logger.info("Failed to publish service name mapping to metadata center by cas operation. "
                    + "Times: " + currentRetryTimes + ". "
                    + "Next retry delay: " + waitTime + ". "
                    + "Service Interface: " + url.getServiceInterface() + ". ");
            Thread.sleep(waitTime);
        }
    } while (currentRetryTimes++ <= casRetryTimes);

Contributor Author

This logic belongs to the implementation class, which overrides the method boolean doMapping(MetadataReport metadataReport, URL url).

This comment was marked as outdated.

@zrlw zrlw requested a review from Copilot August 26, 2025 01:52

Copilot AI left a comment

Pull Request Overview

This PR adds support for the check=false configuration option for config-center and metadata-center components in Dubbo. The main purpose is to allow these components to handle connection failures gracefully when check is disabled, implementing retry mechanisms and fail-safe behaviors similar to how registry components already handle the check option.

Key changes include:

  • Added isAvailable() methods to check connection status for DynamicConfiguration and MetadataReport interfaces
  • Implemented fail-safe behavior when check=false for ZooKeeper and Nacos implementations
  • Added retry mechanisms for service name mapping and metadata reporting
  • Enhanced service discovery to support connection availability checks

Reviewed Changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
dubbo-common/src/main/java/org/apache/dubbo/common/utils/UrlUtils.java Added utility method to check URL check parameter
dubbo-remoting/dubbo-remoting-zookeeper-curator5/src/main/java/org/apache/dubbo/remoting/zookeeper/curator5/Curator5ZookeeperClient.java Updated to respect check parameter for ZooKeeper connections
dubbo-configcenter/dubbo-configcenter-zookeeper/src/main/java/org/apache/dubbo/configcenter/support/zookeeper/ZookeeperDynamicConfiguration.java Added availability check and removed mandatory connection validation
dubbo-metadata/dubbo-metadata-api/src/main/java/org/apache/dubbo/metadata/AbstractServiceNameMapping.java Added retry mechanism for service name mapping
dubbo-config/dubbo-config-api/src/main/java/org/apache/dubbo/config/deploy/DefaultApplicationDeployer.java Updated to inherit check parameter from registry to config/metadata centers

throw new IllegalStateException(e);
CONFIG_FAILED_INIT_CONFIG_CENTER, "", "", "The configuration center failed to initialize");
if (!configCenter.isCheck()) {
configCenter.setInitialized(true);
Contributor

  1. initialized was already set to true by configCenter.checkOrUpdateInitialized(true).
  2. You might be missing the case where dynamicConfiguration.isAvailable() returns false and configCenter.isCheck() returns true.

Contributor Author

  1. Yes, the configCenter.setInitialized(true) call is unnecessary; I'll delete it later.
  2. If check = true, the creation process will always throw IllegalStateException.

Contributor

            try {
                dynamicConfiguration = getDynamicConfiguration(configCenter.toUrl());
                if (!dynamicConfiguration.isAvailable()) {
                    logger.warn(
                            CONFIG_FAILED_INIT_CONFIG_CENTER, "", "", "The configuration center failed to initialize");
                    if (!configCenter.isCheck()) {
                        configCenter.setInitialized(true);
                        // TODO should it `updateExternalConfigMap` when the connection recovery
                        return dynamicConfiguration;
                    }
                }
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }

It seems you don't throw an exception when check is true.

Contributor Author

When dynamicConfiguration = getDynamicConfiguration(configCenter.toUrl()); executes, the DynamicConfigurationFactory SPI creates the DynamicConfiguration. Currently the Nacos, ZooKeeper, and Apollo implementations throw IllegalStateException if check = true and the connection is unavailable.

Do you mean that an exception should also be thrown here when check=true and the connection is unavailable, to cover other extensions that do not implement this logic themselves?

Contributor

We don't know whether customer extensions throw an exception or not, so it would be better to guard against such situations.

Contributor Author

All right, this is fixed, and the same fix was applied to the metadata-report client.
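
One way to read the guard discussed in this thread is sketched below: the caller enforces check itself instead of relying on each extension to throw. CheckGuardSketch, Client, and createChecked are hypothetical names for illustration only, and the actual fix in DefaultApplicationDeployer may be structured differently.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a caller-side guard for the check option.
class CheckGuardSketch {

    /** Minimal stand-in for a client that exposes isAvailable(). */
    interface Client {
        boolean isAvailable();
    }

    /**
     * Creates the client and enforces the check option in one place, so the
     * behavior does not depend on whether an extension throws by itself.
     */
    static Client createChecked(Supplier<? extends Client> factory, boolean check) {
        Client client = factory.get();
        if (!client.isAvailable()) {
            if (check) {
                throw new IllegalStateException("Center is unavailable and check=true");
            }
            // check=false: keep the client and let later retries recover.
            System.err.println("WARN: center is unavailable, continuing because check=false");
        }
        return client;
    }

    public static void main(String[] args) {
        Client down = () -> false;
        System.out.println(createChecked(() -> down, false) == down); // prints true
        try {
            createChecked(() -> down, true);
        } catch (IllegalStateException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```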

@zrlw zrlw added the type/enhancement Everything related with code enhancement or performance label Aug 29, 2025
}
if (!dynamicConfiguration.isAvailable() && !configCenter.isCheck()) {
logger.warn(
"The configuration center initialize successfully. but connection is available now, and the config-center.check is false, it will return now.");
Contributor

but connection is available now,

unavailable ?

Contributor Author

Sorry, it was a mistake here. I have corrected it. Could you please take another look?

@fantiq

This comment was marked as abuse.

@fantiq
Contributor Author

fantiq commented Sep 10, 2025

@zrlw Does this PR contain too many changes that make code review inconvenient? Would it be better for me to split these changes into multiple PRs?

@zrlw
Contributor

zrlw commented Sep 11, 2025

You should fix the failing error code inspection first.
Open the failure link at https://github.com/apache/dubbo/actions/runs/17537062451/job/50106226140?pr=15639,
or open your local build-and-test-pr.yml and run its Run Error Code Inspecting step on your local machine to find the cause and correct your code.

@fantiq
Contributor Author

fantiq commented Sep 11, 2025

you should fix your error code inspect failure issue first. open the failure link at https://github.com/apache/dubbo/actions/runs/17537062451/job/50106226140?pr=15639, or open your local build-and-test-pr.yml and run its Run Error Code Inspecting on your local machine to find the reason and correct your codes.

Hey, it was fixed. @zrlw

@zrlw zrlw self-requested a review September 11, 2025 12:54
zrlw
zrlw previously approved these changes Sep 11, 2025
Contributor

@zrlw zrlw left a comment

LGTM

@zrlw zrlw requested a review from Copilot September 11, 2025 12:55

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 42 out of 42 changed files in this pull request and generated 5 comments.

Comment on lines +170 to +178
DefaultServiceInstance newServiceInstance =
        new DefaultServiceInstance((DefaultServiceInstance) serviceInstance);
newServiceInstance
        .getMetadata()
        .put(
                EXPORTED_SERVICES_REVISION_PROPERTY_NAME,
                newServiceInstance.getServiceMetadata().getRevision());
doRegister(newServiceInstance);
this.serviceInstance = newServiceInstance;

Copilot AI Sep 11, 2025

[nitpick] The metadata update logic is duplicated between the register() and update() methods. Consider extracting this metadata setting logic into a helper method to reduce code duplication.

.map(InstanceInfo::getInstance)
.collect(Collectors.toList());

accept(() -> namingService.batchRegisterInstance(nacosServiceName, group, instanceListToRegister));

Copilot AI Sep 11, 2025

The instance list modification should be done atomically after the successful operation, not during the batch registration process. Moving these operations after the successful batch registration would ensure consistency in case of failures.

Suggested change
accept(() -> namingService.batchRegisterInstance(nacosServiceName, group, instanceListToRegister));
accept(() -> namingService.batchRegisterInstance(nacosServiceName, group, instanceListToRegister));
// Only modify the instance list after successful batch registration

for (URL url : urlSet) {
try {
if (!retryHandler.apply(metadataReport, url)) {
throw new Exception("method doMap() return false");

Copilot AI Sep 11, 2025

The error message 'method doMap() return false' is unclear and not helpful for debugging. Consider providing a more descriptive message that explains what failed and why.

Suggested change
throw new Exception("method doMap() return false");
throw new Exception("Retry handler failed for service url: " + url + ", metadata-center url: " + metadataReport.getUrl());

Comment on lines 400 to 401
return !newRevision.equals(metadataInfo.getReportedRevision())
|| !newRevision.equals(instance.getMetadata(EXPORTED_SERVICES_REVISION_PROPERTY_NAME));

Copilot AI Sep 11, 2025

[nitpick] This complex boolean expression should be simplified or broken down into separate boolean variables with descriptive names to improve readability and maintainability.

Suggested change
return !newRevision.equals(metadataInfo.getReportedRevision())
|| !newRevision.equals(instance.getMetadata(EXPORTED_SERVICES_REVISION_PROPERTY_NAME));
boolean isReportedRevisionDifferent = !newRevision.equals(metadataInfo.getReportedRevision());
boolean isExportedServicesRevisionDifferent = !newRevision.equals(instance.getMetadata(EXPORTED_SERVICES_REVISION_PROPERTY_NAME));
return isReportedRevisionDifferent || isExportedServicesRevisionDifferent;

applicationModel.getApplicationConfigManager().getApplication();
if (application.isPresent()) {
enableFileCache = Boolean.TRUE.equals(application.get().getEnableFileCache()) ? true : false;
enableFileCache = Boolean.TRUE.equals(application.get().getEnableFileCache());

Copilot AI Sep 11, 2025

This boolean expression can be simplified. The ternary operator and explicit true/false values are unnecessary since Boolean.TRUE.equals() already returns a boolean.

@fantiq
Contributor Author

fantiq commented Sep 16, 2025

@RainYuY Hi, Could you please take some time to review this PR?

zrlw
zrlw previously approved these changes Sep 17, 2025
@RainYuY
Contributor

RainYuY commented Sep 19, 2025

@RainYuY Hi, Could you please take some time to review this PR?

The next week.

@fantiq
Contributor Author

fantiq commented Sep 19, 2025

@RainYuY Hi, Could you please take some time to review this PR?

The next week.

thanks for the update

}

public static boolean isCheck(URL url) {
return url.getParameter(CHECK_KEY, true) && url.getPort() != 0;
Contributor

Why check url.getPort() != 0?

Contributor Author

why check url.getPort() != 0?

It just keeps the original logic; maybe it can be removed.

.getBean(FrameworkExecutorRepository.class)
.getSharedScheduledExecutor();
mapServiceName(url, serviceNameMapping, scheduledExecutor);
mapServiceName(url, serviceNameMapping);
Contributor

Why remove scheduledExecutor?

Contributor Author

new AbortPolicyWithReport(threadName, url));

zkClient = zookeeperClientManager.connect(url);
boolean isConnected = zkClient.isConnected();
Contributor

Will this deletion possibly affect prepareEnvironment?

Contributor Author

If check = false and the connection is unavailable, the client is added to the cache, but externalConfiguration and appExternalConfiguration are not assigned.
Is it necessary to assign externalConfiguration and appExternalConfiguration when the connection recovers?

@RainYuY
Contributor

RainYuY commented Oct 1, 2025

Before I do a detailed review of this PR, I’d like to ask you to reopen PR #15594, because we’re preparing a release soon. I’d prefer to fix the bug first and review the larger PR afterwards.

Labels

type/enhancement Everything related with code enhancement or performance

Development

Successfully merging this pull request may close these issues.

[Feature] Enable the ConfigCenter And MetadataCenter to support option of check=false

4 participants