xds: ORCA to LRS propagation changes #12203
base: master
I'd have to look at some details more closely, but it is mostly just plumbing.
@@ -25,6 +25,7 @@
import com.google.common.collect.Sets;
import io.grpc.Internal;
import io.grpc.Status;
import io.grpc.xds.BackendMetricPropagation;
io.grpc.xds.client can't depend on io.grpc.xds. We moved client into its own package so it could be used without the rest of grpc.
@@ -420,6 +421,29 @@ public void run() {
return loadCounter;
}

@Override
public LoadStatsManager2.ClusterLocalityStats addClusterLocalityStats(
Shouldn't the old method just call this one with backendMetricPropagation set to null? (Feel free to do that in XdsClient.java)
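The delegation suggested here could look roughly like the sketch below. It is a minimal illustration, not the PR's code: `PropagationConfig` and `StatsRegistrySketch` are hypothetical stand-ins, and the parameter lists are simplified to two strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's BackendMetricPropagation config.
final class PropagationConfig {
  final boolean propagateMemUtilization;

  PropagationConfig(boolean propagateMemUtilization) {
    this.propagateMemUtilization = propagateMemUtilization;
  }
}

// Hypothetical stand-in for the class that owns addClusterLocalityStats.
class StatsRegistrySketch {
  final List<String> calls = new ArrayList<>();

  // Old signature: kept for existing callers, delegates with a null config.
  String addClusterLocalityStats(String cluster, String locality) {
    return addClusterLocalityStats(cluster, locality, null);
  }

  // New signature: the only place that holds the real logic.
  String addClusterLocalityStats(String cluster, String locality, PropagationConfig config) {
    String key = cluster + "/" + locality + (config == null ? "/legacy" : "/filtered");
    calls.add(key);
    return key;
  }
}
```

With this shape, both overloads go through one code path, so a behavior change only has to be made in the three-argument method.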
Map<String, Struct> filterMetadata, @Nullable BackendMetricPropagation backendMetricPropagation,
@Nullable OutlierDetection outlierDetection, Object endpointLbConfig,
LoadBalancerRegistry lbRegistry, Map<String,
Map<Locality, Integer>> prioritizedLocalityWeights, List<DropOverload> dropOverloads) {
nit: Can we add the backendMetricPropagation param at the end of the methods for better consistency?
I will move it after outlierDetection, since all the arguments taken from ClusterState are grouped together, followed by the others.
if (memUtilization > 0) {
  boolean shouldPropagate = true;
  if (backendMetricPropagation != null) {
    shouldPropagate = backendMetricPropagation.propagateMemUtilization;
  }

  if (shouldPropagate) {
    String metricName = "mem_utilization";
    if (!loadMetricStatsMap.containsKey(metricName)) {
      loadMetricStatsMap.put(metricName, new BackendLoadMetricStats(1, memUtilization));
    } else {
      loadMetricStatsMap.get(metricName).addMetricValueAndIncrementRequestsFinished(memUtilization);
    }
  }
}
Can this be extracted into a separate function?
.
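One way the extraction could look, as a rough sketch only: the per-metric block collapses into a single helper that every top-level metric calls. `MetricRecorderSketch`, `Stat`, and `recordIfAllowed` are hypothetical names, the accumulator is a simplified stand-in for BackendLoadMetricStats, and "cpu_utilization" is assumed here as a second metric for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MetricRecorderSketch {
  // Hypothetical accumulator: number of reports and sum of reported values.
  static final class Stat {
    long count;
    double total;
  }

  final Map<String, Stat> stats = new HashMap<>();

  // Shared helper: skips non-positive values and metrics that are not propagated.
  void recordIfAllowed(String name, double value, boolean shouldPropagate) {
    if (value <= 0 || !shouldPropagate) {
      return;
    }
    Stat stat = stats.computeIfAbsent(name, key -> new Stat());
    stat.count++;
    stat.total += value;
  }

  // Each top-level metric becomes a one-line call instead of a repeated block.
  void recordTopLevel(double cpu, double mem, boolean propagateCpu, boolean propagateMem) {
    recordIfAllowed("cpu_utilization", cpu, propagateCpu);
    recordIfAllowed("mem_utilization", mem, propagateMem);
  }
}
```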
*/
public synchronized void recordBackendLoadMetricStats(Map<String, Double> namedMetrics) {
// If no propagation configuration is set, use the old behavior (propagate everything)
This should be done only when the feature is not enabled. If the feature is enabled, we should propagate everything only when * is specified for named_metrics.
Prefixing with "named_metrics" should also happen only when the feature is enabled.
The same applies in recordTopLevelMetrics.
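The rule described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `NamedMetricFilterSketch` and `select` are hypothetical, the `propagateAll` flag stands for a configured `*` wildcard, and the exact prefix string `named_metrics.` is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class NamedMetricFilterSketch {
  // Assumed prefix applied to named metrics when the feature is enabled.
  static final String PREFIX = "named_metrics.";

  static Map<String, Double> select(
      Map<String, Double> reported,
      boolean featureEnabled,
      boolean propagateAll,
      Set<String> allowedKeys) {
    Map<String, Double> out = new LinkedHashMap<>();
    for (Map.Entry<String, Double> entry : reported.entrySet()) {
      if (!featureEnabled) {
        // Old behavior: propagate everything, without a prefix.
        out.put(entry.getKey(), entry.getValue());
      } else if (propagateAll || allowedKeys.contains(entry.getKey())) {
        // Feature path: only the wildcard or an explicit key lets a metric through.
        out.put(PREFIX + entry.getKey(), entry.getValue());
      }
    }
    return out;
  }
}
```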
The current code handles this. The methods in BackendMetricPropagation are implemented to take care of these cases.
However, I can now see that the case where the feature is enabled but no backendMetricPropagation config is available causes a problem. It is best to check whether the feature is enabled instead of null-checking backendMetricPropagation. I'll refactor so that the normal path and the feature-enabled path are clearly separate.
I believe recordTopLevelMetrics works fine.
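A small sketch of the refactor described here: the branch is chosen from the feature flag first, so an absent config under an enabled feature becomes its own explicit case rather than silently falling into the legacy path. `PropagationPathSketch`, `Path`, and `choose` are hypothetical names; what the third case should actually do is not stated in this thread, so the sketch only names it.

```java
class PropagationPathSketch {
  enum Path {
    LEGACY_ALL,        // feature disabled: old behavior
    CONFIGURED,        // feature enabled and a config is present
    ENABLED_NO_CONFIG  // feature enabled but no config: must be handled explicitly
  }

  // The feature flag decides first; the null check only matters on the feature path.
  static Path choose(boolean featureEnabled, Object config) {
    if (!featureEnabled) {
      return Path.LEGACY_ALL;
    }
    return config != null ? Path.CONFIGURED : Path.ENABLED_NO_CONFIG;
  }
}
```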
Needs work on unit tests.
*/
@Override
public void onLoadReport(MetricReport report) {
stats.recordTopLevelMetrics(
It seems better to move the feature guard check from inside stats.recordTopLevelMetrics to here, for more clarity.
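A rough sketch of what moving the guard to the call site might look like. All names are hypothetical (`LoadReportGuardSketch`, the nested `Stats`, `recordNamedMetrics`), the report is flattened into plain arguments, and the assumption that top-level metrics are recorded only when the feature is enabled is mine, not confirmed by the thread.

```java
import java.util.Map;

class LoadReportGuardSketch {
  // Hypothetical stand-in for the stats object; it only counts calls.
  static final class Stats {
    int topLevelCalls;
    int namedCalls;

    void recordTopLevelMetrics(double cpu, double mem) {
      topLevelCalls++;
    }

    void recordNamedMetrics(Map<String, Double> named) {
      namedCalls++;
    }
  }

  final Stats stats = new Stats();
  private final boolean featureEnabled;

  LoadReportGuardSketch(boolean featureEnabled) {
    this.featureEnabled = featureEnabled;
  }

  // The guard is visible where the report arrives, not hidden inside the stats object.
  void onLoadReport(double cpu, double mem, Map<String, Double> named) {
    if (featureEnabled) {
      stats.recordTopLevelMetrics(cpu, mem);
    }
    stats.recordNamedMetrics(named);
  }
}
```

Keeping the check in the listener means the stats method has one job, and a reader of onLoadReport can see at a glance which calls depend on the feature.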
@@ -398,7 +398,7 @@ public void dynamicCluster() {
ClusterResolverConfig childLbConfig = (ClusterResolverConfig) childBalancer.config;
assertThat(childLbConfig.discoveryMechanism).isEqualTo(
DiscoveryMechanism.forEds(
- clusterName, EDS_SERVICE_NAME, null, null, null, Collections.emptyMap(), null));
+ clusterName, EDS_SERVICE_NAME, null, null, null, Collections.emptyMap(), null, null));
We should set a backendMetricPropagation in ClusterResource and assert that it is present in the discovery mechanism in the child LB config.
@@ -140,16 +140,16 @@ public class ClusterResolverLoadBalancerTest {
FailurePercentageEjection.create(100, 100, 100, 100));
private final DiscoveryMechanism edsDiscoveryMechanism1 =
DiscoveryMechanism.forEds(CLUSTER1, EDS_SERVICE_NAME1, LRS_SERVER_INFO, 100L, tlsContext,
- Collections.emptyMap(), null);
+ Collections.emptyMap(), null, null);
We should pass a backendMetricPropagation to acceptResolvedAddresses and assert that it ends up in the DiscoveryMechanism for both the EDS and Logical DNS cluster types.
@@ -1241,8 +1242,9 @@ public ClusterDropStats addClusterDropStats(
@Override
public ClusterLocalityStats addClusterLocalityStats(
We should assert that top-level metrics are updated when the feature is enabled (and vice versa) in the recordLoadStats test.
@@ -98,13 +101,20 @@ private synchronized void releaseClusterDropCounter(

/**
* Gets or creates the stats object for recording loads for the specified locality (in the
Add unit tests to cover the new changes in LoadStatsManager2Test.
syncContext.execute(new Runnable() {
  @Override
  public void run() {
    serverLrsClientMap.get(serverInfo).startLoadReporting();
  }
});

nit: Remove empty line.
Implements gRFC A85 (grpc/proposal#454).