Add CostAwareModel mixin and extract_baseline_plan for cost regression testing #212
Merged
homatthew merged 4 commits into Netflix-Skunkworks:main on Jan 29, 2026
homatthew commented Jan 22, 2026
rayiniv-nflx approved these changes Jan 28, 2026
…n testing

This commit introduces infrastructure for tracking model cost changes:

- CostAwareModel mixin: Provides `current_cluster_cost()` and `proposed_cluster_cost()` methods for models to compute costs
- extract_baseline_plan(): Extracts baseline cluster information from existing capacity plans for cost comparison
- cluster_infra_cost(): Helper to calculate infrastructure costs
- Cost methods added to: Cassandra, EVCache, Kafka, KeyValue, StatelessJava
- New test_cost_regression.py for validating cost calculations
- Use class variables for cluster_type/service_name instead of hardcoded strings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ost bug

- Fix drive cost bug: remove is_zonal check so regional clusters with drives (e.g., Java app EBS root volumes) get drive costs included
- Clean up register_model: use Dict[str, str] for O(1) cluster_type duplicate detection instead of O(n) nested loop
- Remove unused requirement parameter from service_costs interface: all implementations explicitly ignored it (_ = requirement)
- Simplify extract_baseline_plan: remove fake requirement computation block (~30 lines) that no model actually used
- Fold service costs into costs dict inside _get_model_costs instead of redundant loop after the call
- Remove unused imports: BufferComponent, buffer_for_components, CapacityRequirement (from models/__init__.py and key_value.py)
- Keep cluster_type field on CurrentClusterCapacity: needed for composite models (key-value) to preserve types for sub-model filtering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
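The register_model cleanup in this commit replaces a nested scan with a dictionary lookup. A minimal sketch of that idea follows, assuming the dictionary maps each cluster_type to the name of the model that owns it. `ModelRegistrySketch` and `DuplicateClusterTypeError` are hypothetical names for illustration and are not taken from the repository.

```python
from typing import Dict


class DuplicateClusterTypeError(ValueError):
    """Hypothetical error: two different models claim one cluster_type."""


class ModelRegistrySketch:
    """Hypothetical registry used only to illustrate the lookup."""

    def __init__(self) -> None:
        # Assumed layout: cluster_type -> name of the model that owns it.
        self._cluster_type_owner: Dict[str, str] = {}

    def register_model(self, name: str, cluster_type: str) -> None:
        # One dict lookup replaces scanning every registered model.
        owner = self._cluster_type_owner.get(cluster_type)
        if owner is not None and owner != name:
            raise DuplicateClusterTypeError(
                f"cluster_type {cluster_type!r} already registered by {owner!r}"
            )
        self._cluster_type_owner[cluster_type] = name


registry = ModelRegistrySketch()
registry.register_model("model-a", "cassandra")
registry.register_model("model-a", "cassandra")  # re-registering is a no-op
```

Registering a second, differently named model under "cassandra" would raise `DuplicateClusterTypeError` in this sketch.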
…_plan

Key changes:

- Make ClusterCapacity.annual_cost a computed property (count * instance + drives)
- Add annual_cost_override for models with non-standard pricing (Aurora shared storage)
- Rename _convert_current_clusters to _extract_cluster_plan for clarity
- Require cluster_type for baseline extraction (validate instead of default)
- Use certain_int() in tests instead of verbose Interval()
- Simplify cassandra.py network_services call

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
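For readers skimming the commit list, here is a rough sketch of a computed annual_cost with an override. It reads the formula literally as `count * instance + drives` and assumes the drive figure is a cluster-wide total; the real ClusterCapacity may attribute drive cost differently. `ClusterCapacitySketch` and every field name except `annual_cost` and `annual_cost_override` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterCapacitySketch:
    """Hypothetical stand-in for ClusterCapacity, for illustration only."""

    count: int
    instance_annual_cost: float       # assumed: annual cost of one instance
    drives_annual_cost: float = 0.0   # assumed: annual drive cost for the cluster
    annual_cost_override: Optional[float] = None

    @property
    def annual_cost(self) -> float:
        # Non-standard pricing supplies the total directly.
        if self.annual_cost_override is not None:
            return self.annual_cost_override
        # No zonal/regional check: drives count for every cluster shape.
        return self.count * self.instance_annual_cost + self.drives_annual_cost


standard = ClusterCapacitySketch(
    count=3, instance_annual_cost=1000.0, drives_annual_cost=300.0
)
print(standard.annual_cost)  # 3300.0

shared_storage = ClusterCapacitySketch(
    count=2, instance_annual_cost=1000.0, annual_cost_override=500.0
)
print(shared_storage.annual_cost)  # 500.0
```

Because the value is derived on every access, it cannot drift from the count and prices it is computed from, which is the usual motivation for turning a stored field into a property.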
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
abersnaze approved these changes Jan 29, 2026
What am I trying to do?
Enable automated detection of unintended cost calculation changes when capacity models are modified.
Why did I do it this way?
- cluster_costs() for infrastructure, service_costs() for workload-dependent costs
- _sub_models() DFS: Same traversal as plan_certain() for consistency

Algorithm
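The traversal can be sketched roughly as below: a depth-first walk over `_sub_models()` that merges each model's `cluster_costs()` and `service_costs()` into one dictionary. This is an illustration under assumptions, not the merged implementation. The method signatures, the cost keys, and the classes `CostModelSketch`, `StoreSketch`, `CacheSketch`, and `CompositeSketch` are all hypothetical.

```python
from typing import Dict, List


class CostModelSketch:
    """Hypothetical base class; the real interface may take arguments."""

    def _sub_models(self) -> List["CostModelSketch"]:
        return []

    def cluster_costs(self) -> Dict[str, float]:
        return {}

    def service_costs(self) -> Dict[str, float]:
        return {}


class StoreSketch(CostModelSketch):
    def cluster_costs(self) -> Dict[str, float]:
        return {"store.zonal-clusters": 1200.0}

    def service_costs(self) -> Dict[str, float]:
        return {"store.backup": 50.0}


class CacheSketch(CostModelSketch):
    def cluster_costs(self) -> Dict[str, float]:
        return {"cache.zonal-clusters": 800.0}


class CompositeSketch(CostModelSketch):
    """A composite model that delegates part of its work to sub-models."""

    def _sub_models(self) -> List[CostModelSketch]:
        return [StoreSketch(), CacheSketch()]

    def cluster_costs(self) -> Dict[str, float]:
        return {"composite.regional-clusters": 300.0}


def aggregate_costs(root: CostModelSketch) -> Dict[str, float]:
    """Iterative DFS from root, summing costs that share a key."""
    totals: Dict[str, float] = {}
    stack = [root]
    while stack:
        model = stack.pop()
        for source in (model.cluster_costs(), model.service_costs()):
            for key, value in source.items():
                totals[key] = totals.get(key, 0.0) + value
        # Reversed so sub-models are visited in their declared order.
        stack.extend(reversed(model._sub_models()))
    return totals


costs = aggregate_costs(CompositeSketch())
print(sum(costs.values()))  # 2350.0
```

Walking the same tree for costing as for planning means a sub-model that contributes capacity to a plan also contributes its cost to the baseline.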
Architecture
graph LR
    A[Current Deployment] --> B[extract_baseline_plan]
    B --> C[DFS via _sub_models]
    C --> D[Each model: cluster_costs + service_costs]
    D --> E[Aggregated CapacityPlan]
    E --> F[Compare to baseline]

Tests
Regression tests verify costs stay stable. See tests/netflix/test_cost_regression.py.

🤖 Generated with Claude Code
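The "Compare to baseline" step can be illustrated with a small helper that diffs two cost dictionaries. This is a sketch only and does not reproduce the contents of test_cost_regression.py. `cost_regressions`, its tolerance parameter, and the cost keys in the example are hypothetical.

```python
import math
from typing import Dict, List


def cost_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    rel_tol: float = 1e-6,
) -> List[str]:
    """Describe every cost key that differs between baseline and current."""
    problems: List[str] = []
    for key in sorted(set(baseline) | set(current)):
        if key not in current:
            problems.append(f"{key}: missing from current costs")
        elif key not in baseline:
            problems.append(f"{key}: not in baseline")
        elif not math.isclose(baseline[key], current[key], rel_tol=rel_tol):
            problems.append(f"{key}: {baseline[key]} -> {current[key]}")
    return problems


baseline = {"store.zonal-clusters": 1200.0, "store.backup": 50.0}
current = {"store.zonal-clusters": 1320.0, "store.backup": 50.0}
for line in cost_regressions(baseline, current):
    print(line)  # store.zonal-clusters: 1200.0 -> 1320.0
```

A test built on this would assert that the returned list is empty, so an intentional pricing change has to update the stored baseline in the same commit.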