Add CostAwareModel mixin and extract_baseline_plan for cost regression testing #212

Merged
homatthew merged 4 commits into Netflix-Skunkworks:main from homatthew:mho/cost-routing-new on Jan 29, 2026

@homatthew homatthew (Contributor) commented Jan 21, 2026

What am I trying to do?

Enable automated detection of unintended cost calculation changes when capacity models are modified.

Why did I do it this way?

  • Mixin pattern: Not all models implement cost methods yet; allows incremental adoption
  • Two cost methods: cluster_costs() for infrastructure, service_costs() for workload-dependent costs
  • Reuses _sub_models() DFS: Same traversal as plan_certain() for consistency
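
To illustrate the mixin idea, here is a minimal sketch of how opt-in cost methods might look. Only the names `CostAwareModel`, `cluster_costs()`, and `service_costs()` come from this PR; the argument-free signatures, the return type, and the `CapacityModel`, `LegacyModel`, and `ToyCassandraModel` classes are assumptions made for illustration and are not the library's actual API.

```python
from typing import Dict


class CostAwareModel:
    """Hypothetical mixin: a model opts in by overriding the two cost hooks."""

    def cluster_costs(self) -> Dict[str, float]:
        # Infrastructure costs (instances, drives). Signature is an assumption.
        raise NotImplementedError(f"{type(self).__name__} has no cluster_costs()")

    def service_costs(self) -> Dict[str, float]:
        # Workload-dependent costs; defaulting to none is an assumption.
        return {}


class CapacityModel:
    """Stand-in for the real model base class (hypothetical)."""


class LegacyModel(CapacityModel):
    """A model that has not adopted cost methods yet."""


class ToyCassandraModel(CapacityModel, CostAwareModel):
    """A model that has adopted the mixin, with made-up prices."""

    def cluster_costs(self) -> Dict[str, float]:
        return {"cassandra.zonal-clusters": 12000.0}

    def service_costs(self) -> Dict[str, float]:
        return {"cassandra.net.inter.region": 300.0}


def supports_costs(model: object) -> bool:
    # Callers can detect adoption with a plain isinstance check.
    return isinstance(model, CostAwareModel)
```

Because adoption is an `isinstance` check, models such as `LegacyModel` keep working unchanged until they add cost methods, which is the incremental path the first bullet describes.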

Algorithm

extract_baseline_plan(model_name, desires):
    # 1. Convert CurrentClusters → ClusterCapacity (price instances + drives)
    # 2. DFS cycle check (error if model composes with itself)
    # 3. DFS traverse via _sub_models(), for each model:
    #    - Require CostAwareModel (error if missing)
    #    - Call cluster_costs() + service_costs()
    # 4. Return CapacityPlan with aggregated costs
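
Steps 2 to 4 can be sketched roughly as follows. This is not the PR's implementation: the registry dict, `ToyModel`, `LegacyModel`, and the signatures of `_sub_models()`, `cluster_costs()`, and `service_costs()` are hypothetical, step 1 (pricing instances and drives) is omitted, and the result is a plain dict rather than a `CapacityPlan`.

```python
from typing import Dict, List, Sequence, Tuple


class CostAwareModel:
    def cluster_costs(self) -> Dict[str, float]:
        return {}

    def service_costs(self) -> Dict[str, float]:
        return {}


class ToyModel(CostAwareModel):
    """Hypothetical model with fixed costs and named sub-models."""

    def __init__(self, cluster: Dict[str, float],
                 service: Dict[str, float], subs: Sequence[str] = ()):
        self._cluster, self._service, self._subs = cluster, service, tuple(subs)

    def cluster_costs(self) -> Dict[str, float]:
        return dict(self._cluster)

    def service_costs(self) -> Dict[str, float]:
        return dict(self._service)

    def _sub_models(self) -> List[str]:
        return list(self._subs)


class LegacyModel:
    """Composes nothing and lacks the mixin."""

    def _sub_models(self) -> List[str]:
        return []


def _check_acyclic(name: str, registry: Dict[str, object],
                   path: Tuple[str, ...] = ()) -> None:
    # Step 2: a model reappearing on its own DFS path is a cycle.
    if name in path:
        raise ValueError("composition cycle: " + " -> ".join(path + (name,)))
    for child in registry[name]._sub_models():
        _check_acyclic(child, registry, path + (name,))


def aggregate_costs(root: str, registry: Dict[str, object]) -> Dict[str, float]:
    _check_acyclic(root, registry)
    totals: Dict[str, float] = {}
    stack = [root]
    while stack:  # Step 3: iterative DFS over the composition tree.
        name = stack.pop()
        model = registry[name]
        if not isinstance(model, CostAwareModel):
            raise TypeError(f"model {name!r} does not implement CostAwareModel")
        for costs in (model.cluster_costs(), model.service_costs()):
            for label, amount in costs.items():
                totals[label] = totals.get(label, 0.0) + amount
        stack.extend(model._sub_models())
    return totals  # Step 4: aggregated costs.
```

Note that in this sketch a sub-model reachable through two parents is counted twice; whether the real traversal deduplicates such cases is not stated in the PR.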

Architecture

graph LR
    A[Current Deployment] --> B[extract_baseline_plan]
    B --> C[DFS via _sub_models]
    C --> D[Each model: cluster_costs + service_costs]
    D --> E[Aggregated CapacityPlan]
    E --> F[Compare to baseline]

Tests

Regression tests verify costs stay stable. See tests/netflix/test_cost_regression.py.
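
As a rough idea of what "costs stay stable" can mean in practice, the sketch below compares a freshly computed cost breakdown against a stored baseline and reports any label that drifted. It does not reproduce the contents of tests/netflix/test_cost_regression.py; the `cost_drift` helper, the cost labels, and the tolerance are assumptions.

```python
import math
from typing import Dict, Tuple


def cost_drift(baseline: Dict[str, float], current: Dict[str, float],
               rel_tol: float = 1e-6) -> Dict[str, Tuple[float, float]]:
    """Return {label: (baseline_cost, current_cost)} for every changed label.

    Labels missing on either side are treated as costing 0.0, so added or
    removed cost categories also show up as drift.
    """
    drift: Dict[str, Tuple[float, float]] = {}
    for label in sorted(set(baseline) | set(current)):
        old = baseline.get(label, 0.0)
        new = current.get(label, 0.0)
        if not math.isclose(old, new, rel_tol=rel_tol, abs_tol=1e-9):
            drift[label] = (old, new)
    return drift


def assert_costs_stable(baseline: Dict[str, float],
                        current: Dict[str, float]) -> None:
    # A regression test would fail here when a model change alters costs.
    drift = cost_drift(baseline, current)
    assert not drift, f"unexpected cost changes: {drift}"
```

Reporting the full set of drifted labels, rather than failing on the first mismatch, makes it easier to see whether a model change was intended to move costs.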

🤖 Generated with Claude Code

@homatthew homatthew force-pushed the mho/cost-routing-new branch from d938889 to 00d86c9 Compare January 21, 2026 23:37
@homatthew homatthew changed the title Add extract_baseline_plan and cluster-type routing Add extract_baseline_plan for Cassandra, EVCache, Kafka, Key-Value, Stateless Java Jan 21, 2026
@homatthew homatthew force-pushed the mho/cost-routing-new branch from 00d86c9 to 7dcbe3c Compare January 21, 2026 23:46
@homatthew homatthew force-pushed the mho/cost-baseline-and-methods branch from f8b2f69 to e1fff86 Compare January 21, 2026 23:54
@homatthew homatthew force-pushed the mho/cost-routing-new branch from 7dcbe3c to c4bd20c Compare January 21, 2026 23:54
@homatthew homatthew changed the base branch from mho/cost-baseline-and-methods to main January 23, 2026 00:03
@homatthew homatthew force-pushed the mho/cost-routing-new branch 3 times, most recently from b5e24fb to 53299d8 Compare January 26, 2026 21:10
@homatthew homatthew changed the title Add extract_baseline_plan for Cassandra, EVCache, Kafka, Key-Value, Stateless Java Always include drive cost when attached (fixes cost gap) Jan 26, 2026
@homatthew homatthew changed the title Always include drive cost when attached (fixes cost gap) Add CostAwareModel mixin and extract_baseline_plan for cost regression testing Jan 26, 2026
@homatthew homatthew force-pushed the mho/cost-routing-new branch 6 times, most recently from 41b41ea to 4bdd3a2 Compare January 28, 2026 19:21
@homatthew homatthew force-pushed the mho/cost-routing-new branch 6 times, most recently from 7e0c88e to ba6c001 Compare January 29, 2026 03:04
homatthew and others added 3 commits January 28, 2026 22:06
…n testing

This commit introduces infrastructure for tracking model cost changes:

- CostAwareModel mixin: Provides `current_cluster_cost()` and
  `proposed_cluster_cost()` methods for models to compute costs
- extract_baseline_plan(): Extracts baseline cluster information from
  existing capacity plans for cost comparison
- cluster_infra_cost(): Helper to calculate infrastructure costs
- Cost methods added to: Cassandra, EVCache, Kafka, KeyValue, StatelessJava
- New test_cost_regression.py for validating cost calculations
- Use class variables for cluster_type/service_name instead of hardcoded strings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ost bug

- Fix drive cost bug: remove is_zonal check so regional clusters with
  drives (e.g., Java app EBS root volumes) get drive costs included

- Clean up register_model: use Dict[str, str] for O(1) cluster_type
  duplicate detection instead of O(n) nested loop

- Remove unused requirement parameter from service_costs interface:
  all implementations explicitly ignored it (_ = requirement)

- Simplify extract_baseline_plan: remove fake requirement computation
  block (~30 lines) that no model actually used

- Fold service costs into costs dict inside _get_model_costs instead
  of redundant loop after the call

- Remove unused imports: BufferComponent, buffer_for_components,
  CapacityRequirement (from models/__init__.py and key_value.py)

- Keep cluster_type field on CurrentClusterCapacity: needed for
  composite models (key-value) to preserve types for sub-model filtering
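
The register_model cleanup described above can be pictured with the sketch below: a dict keyed by cluster_type gives average-case O(1) duplicate detection in place of a nested scan over every registered model. The `ToyRegistry` class, its method signature, and the error message are hypothetical and are not taken from the repository.

```python
from typing import Dict


class ToyRegistry:
    """Hypothetical registry that maps each cluster_type to one model name."""

    def __init__(self) -> None:
        self._models: Dict[str, object] = {}
        # cluster_type -> name of the model that claimed it
        self._cluster_type_owner: Dict[str, str] = {}

    def register_model(self, name: str, model: object, cluster_type: str) -> None:
        owner = self._cluster_type_owner.get(cluster_type)  # single dict lookup
        if owner is not None and owner != name:
            raise ValueError(
                f"cluster_type {cluster_type!r} already registered by {owner!r}"
            )
        self._cluster_type_owner[cluster_type] = name
        self._models[name] = model
```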

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…_plan

Key changes:
- Make ClusterCapacity.annual_cost a computed property (count * instance + drives)
- Add annual_cost_override for models with non-standard pricing (Aurora shared storage)
- Rename _convert_current_clusters to _extract_cluster_plan for clarity
- Require cluster_type for baseline extraction (validate instead of default)
- Use certain_int() in tests instead of verbose Interval()
- Simplify cassandra.py network_services call
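
A computed annual_cost with an override can be sketched as below. The field names other than `annual_cost` and `annual_cost_override` are invented for the example, and the formula assumes drive cost is priced per instance; the commit message only says "count * instance + drives", so the real arithmetic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyClusterCapacity:
    """Hypothetical stand-in for ClusterCapacity."""

    count: int
    instance_annual_cost: float
    # Assumed to be per instance. Included whenever drives are attached,
    # with no zonal-versus-regional check.
    drive_annual_cost: float = 0.0
    # Set only for non-standard pricing, such as shared storage.
    annual_cost_override: Optional[float] = None

    @property
    def annual_cost(self) -> float:
        if self.annual_cost_override is not None:
            return self.annual_cost_override
        return self.count * (self.instance_annual_cost + self.drive_annual_cost)
```

Deriving the value on access means the total cannot fall out of sync with the instance count or prices, while the override leaves an escape hatch for models whose pricing does not follow the per-instance formula.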

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@homatthew homatthew force-pushed the mho/cost-routing-new branch from ba6c001 to da1ba9e Compare January 29, 2026 06:04
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@homatthew homatthew merged commit 09f457a into Netflix-Skunkworks:main Jan 29, 2026
4 checks passed
@homatthew homatthew deleted the mho/cost-routing-new branch February 18, 2026 17:38
3 participants
