Skip to content

Conversation

@jimmysway
Copy link

  • Add load_outages_from_nerc_rates() to fetch outages from nerc-rates repository
  • Update calculate_storage_gb_hours to auto-load outages for "stack" and "ocp-prod" services
  • Maintain backwards compatibility with --excluded-time-ranges flag
  • Add test coverage for nerc-rates integration

Closes #256

@jimmysway jimmysway force-pushed the fix/256-pulling-outages branch from e19501e to ef716a1 Compare October 31, 2025 19:16
@jimmysway
Copy link
Author

Based on previous discussions I think the service mapping was something like

OpenStack resources → stack
OpenShift resources → ocp-prod (or should it be both ocp-prod and ocp-test?)

Is there also other services that need to be included?

@knikolla
Copy link
Collaborator

Based on previous discussions I think the service mapping was something like

OpenStack resources → stack OpenShift resources → ocp-prod (or should it be both ocp-prod and ocp-test?)

Is there also other services that need to be included?

stack only applies to the OpenStack ColdFront Resource with name NERC
ocp-prod only applies to the OpenShift ColdFront Resource with name NERC-OCP
academic only applies to the OpenShift ColdFront Resource with name NERC-OCP-EDU

The rest aren't in ColdFront at the moment.

Please create a map as a global variable in the calculate_storage_gb_hours.py file to do the translation so you can make the appropriate call to nerc-rates.

@knikolla
Copy link
Collaborator

Created a related issue as a second step, after this PR is merged, to remove the PR and have the mapping be stored as attributes in the Resources themselves. For now, a dict map in calculate_storage_hours is fine.

#262

Copy link
Collaborator

@knikolla knikolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review in prior comments. Please implement a dictionary map to fetch the appropriate outage based on cluster name map.

@jimmysway jimmysway force-pushed the fix/256-pulling-outages branch from ef716a1 to cf42501 Compare October 31, 2025 20:27
- Add load_outages_from_nerc_rates() to fetch outages from nerc-rates repository
- Update calculate_storage_gb_hours to use mapping dict that matches the ColdFront Resource name to the nerc-rates service name
- Maintain backwards compatibility with --excluded-time-ranges flag
- Add test for nerc-rates integration

Closes nerc-project#256
@jimmysway jimmysway force-pushed the fix/256-pulling-outages branch from cf42501 to f687493 Compare October 31, 2025 20:28
@jimmysway jimmysway requested a review from knikolla October 31, 2025 20:29
logger.debug(msg)
logger.debug(f"Starting billing for allocation {allocation_str}.")

if not options["excluded_time_ranges"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code block is repeated exactly (or almost identically) down below, so I don't think is quite the right place to implement this.


On line 243, you have the following code block

if options["excluded_time_ranges"]:
            excluded_intervals_list = utils.load_excluded_intervals(
                options["excluded_time_ranges"]
            )
        else:
            excluded_intervals_list = None

This would be a better place for loading the outages. You can split that out from there and define a function as follow.

@functools.cache()
def get_outages_for_service(service):
    if options["excluded_time_ranges"]:
        return utils.load_excluded_intervals(
            options["excluded_time_ranges"]
        )
    else:
        return utils.load_outages_from_nerc_rates(
            options["start"],
            options["end"],
            RESOURCE_NAME_TO_NERC_SERVICE[service_name]
        )

You'll need to pass in options or allow the function to fetch them from either other functions or global variables. This will also cache the result.


process_invoice_row(allocation, attrs, su_name, rate) on line 211, takes the allocation as a parameter. Therefore you are aware of the allocation, and by accessing allocation.resources.first().name you can get the associated Resource name of that allocation, which you can use to call the get_outages_for_service(service).


@knikolla
Copy link
Collaborator

knikolla commented Nov 3, 2025

@jimmysway actually, seeing the asymmetry between providing CLI args (apply to all resources) and loaded from nerc-rates (apply to the specific resources as listed in the specific outage,) I'm okay with dropping the possibility to provide excluded intervals from the CLI.

@jimmysway
Copy link
Author

@jimmysway actually, seeing the asymmetry between providing CLI args (apply to all resources) and loaded from nerc-rates (apply to the specific resources as listed in the specific outage,) I'm okay with dropping the possibility to provide excluded intervals from the CLI.

@knikolla Are there any downstream consequences for something like this that I should be aware of?

@knikolla
Copy link
Collaborator

knikolla commented Nov 3, 2025

@jimmysway actually, seeing the asymmetry between providing CLI args (apply to all resources) and loaded from nerc-rates (apply to the specific resources as listed in the specific outage,) I'm okay with dropping the possibility to provide excluded intervals from the CLI.

@knikolla Are there any downstream consequences for something like this that I should be aware of?

No, the command is either executed manually through a terminal, or through a cronjob that doesn't provide any excluded intervals when executing the command.

EDIT TO ADD: Furthermore the command is very specific to how we bill, so I'm not worried about potentially breaking someone who might be using it.

Comment on lines 93 to 95
@patch("coldfront_plugin_cloud.utils.load_outages_from_nerc_rates")
def test_calculate_storage_with_nerc_rates_outages(self, mock_load_outages):
"""Test that nerc-rates outages are loaded based on resource name mapping."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building on @knikolla's previous feedback, I would suggest you remove this test, and mock load_outages_from_nerc_rates in test_new_allocation_quota. You can then make your assertions on mock_load_outages there as you see fit

Removed option to has outages via CLI
@jimmysway jimmysway force-pushed the fix/256-pulling-outages branch from 2bc9771 to 634e3f2 Compare November 3, 2025 19:58
Moved `load+from_url` call outside function so outages load once per run.
Copy link
Collaborator

@knikolla knikolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the current testing is sufficient. You haven't added any new test while adding functionality.

Please add a test that mocks nerc-rates and verifies that get_outages_during is called with the appropriate parameters. You can patch the mapping dictionary to make things easier for resources that don't exist.

You can also test that the mocked output of get_outages_during is correctly used.

… up mock outages, creates test data, verifies call is called with correct parameters, verifies output is actually reflects reduced hours
@jimmysway jimmysway force-pushed the fix/256-pulling-outages branch from d307f6b to 01d2815 Compare November 4, 2025 20:30
Copy link
Collaborator

@knikolla knikolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One final comment, otherwise looks ready to go. Good work.

mock_load_outages.return_value = mock_outages
mock_rates_loader.return_value.get_value_at.return_value = Decimal("0.001")

resource = self.new_openstack_resource(name="NERC")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a different name and patch the dictionary that does the translation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on an update because I realised I am just mocking the function load_outages_from_nerc_rates and we only verified that the wrapper was called successfully. I think I mocking it one level lower is better because then the load_outages_from_nerc_rates actually runs so that the nerc-rates get_outages_during method gets called with correct ISO-formatted dates and service name

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement support for pulling outages from nerc-rates

3 participants