Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add system API endpoints for cycle cost calculation #4110

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

michael-weigelt
Copy link

@michael-weigelt michael-weigelt requested a review from a team as a code owner January 27, 2025 15:31
@github-actions github-actions bot added the interface-spec Changes to the IC Interface Specification label Jan 27, 2025
Copy link

github-actions bot commented Jan 27, 2025

🤖 Here's your preview: https://c4y23-xqaaa-aaaam-abayq-cai.icp0.io

- body
- transform - i.e., the sum of the transform method name length and the length of the transform context

`max_res_bytes` is the maximum response length the caller wishes to accept.
Copy link
Contributor

@mraszyk mraszyk Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`max_res_bytes` is the maximum response length the caller wishes to accept.
`max_res_bytes` is the maximum response length the caller wishes to accept (the caller should provide the default value of `2,000,000` if no maximum response length is provided in the actual request).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this syscall, the max_res_bytes is not optional, so there is no default value here. If the user follows the documentation of http_request, they will get the full picture. But I think here this formulation is misleading.

In other words: The function signature tells the caller that they MUST provide this data explicitly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a sentence to clarify, let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this syscall, the max_res_bytes is not optional, so there is no default value here.

There is a default value in the replica implementation that the caller has to supply here because system API doesn't allow easily modelling optional values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a sentence to clarify, let me know what you think.

I don't see it. Could you please share a link?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another suggestion: We can specify that the passed value is clamped to the true replica's maximum. So if the user does not want to bound the max_response_bytes in the call to the management canister, they can supply u64::MAX here and we clamp it to the true 2_000_000 (or whatever this evolves to).

This would save the caller from having to know the true maximum when they only want to call mgmt with None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a reasonable suggestion, but I'll bring this up in the spec meeting to gather more feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrased the suggestion to

(the caller should provide the default value of `2,000,000` if no maximum response length is provided in the actual request).

(as discussed separately)

Copy link
Contributor

@mraszyk mraszyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also add functions into the formal semantics: the actual return value can be abstracted over as arbitrary(), but the rest should follow, e.g., ic0.canister_cycles_balance128.

@oggy-dfin
Copy link
Member

Here's an idea for a different approach:

  1. We introduce just two new API functions, ic0.cost_new, and ic0.cost_perform, that correspond to call_new and call_perform; the idea is that we simulate executing the call
  2. cost_new takes the same arguments as call_new
  3. Between cost_new and cost_perform, we can either allow ic0.call_data_append or a new ic0.cost_data_append function, whatever is simpler
  4. cost_perform would return two values, the cost in cycles, and the number of cycles that should be attached to the call. We could return this as two pairs of i64 values, or by specifying the destination on the heap, whatever is deemed simpler.

The advantages as I see them:

  1. Simpler implementation for the CDKs and/or users (depending on how we choose to expose this in the CDKs). They don't need to understand or do anything special for the different call types.
  2. No API additions needed in the future (unlike the current proposal where we'd have to add new System API calls whenever we add new functionality to the management canister)
  3. The cost computation is as precise as possible

The only downside that I see is that the cost of the cost_* functions themselves can't be determined in advance, but I think that will be the case whenever we need to parse the payload.

@michael-weigelt
Copy link
Author

michael-weigelt commented Jan 31, 2025

@oggy-dfin thanks for the input. I like that this approach leads to a smaller API surface and covers current and future use cases.

The problem I see is with parsing the payload, as you have pointed out yourself: This is cheap and constant effort for the ecdsa, schnorr and vetkey use cases. But for http_request, we have to get max_response_bytes out of the payload, which means copying and deserializing a user-provided blob of up to 2MB. So the cost of cost_* scales with the payload, and might be significant (it takes ~900M (EDIT: ~11M) instructions to deserialize 1.9MB).
It also means that we have to depend on candid and the mgmt types in the system API impl.

So we would buy extensibility with 1) more complexity in the impl and 2) more costs to the caller. But is extensibility really worth that? I would consider adding a new system API endpoint of an existing type is a small effort already. Although granted, it is less pleasant than an API that just covers it by design.

Another downside of simulating a call is that you have to have your payload ready to know the cost. If you have access to the proposed API, where you can just provide some payload lengths, you can make decisions based on costs before deciding on the details of a call. Not sure how interesting this is in reality, but it would suck if people had to compose some dummy data just to predict the cost of a similar call.

@mraszyk
Copy link
Contributor

mraszyk commented Jan 31, 2025

~900M instructions to deserialize 1.9MB

is this benchmark using serde_bytes for decoding Vec?

It also means that we have to depend on candid and the mgmt types in the system API impl.

a dependency on candid already exists in the system-api crate for the sake of routing

more complexity in the impl

this complexity can't be avoided, it can only be pushed to the CDKs

@michael-weigelt
Copy link
Author

is this benchmark using serde_bytes for decoding Vec?

I measured the instructions on a canister that turns a Vec into a struct using candid::Decode!(). Whether candid uses serde_bytes itself, idk.

this complexity can't be avoided, it can only be pushed to the CDKs

Yes. But there, the precise call type is already known without inspecting a blob.

@mraszyk
Copy link
Contributor

mraszyk commented Jan 31, 2025

I measured the instructions on a canister that turns a Vec into a struct using candid::Decode!(). Whether candid uses serde_bytes itself, idk.

serde_bytes are used if the type you want to decode contains corresponding serde_bytes annotation. Could you please share the type definition you're decoding to?

@oggy-dfin
Copy link
Member

So we would buy extensibility with 1) more complexity in the impl and 2) more costs to the caller. But is extensibility really worth that? I would consider adding a new system API endpoint of an existing type is a small effort already. Although granted, it is less pleasant than an API that just covers it by design.

I think what rubs me the wrong way about the current proposal and the problem of possible future extensions is that the couples the System API (which is currently oblivious to the existence of the management canister and all other functionality) with these other special aspects of the system. Obviously my proposal would also couple them, just at the implementation level (by having to understand the payload) instead of the API level, and in some sense it's inevitable if we want to provide a synchronous API.

@michael-weigelt
Copy link
Author

While I agree with your analysis, I'd make a different trade-off. Having an API that is agnostic to the details has these disadvantages:

  • The system API impl gets a blob -> the logic of understanding mgmt canister payloads is now in the system API impl too. Note that if we push this to the caller, i.e., the CDK, it does not add more coupling, because the CDK already needs to understand these payloads anyway. But having a set of switch statements that fans out all the possible blob types in the system API impl is not pretty, especially considering the caller had the precise information available in the first place.
  • The difference in cost of passing a blob versus passing a few u64s is significant, both in terms of cycles and actual work performed by the replica.
  • Asking the user to provide a well-formed payload when they potentially just want to make a decision based on some sizes and lengths is not that ergonomic.


The cost of a canister http outcall via [`http_request`](#ic-http_request). `request_size` is the sum of the byte lengths of the following components of an http request:
- url
- method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- method

this is not part of the formula and it'd be unclear anyway how to compute the byte length of the enumeration variant


- `ic0.cost_sign_with_schnorr(src : I, size : I, algorithm: i32, dst : I) -> ()`; `I ∈ {i32, i64}`

- `ic0.cost_vetkd_derive_encrypted_key(src : I, vetkd_curve: i32, size : I, dst : I) -> ()`; `I ∈ {i32, i64}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't src and size be next to each other?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interface-spec Changes to the IC Interface Specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants