@marcelloceschia

@marcelloceschia marcelloceschia commented Feb 10, 2026

This PR adds full support for AWS Bedrock inference profiles (both application-inference-profile and inference-profile ARNs) with automatic model resolution to detect underlying model capabilities. This fixes an issue where prompt caching was not being enabled for inference profile ARNs because the extension couldn't determine the underlying model's capabilities.
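
To illustrate the distinction between the two ARN types mentioned above, here is a minimal sketch of how an ARN could be classified before resolution. The helper name `parseBedrockArn` and the `ParsedBedrockArn` shape are assumptions made for illustration and are not taken from this PR's code.

```typescript
// Resource kinds this sketch distinguishes (illustrative naming).
type BedrockArnKind =
  | "application-inference-profile"
  | "inference-profile"
  | "foundation-model"
  | "other";

interface ParsedBedrockArn {
  region: string;
  accountId: string; // empty for foundation-model ARNs, which carry no account
  kind: BedrockArnKind;
  resourceId: string;
}

// Hypothetical helper: splits an ARN of the form
// arn:<partition>:bedrock:<region>:<account>:<resource-type>/<resource-id>
function parseBedrockArn(arn: string): ParsedBedrockArn | undefined {
  const match = /^arn:[^:]+:bedrock:([^:]*):([^:]*):([^:/]+)\/(.+)$/.exec(arn);
  if (!match) return undefined;

  const region = match[1] ?? "";
  const accountId = match[2] ?? "";
  const resourceType = match[3] ?? "";
  const resourceId = match[4] ?? "";

  let kind: BedrockArnKind;
  switch (resourceType) {
    case "application-inference-profile":
    case "inference-profile":
    case "foundation-model":
      kind = resourceType;
      break;
    default:
      kind = "other";
  }
  return { region, accountId, kind, resourceId };
}
```

Only the two profile kinds would need a lookup against AWS; a `foundation-model` ARN already names the model directly.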

Implementation

The implementation includes:

  1. New Bedrock Inference Profile Resolver (src/api/providers/bedrock-inference-profile-resolver.ts)

    • Added @aws-sdk/client-bedrock dependency for AWS Bedrock API access
    • Resolves application inference profile ARNs to underlying model ARNs
    • Resolves standard inference profile ARNs to model IDs
    • Caches resolved models to minimize API calls
    • Handles AWS credential configuration automatically
  2. Enhanced Bedrock Provider (src/api/providers/bedrock.ts)

    • Integrated inference profile resolver into the Bedrock provider
    • Automatic detection of model capabilities (prompt caching, extended context, reasoning budgets) from resolved models
    • Backward compatible - works with direct model IDs and ARNs
  3. Improved Settings UI

    • Added real-time ARN resolution feedback in the Bedrock custom ARN settings (webview-ui/src/components/settings/providers/BedrockCustomArn.tsx)
    • Shows resolved model information when entering inference profile ARNs
    • Enhanced user experience with visual feedback during resolution
  4. Comprehensive Test Coverage (src/api/providers/__tests__/)

    • Unit tests for both application inference profiles and standard inference profiles
    • Tests for caching behavior and error handling
    • Tests for capability detection from resolved models
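
The caching behavior described for the resolver can be sketched as follows. This is a simplified illustration, not the code from `bedrock-inference-profile-resolver.ts`: the class name, the `ProfileLookup` type, and the choice to take the first model ARN are all assumptions. The AWS call is injected as a function so the sketch runs without credentials; in practice it would plausibly wrap `GetInferenceProfileCommand` from `@aws-sdk/client-bedrock`, whose response lists the profile's `models` with a `modelArn` each.

```typescript
// Hypothetical lookup signature: given a profile ARN, return the model ARNs
// behind it. Injected so the sketch does not depend on the AWS SDK.
type ProfileLookup = (profileArn: string) => Promise<string[]>;

class InferenceProfileResolver {
  // Cache the in-flight promise so concurrent callers share one API call.
  private readonly cache = new Map<string, Promise<string | undefined>>();
  private readonly lookup: ProfileLookup;

  constructor(lookup: ProfileLookup) {
    this.lookup = lookup;
  }

  resolve(profileArn: string): Promise<string | undefined> {
    const cached = this.cache.get(profileArn);
    if (cached) return cached;

    const pending = this.lookup(profileArn)
      // Assumption: the first listed model is representative of the profile.
      .then((modelArns) => modelArns[0])
      .catch((error) => {
        // Do not keep failures in the cache, so a later call can retry.
        this.cache.delete(profileArn);
        throw error;
      });

    this.cache.set(profileArn, pending);
    return pending;
  }
}
```

Storing the promise rather than the resolved value means that repeated resolutions of the same ARN, for example while the settings UI re-renders, trigger at most one lookup.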

Screenshots

Before: Inference profile ARNs were treated as unknown models, prompt caching disabled
After: Inference profiles are resolved to underlying models with full capability detection (screenshot: new resolved model overview)

How to Test

  1. Configure AWS Bedrock provider in settings
  2. Use the "Custom ARN" option and enter an inference profile ARN, such as:
    • Application inference profile: arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile
    • Standard inference profile: arn:aws:bedrock:us-west-2:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0
  3. Verify that the UI shows the resolved model information
  4. Start a conversation and confirm that prompt caching works correctly (check API usage logs)
  5. Verify that model capabilities (extended context, reasoning budgets) are correctly detected
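
For the capability checks in steps 4 and 5, the following sketch shows one way a resolved model could be mapped to capabilities. Everything here is hypothetical: the `ModelCapabilities` fields, the helper names, the list of geographic prefixes (which is partial), and the table entry, whose values are illustrative rather than authoritative. The real logic lives in the provider code changed by this PR.

```typescript
interface ModelCapabilities {
  supportsPromptCache: boolean;
  supportsReasoningBudget: boolean;
}

// Fallback for models the table does not know about.
const UNKNOWN_MODEL: ModelCapabilities = {
  supportsPromptCache: false,
  supportsReasoningBudget: false,
};

// Illustrative capability table keyed by base model ID (placeholder values).
const CAPABILITIES = new Map<string, ModelCapabilities>([
  [
    "anthropic.claude-3-5-sonnet-20241022-v2:0",
    { supportsPromptCache: true, supportsReasoningBudget: false },
  ],
]);

// Reduce a model ARN or inference profile ID to a base model ID by dropping
// everything up to the last "/" and a leading geographic prefix such as "us.".
function baseModelId(resolved: string): string {
  const slash = resolved.lastIndexOf("/");
  const id = slash >= 0 ? resolved.slice(slash + 1) : resolved;
  return id.replace(/^(us|eu|apac)\./, "");
}

function capabilitiesFor(resolved: string): ModelCapabilities {
  return CAPABILITIES.get(baseModelId(resolved)) ?? UNKNOWN_MODEL;
}
```

With a table like this, an unresolved profile falls back to `UNKNOWN_MODEL`, which corresponds to the "before" behavior where prompt caching stayed disabled.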

changeset-bot bot commented Feb 10, 2026

🦋 Changeset detected

Latest commit: 8f26ece

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
Name Type
kilo-code Minor
@roo-code/types Patch
@kilocode/cli Patch
@roo-code/web-evals Patch
@roo-code/web-roo-code Patch
@kilocode/agent-runtime Patch
@roo-code/cloud Patch
@kilocode/core-schemas Patch
@roo-code/core Patch
@roo-code/evals Patch
@roo-code/ipc Patch
@roo-code/telemetry Patch

marcelloceschia and others added 3 commits February 10, 2026 18:47
Synced with latest main branch changes including:
- Version updates to 5.6.0
- Agent manager improvements
- Slovak translation additions
- Various bug fixes and improvements

Resolved conflicts:
- CHANGELOG.md: Added 5.6.1 entry for Bedrock inference profile support
- src/package.json: Updated to version 5.6.0 from main

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@Githubguy132010
Contributor

Not needed per se, but I would prefer a more up-to-date model as the default instead of Sonnet 3.5.

@marcelloceschia
Author

I will update the list after this is merged. Our AWS solutions architect also gave us a hint about how the 1M context window works for Sonnet 4.5 and Opus 4.5.

@Githubguy132010
Contributor

Githubguy132010 commented Feb 10, 2026

I will update the list after this is merged. Our AWS solutions architect also gave us a hint about how the 1M context window works for Sonnet 4.5 and Opus 4.5.

Alright LGTM then.

@marcelloceschia
Author

marcelloceschia commented Feb 12, 2026

Thank you @kevinvandijk for taking the time to review these changes. Feel free to suggest further changes.
