Skip to content

Conversation

twoGiants
Copy link
Member

@twoGiants twoGiants commented Sep 30, 2025

Changes

This PR takes the state of PR #8825, starting from commit f817655 (actually 087c5a5 but the first is the big important one) and adds the re-work from the second review of @twoGiants on top of it.

All resolved comments are marked with a 👍 and a "resolved" comment.

What is left to do?

  • add back in InitializeFromConfigMap and implement functionality => maybe in Resolver.Initialize
  • add comment which test case covers cache misses implicitly: pkg/remoteresolution/resolver/cluster/resolver_test.go:TestResolve
  • encapsulate annotated resource => package scope
  • fix e2e tests
  • merge cache, cache/injection, framework/cache packages into => framework/cache/**
  • add/fix unit tests
  • resolve all of your TODO comments
  • implement config watching for cache

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • Has Tests included if any functionality added or changed
  • pre-commit Passed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

NONE

@tekton-robot tekton-robot added the release-note-none Denotes a PR that doesnt merit a release note. label Sep 30, 2025
@tekton-robot tekton-robot requested a review from jerop September 30, 2025 14:11
@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign jeromeju after the PR has been reviewed.
You can assign the PR to them by writing /assign @jeromeju in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Sep 30, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@tekton-robot tekton-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 30, 2025
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/remoteresolution/cache/injection/cache.go Do not exist 0.0%
pkg/remoteresolution/resolver/framework/cache.go Do not exist 0.0%
pkg/remoteresolution/resolver/framework/reconciler.go 75.3% 73.8% -1.6
pkg/remoteresolution/resolver/http/resolver.go 88.9% 100.0% 11.1
pkg/resolution/resolver/bundle/bundle.go 71.1% 71.4% 0.4
pkg/resolution/resolver/bundle/params.go 87.9% 86.1% -1.8
pkg/resolution/resolver/cluster/resolver.go 80.4% 79.9% -0.6
pkg/resolution/resolver/git/resolver.go 84.9% 84.3% -0.6

@waveywaves
Copy link
Member

waveywaves commented Sep 30, 2025

@twoGiants I have been updating the original PR

@waveywaves
Copy link
Member

I will contribute here instead

@vdemeester vdemeester added this to the v1.6.0 (LTS) milestone Oct 7, 2025
@waveywaves waveywaves added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 13, 2025
brianwcook and others added 17 commits October 14, 2025 10:29
This commit adds caching for bundle and git resolvers to reduce
chances of being rate limited by registries and git forges.

- Add cache interface and in-memory implementation
- Add cache configuration options to docs
- Add a simple 'always on or off' option for cluster resolver
- Add documentation for cache configuration

Signed-off-by: Brian Cook <[email protected]>
- Remove dead GetGlobalCache function and globalCache variable
- Update tests to use dependency injection cache instead of global cache
- Add cache clearing in test setup to ensure test isolation
- Fix all GetGlobalCache references in cache_test.go and resolver tests

This resolves the shared cache state issue that was causing e2e test
failures (TestPropagatedParams, TestLargerResultsSidecarLogs) by ensuring
each test gets a clean cache state.
Summary:
- changed year in new files to 2025
- change `key` to package scope
- change constants to package scope
- remove err from GenerateCacheKey
- remove error return from GenerateCacheKey
  - => now many tests and code is broken
- revert `BundleResolverName` removal
- revert rename to resolver_test.go
- remove commented test
- simplify if condition

Signed-off-by: Stanislav Jakuschevskij <[email protected]>
Summary:
- removed unused constants
- changed encapsulation to package scope on constants and funcs which
aren't used outside
- removed unused functions and tests
- inlined a few functions
- changed `CacheAwareResolver` to `ImmutabilityChecker`
  - simplified interface method signature
  - moved test to cache_test.go
- removed commented code
- restored deleted tests for bundle resolver_test.go

Signed-off-by: Stanislav Jakuschevskij <[email protected]>
- rename singleton cache constructor
- put back `InitializeFromConfigMap` and `TestInitializeFromConfigMap`
- extract strings into constants and encapsulate to package scope
- cleanup remoteresolution resolvers
  - put interface implemtation checks for Resolver at the top
  - add Resolver interface implemtation check where it was missing
  - remove "string" type from string constants
  - use constants for `GetName` and `GetConfigName`
  - use "_" for unused method parameters
  - rearrange methods to (mostly) resemble Resolver interface method order
- delete useless tests
- remove redundant comments
- change switch to if in `ValidateCacheMode` and removed string return value
- add `// TODO` reminders

Signed-off-by: Stanislav Jakuschevskij <[email protected]>
- set generateCacheKey to package scope and remove unused error return
- rewrite Get, Add, Remove, Clear to generate the key from function parameters
  - add `DEPRECATED_*` prefix to legacy methods
- create `RunCommonCacheOperations` in cache framework and use in resolvers
- remove duplication for bundle, cluster and git resolvers

Signed-off-by: Stanislav Jakuschevskij <[email protected]>
- the cluster resolver's Initialize method now loads cache configuration from a ConfigMap if available,
- if the ConfigMap is not found, the cache falls back to default settings ensuring backward compatibility
- doc added to clarify all test cases in TestResolve function implicitly test cache miss scenarios since they represent first-time resource resolution attempts
- add sync.Once mechanism to injection package for thread-safe initialization
- create InitializeSharedCache function to ensure cache is initialized only once
- update cluster resolver to call InitializeSharedCache instead of direct InitializeFromConfigMap
- replace TODO comment with proper documentation explaining safe usage pattern
- prevent cache recreation and data loss when multiple resolvers initialize

Signed-off-by: Vibhav Bobade <[email protected]>
- populate empty expectedKey fields with correct SHA256 hash values
- fix linting error by changing t.Error to t.Errorf for format strings
- remove TODO comment indicating broken test since tests now pass

Signed-off-by: Vibhav Bobade <[email protected]>
- add sleepDuration constant to test/util.go for DAG test timing validation
- add ignorePipelineRunStatus and ignoreTaskRunStatus variables for status comparison tests
- fix whitespace linting issue in cache_test.go by removing unnecessary leading newline
- variables were accidentally removed in commit 2b30061 during cache refactoring

Signed-off-by: Vibhav Bobade <[email protected]>
- set default resolveRequestFunc when nil in bundle resolver Resolve method
- fix resolverFake IsImmutable to check bundle digest instead of always returning true
- ensure bundle resolver works correctly when Initialize is not called first
- fix TestShouldUseCachePrecedence test case for auto mode with tags

Signed-off-by: Vibhav Bobade <[email protected]>
- remove duplicate clearCache function from resolver_cache_test.go
- enhance clearResolverCaches in util.go with cache verification logic
- update all clearCache calls to use clearResolverCaches for consistency

Signed-off-by: Vibhav Bobade <[email protected]>
- change AnnotatedResource type to unexported annotatedResource
- change NewAnnotatedResource function to unexported newAnnotatedResource
- update resolver tests to check annotations directly without type assertion
- remove TODO comment in resolver_test.go

Signed-off-by: Vibhav Bobade <[email protected]>
- remove unused import to fix linting error

Signed-off-by: Vibhav Bobade <[email protected]>
@waveywaves waveywaves force-pushed the brianwcook-resolver-cache branch from 05fe2d2 to 4bc9258 Compare October 14, 2025 04:59
- initialize shared resolver cache once in cmd/resolvers/main.go at startup
- remove sync.Once mechanism from cache injection layer
- remove cache initialization logic from cluster resolver Initialize method
- update comments to reflect new initialization approach
- simplify code by ensuring cache is initialized before any resolvers start

Signed-off-by: Vibhav Bobade <[email protected]>
Add defensive error handling to verify resolvers are properly initialized
before Resolve() is called, making the contract explicit and preventing
potential nil pointer dereferences.

- add nil checks in bundle resolver for kubeClientSet
- add nil checks in cluster resolver for pipelineClientSet
- add nil checks in git resolver for kubeClient, logger, and cache
- add nil checks in http resolver for kubeClient and logger
- update unit tests to properly initialize resolvers before testing
- use pattern of initializing with full context, then overlaying test context

The framework already guarantees Initialize() is called, but these checks
provide extra safety and clearer error messages if the contract is violated.

Signed-off-by: Vibhav Bobade <[email protected]>
Enhance e2e tests to verify cache behavior through resolver pod logs
in addition to checking annotations, providing more comprehensive
validation that caching is working correctly.

- add getResolverPodLogs helper to retrieve logs from tekton-resolvers pod
- add log verification to TestBundleResolverCache to check for cache operations
- verify "Cache miss" and "Adding to cache" logs on first request
- verify "Cache hit" log on second request
- import resolverconfig and system packages for namespace resolution

Signed-off-by: Vibhav Bobade <[email protected]>
Run gofmt to fix import grouping and alphabetical ordering.

Signed-off-by: Vibhav Bobade <[email protected]>
@waveywaves
Copy link
Member

/retest

2 similar comments
@waveywaves
Copy link
Member

/retest

@waveywaves
Copy link
Member

/retest

Add comprehensive test coverage for cache framework components to ensure
robust validation and operation of the resolver cache system. Tests verify
cache mode validation, cache operations, and injection mechanisms.

- add tests for ValidateCacheMode function covering all valid/invalid modes
- add tests for RunCommonCacheOperations covering cache hit/miss/error scenarios
- add tests for GetResolverCache and InitializeSharedCache in injection package
- add tests for invalid cache parameter handling (defaults to auto mode)
- fix import formatting and remove duplicate v1 import
- fix context usage to use t.Context() instead of context.Background()
- increase cache.go coverage from 0% to 100%
- increase injection/cache.go coverage from 0% to 63.6%
- increase overall framework package coverage to 79.6%

Signed-off-by: Vibhav Bobade <[email protected]>
Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, still some questions/comments and todos

  • There is still some TODOs (the DEPRECATED functions in pkg/remoteresolution/cache package)
  • I am not a huge fan of loading the configmap when we start and not do anything else. I would prefer if we do the same as we do for the other configmap, we watch it from the controller, and we adapt from changes.
    • As it is today, if you change the configuration for cache, you need to restart the deployment of resolvers
    • If someone wants to enable cache on a running instance, same then, a manual intervention is needed


// Initialize shared resolver cache from ConfigMap
logger := logging.FromContext(ctx)
kubeClient, err := kubernetes.NewForConfig(cfg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should already have parsed and loaded a client from cfg := injection.ParseAndGetRESTConfigOrDie(), so in theory you could do the following kubeclientset := kubeclient.Get(ctx) to get the client.

import (
// [...]
	kubeclient "knative.dev/pkg/client/injection/kube/client"
// [...]
)

func main() {
// [...]
	kubeclientset := kubeclient.Get(ctx)
// [...]
}

Comment on lines 82 to 89
resolverNS := resolverconfig.ResolversNamespace(system.Namespace())
configMap, err := kubeClient.CoreV1().ConfigMaps(resolverNS).Get(ctx, "resolver-cache-config", metav1.GetOptions{})
if err != nil {
logger.Debugf("Could not load resolver-cache-config ConfigMap: %v. Using default cache configuration.", err)
} else {
cacheinjection.InitializeSharedCache(configMap)
logger.Info("Initialized resolver cache from ConfigMap")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading it here and that way, means a change in the configmap would change the behavior, which is kind-of inconsistent with the other parts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I removed this and implemented a config map watcher for the cache. The same as in the resolvers. It's not yet tested and has probably a few bugs, but the idea should be correct now.

type key struct{}

// sharedCache is the shared cache instance used across all contexts
var sharedCache = cache.NewResolverCache(cache.DefaultCacheSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we also initialize this, no matter if the instance is configured with cache or not.

Comment on lines 103 to 106
// Check for :<tag>@sha256: pattern
if strings.Contains(bundleRef, ":") && strings.Contains(bundleRef, "@sha256:") {
return true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this is here as it would have already been taken care of in the previous case ?

Comment on lines 96 to 101
// Checks if the given string looks like an OCI pull spec by digest.
// A digest is typically in the format of @sha256:<hash> or :<tag>@sha256:<hash>
// Check for @sha256: pattern
if strings.Contains(bundleRef, "@sha256:") {
return true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the digest is always at the end, we could also have a regexp, but it might be overkill as I think the @ character is not allowed in the image reference

Comment on lines 118 to 120
if r.kubeClientSet == nil {
return nil, errors.New("bundle resolver not properly initialized: Initialize() must be called before Resolve()")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a very defensive check, do we really need it ?

Comment on lines 91 to 94
// Verify resolver was initialized - pipelineClientSet must be set by Initialize()
if r.pipelineClientSet == nil {
return nil, errors.New("cluster resolver not properly initialized: Initialize() must be called before Resolve()")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, not sure it is needed.

TTL: r.ttl,
Params: params,
// IsImmutable implements ImmutabilityChecker.IsImmutable
// Returns true if the revision parameter is a commit SHA (40-character hex string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick note : as of today, a full git SHA is SHA-1, but tomorrow (git v3), it would/could but SHA-256, so we could try to take this case into account today already ? (or create a follow-up issue)

@twoGiants twoGiants changed the title 🚧 👷 Re-work on "Add shared cache for resolvers" 🚧 🏗️ [WIP] 🚧 👷 Re-work on "Add shared cache for resolvers" 🚧 🏗️ [WIP] Oct 17, 2025
@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 17, 2025
Consolidate cache implementation into
`pkg/remoteresolution/resolver/framework/cache` and refactor to follow
Tekton manual injection patterns:
- Merge pkg/remoteresolution/cache, pkg/remoteresolution/cache/injection
  and pkg/remoteresolution/resolver/cache into
  pkg/remoteresolution/resolver/framework/cache package
- Reorganize into setup.go (creation, injection), operations.go (usage)
  and cache.go (core api)
- Implement lazy singleton initialization with sync.Once
- Remove unused functions / methods
- Inline `useCache` method

Remove deprecated test helpers:
- Remove unused "test-verification-key" from clearResolveCache
- Remove duplicated "ignore*" test variables

Signed-off-by: Stanislav Jakuschevskij <[email protected]>
Signed-off-by: Stanislav Jakuschevskij <[email protected]>
Replace manual `ConfigMap` loading with automatic watching that reacts to
configuration changes without requiring deployment restart.

- Add `CacheConfigStore` for watching cache config changes
- Implement `ConfigWatcher` interface in `resolverCache` to get the
  cache `ConfigMap` name
- Encapsulate `TTL` in `resolverCache` struct
- Add thread-safe cache reinitialization with mutex on config changes
- Setup cache config watching per controller with `sync.Once` guard in
  `CacheConfigStore.WatchConfig` method
- Remove manual `ConfigMap` loading from `main.go`

`ConfigMap` keys: "max-size" (int) and "ttl" (duration string)
Cache is cleared when configuration changes.

Signed-off-by: Stanislav Jakuschevskij <[email protected]>

type cacheConfigKey struct{}

type cacheConfig struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twoGiants any reason not to use https://pkg.go.dev/knative.dev/pkg/configmap ? This is more or less what we use on other configmaps ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right! One of us will fix it. First come, first serve @waveywaves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. release-note-none Denotes a PR that doesnt merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

5 participants