Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions deploy/charts/cyberark-disco-agent/templates/configmap.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ data:
- type!=kubernetes.io/dockerconfigjson
- type!=bootstrap.kubernetes.io/token
- type!=helm.sh/release.v1
filters:
- ExcludeTLSSecretsWithoutClientCert
- kind: k8s-dynamic
name: ark/serviceaccounts
config:
Expand Down
23 changes: 23 additions & 0 deletions docs/datagatherers/k8s-dynamic.md
Original file line number Diff line number Diff line change
Expand Up @@ -106,3 +106,26 @@ when listing Secrets.
- type!=bootstrap.kubernetes.io/token
- type!=helm.sh/release.v1
```

## Filters

You can use filters to drop certain resources based on custom logic.
For example, you can drop TLS secrets that do not contain any client certificates using the `ExcludeTLSSecretsWithoutClientCert` filter, as shown below:

```yaml
- kind: "k8s-dynamic"
name: "k8s/secrets"
config:
resource-type:
version: v1
resource: secrets
filters:
- ExcludeTLSSecretsWithoutClientCert
```

The available filters are:
* `ExcludeTLSSecretsWithoutClientCert`: Drops TLS secrets that do not contain any client certificates.

If you find that the filters are not having the desired effect, you can enable
debug logging by setting the log-level to `debug`. This will log info about why
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor doc fix: the project uses numeric --log-level (0=Info, 1=Debug, 2=Trace; 6–9 HTTP trace). Suggest rewording to something like: "set --log-level=1 (Debug)" so users don’t try --log-level=debug.

certain resources are not being gathered.
2 changes: 2 additions & 0 deletions examples/machinehub.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ data-gatherers:
- type!=kubernetes.io/dockerconfigjson
- type!=bootstrap.kubernetes.io/token
- type!=helm.sh/release.v1
filters:
- ExcludeTLSSecretsWithoutClientCert

# Gather Kubernetes service accounts
- name: ark/serviceaccounts
Expand Down
139 changes: 138 additions & 1 deletion pkg/datagatherer/k8s/cache.go
Original file line number Diff line number Diff line change
@@ -1,11 +1,16 @@
package k8s

import (
"crypto/x509"
"encoding/base64"
"encoding/pem"
"fmt"
"time"

"github.com/go-logr/logr"
"github.com/pmylund/go-cache"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/types"

"github.com/jetstack/preflight/api"
Expand Down Expand Up @@ -39,9 +44,141 @@ func logCacheUpdateFailure(log logr.Logger, obj interface{}, operation string) {
log.Error(err, "Cache update failure", "operation", operation)
}

// cacheFilterFunction is a function that can be used to filter out objects
// that should not be added to the cache. If the function returns true, the
// object is filtered out.
type cacheFilterFunction func(logr.Logger, interface{}) bool

// excludeTLSSecretsWithoutClientCert filters out all TLS secrets that do not
// contain a client certificate in the `tls.crt` key.
// Secrets are obtained by a DynamicClient, so they have type
// *unstructured.Unstructured.
func excludeTLSSecretsWithoutClientCert(log logr.Logger, obj interface{}) bool {
// Fast path: type assertion and kind/type checks
unstructuredObj, ok := obj.(*unstructured.Unstructured)
if !ok {
log.V(4).Info("Object is not a Unstructured", "type", fmt.Sprintf("%T", obj))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be at level 1 or 2? The Helm chart says:

# The log levels are: 0=Info, 1=Debug, 2=Trace.

return false
}
if unstructuredObj.GetKind() != "Secret" || unstructuredObj.GetAPIVersion() != "v1" {
return false
}

log = log.WithValues("namespace", unstructuredObj.GetNamespace(), "name", unstructuredObj.GetName())

secretType, found, err := unstructured.NestedString(unstructuredObj.Object, "type")
if err != nil || !found || secretType != string(corev1.SecretTypeTLS) {
log.V(4).Info("Object is not a TLS Secret", "type", secretType)
return false
}

// Directly extract tls.crt from unstructured data (avoid conversion if possible)
dataMap, found, err := unstructured.NestedMap(unstructuredObj.Object, "data")
if err != nil || !found {
log.V(4).Info("Secret data missing or not a map")
return true
}
tlsCrtRaw, found := dataMap[corev1.TLSCertKey]
if !found {
log.V(4).Info("TLS Secret does not contain tls.crt key")
return true
}

// Decode base64 if necessary (K8s secrets store data as base64-encoded strings)
var tlsCrtBytes []byte
switch v := tlsCrtRaw.(type) {
case string:
decoded, err := base64.StdEncoding.DecodeString(v)
if err != nil {
log.V(4).Info("Failed to decode tls.crt base64", "error", err.Error())
return true
}
tlsCrtBytes = decoded
case []byte:
tlsCrtBytes = v
default:
log.V(4).Info("tls.crt is not a string or byte slice", "type", fmt.Sprintf("%T", v))
return true
}

// Parse PEM certificate chain
certs, err := parsePEMCertificateChain(tlsCrtBytes)
if err != nil || len(certs) == 0 {
log.V(4).Info("Failed to parse tls.crt as PEM encoded X.509 certificate chain", "error", err.Error())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible nil dereference here: when parsePEMCertificateChain returns (nil, nil) (or len(certs) == 0), err is nil and err.Error() will panic

return true
}

// Check if the leaf certificate is a client certificate
if isClientCertificate(certs[0]) {
log.V(4).Info("TLS Secret contains a client certificate")
return false
}

log.V(4).Info("TLS Secret does not contain a client certificate")
return true
}

// isClientCertificate checks if the given certificate is a client certificate
// by checking if it has the ClientAuth EKU.
func isClientCertificate(cert *x509.Certificate) bool {
if cert == nil {
return false
}
// Check if the certificate has the ClientAuth EKU
for _, eku := range cert.ExtKeyUsage {
if eku == x509.ExtKeyUsageClientAuth {
return true
}
}
return false
}

// parsePEMCertificateChain parses a PEM encoded certificate chain and returns
// a slice of x509.Certificate pointers. It returns an error if the data cannot
// be parsed as a certificate chain.
// The supplied data can contain multiple PEM blocks, the function will parse
// all of them and return a slice of certificates.
func parsePEMCertificateChain(data []byte) ([]*x509.Certificate, error) {
// Parse the PEM encoded certificate chain
var certs []*x509.Certificate
var block *pem.Block
rest := data
for {
block, rest = pem.Decode(rest)
if block == nil {
break
}
if block.Type != "CERTIFICATE" || len(block.Bytes) == 0 {
continue
}
cert, err := x509.ParseCertificate(block.Bytes)
if err != nil {
return nil, fmt.Errorf("failed to parse certificate: %w", err)
}
certs = append(certs, cert)
}
if len(certs) == 0 {
return nil, fmt.Errorf("no certificates found")
}
return certs, nil
}

// onAdd handles the informer creation events, adding the created runtime.Object
// to the data gatherer's cache. The cache key is the uid of the object
func onAdd(log logr.Logger, obj interface{}, dgCache *cache.Cache) {
// The object is wrapped in a GatheredResource struct.
// If the object is already present in the cache, it gets replaced.
// The cache key is the uid of the object
// The supplied filter functions can be used to filter out objects that
// should not be added to the cache.
// If multiple filter functions are supplied, the object is filtered out
// if any of the filter functions returns true.
func onAdd(log logr.Logger, obj interface{}, dgCache *cache.Cache, filters ...cacheFilterFunction) {
for _, filter := range filters {
if filter != nil && filter(log, obj) {
return
}
}

item, ok := obj.(cacheResource)
if ok {
cacheObject := &api.GatheredResource{
Expand Down
29 changes: 28 additions & 1 deletion pkg/datagatherer/k8s/dynamic.go
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,11 @@ type ConfigDynamic struct {
IncludeNamespaces []string `yaml:"include-namespaces"`
// FieldSelectors is a list of field selectors to use when listing this resource
FieldSelectors []string `yaml:"field-selectors"`
// Filters is a list of filter functions to apply to the resources before adding them to the cache.
// Each filter function should return true if the resource should be excluded, false otherwise.
// Available filter functions:
// - ExcludeTLSSecretsWithoutClientCert: ignores all TLS secrets that do not contain client certificates
Filters []cacheFilterFunction `yaml:"filters"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config shape nit: Filters on ConfigDynamic stores function pointers. That works after unmarshalling, but it makes the config awkward to marshal/log and couples config with runtime state. Consider storing []string on the config (the names), and mapping them to functions during construction (e.g., in NewDataGatherer).

}

// UnmarshalYAML unmarshals the ConfigDynamic resolving GroupVersionResource.
Expand All @@ -57,6 +62,7 @@ func (c *ConfigDynamic) UnmarshalYAML(unmarshal func(interface{}) error) error {
ExcludeNamespaces []string `yaml:"exclude-namespaces"`
IncludeNamespaces []string `yaml:"include-namespaces"`
FieldSelectors []string `yaml:"field-selectors"`
Filters []string `yaml:"filters"`
}{}
err := unmarshal(&aux)
if err != nil {
Expand All @@ -71,6 +77,15 @@ func (c *ConfigDynamic) UnmarshalYAML(unmarshal func(interface{}) error) error {
c.IncludeNamespaces = aux.IncludeNamespaces
c.FieldSelectors = aux.FieldSelectors

for _, filterName := range aux.Filters {
switch filterName {
case "ExcludeTLSSecretsWithoutClientCert":
c.Filters = append(c.Filters, excludeTLSSecretsWithoutClientCert)
default:
return fmt.Errorf("filters contains an unknown filter function: %s. Must be one of: ExcludeTLSSecretsWithoutClientCert", filterName)
}
}

return nil
}

Expand Down Expand Up @@ -107,6 +122,9 @@ type sharedInformerFunc func(informers.SharedInformerFactory) k8scache.SharedInd

// kubernetesNativeResources map of the native kubernetes resources, linking each resource to a sharedInformerFunc for that resource.
// secrets are still treated as unstructured rather than corev1.Secret, for a faster unmarshaling
//
// TODO(wallrj): What does "faster unmarshaling" mean in this context? If
// unstructured is faster then why not use it for all resources?
var kubernetesNativeResources = map[schema.GroupVersionResource]sharedInformerFunc{
corev1.SchemeGroupVersion.WithResource("pods"): func(sharedFactory informers.SharedInformerFactory) k8scache.SharedIndexInformer {
return sharedFactory.Core().V1().Pods().Informer()
Expand Down Expand Up @@ -144,6 +162,15 @@ var kubernetesNativeResources = map[schema.GroupVersionResource]sharedInformerFu
}

// NewDataGatherer constructs a new instance of the generic K8s data-gatherer for the provided
// configuration.
//
// If the GroupVersionResource is a native Kubernetes resource, the data
// gatherer will use a typed clientset and SharedInformerFactory, otherwise it
// will use a dynamic client and dynamic informer factory, for CRDs like those
// of cert-manager.
//
// Secret is a special case, it is a native resource but it will be treated as unstructured
// rather than corev1.Secret, for "faster unmarshaling".
func (c *ConfigDynamic) NewDataGatherer(ctx context.Context) (datagatherer.DataGatherer, error) {
if isNativeResource(c.GroupVersionResource) {
clientset, err := NewClientSet(c.KubeConfigPath)
Expand Down Expand Up @@ -218,7 +245,7 @@ func (c *ConfigDynamic) newDataGathererWithClient(ctx context.Context, cl dynami

registration, err := newDataGatherer.informer.AddEventHandlerWithOptions(k8scache.ResourceEventHandlerFuncs{
AddFunc: func(obj interface{}) {
onAdd(log, obj, dgCache)
onAdd(log, obj, dgCache, c.Filters...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filters are applied on add events but not on updates. If a TLS Secret’s EKU changes (e.g., client → server-only), the cache will retain an entry that should now be excluded.

Proposal:

  • Pass filters to onUpdate as well. If any filter excludes newObj, remove it from the cache; if it was previously excluded but now passes, add/replace it.
  • Add tests for both transitions (client→non-client and non-client→client) to ensure cache consistency.

},
UpdateFunc: func(oldObj, newObj interface{}) {
onUpdate(log, oldObj, newObj, dgCache)
Copy link
Member

@maelvls maelvls Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the filter also apply the the update func? I'm not sure, since I don't know how these AddFunc, UpdateFunc, DeleteFunc relate to the underlying client-go informer.
If it's not required, would be nice to have a comment explaining why the filtering only applies to AddFunc.

Also a test for that?

Expand Down
Loading