
Option to filter tags by service or resource type in aws_tagging_resource table #2466

Open
thomasklemm wants to merge 10 commits into main from feat/resource-tags-filters

Conversation

thomasklemm
Contributor

@thomasklemm thomasklemm commented Apr 8, 2025

Adds the option to filter the aws_tagging_resource table by resource types, e.g. ec2:instance,s3:bucket,auditmanager, limiting the response to only Amazon EC2 instances, Amazon S3 buckets, or any AWS Audit Manager resource.

Integration test logs

Logs
Add passing integration test logs here

Example queries

-- Filter for all tagged EC2 & RDS resources, plus S3 buckets
select arn from aws_tagging_resource where resource_types = '["ec2", "rds", "s3:bucket"]';
-- => Returns only tags for expected resources

-- Filter for EC2 instances, RDS database instances, and S3 buckets
select arn from aws_tagging_resource where resource_types = '["ec2:instance", "rds:db", "s3:bucket"]';
-- => Returns only tags for expected resources

-- Filter gets ignored when empty
select arn from aws_tagging_resource where resource_types = '[]';
-- => Returns tags for all tagged resources

-- Filter for all resource types supported
select count(*) from aws_tagging_resource where resource_types = '["access-analyzer","acm","acm-pca","airflow","amplify","apigateway","app-integrations","appconfig","appflow","appmesh","apprunner","appstream","appsync","aps","athena","auditmanager","backup","batch","ce","cloud9","cloudformation","cloudfront","cloudtrail","cloudwatch","codeartifact","codebuild","codecommit","codeconnections","codedeploy","codeguru-profiler","codeguru-reviewer","codepipeline","codestar-connections","cognito-identity","cognito-idp","comprehend","connect","databrew","dataexchange","datapipeline","datasync","dax","detective","devicefarm","dms","ds","dynamodb","ec2","ecr","ecr-public","ecs","eks","elasticache","elasticbeanstalk","elasticfilesystem","elasticloadbalancing","elasticmapreduce","emr-containers","emr-serverless","es","events","evidently","finspace","firehose","fis","forecast","frauddetector","fsx","gamelift","geo","glacier","globalaccelerator","glue","grafana","greengrass","groundstation","guardduty","healthlake","iam","imagebuilder","inspector","iot","iotanalytics","iotdeviceadvisor","iotevents","iotfleetwise","iotsitewise","iottwinmaker","iotwireless","ivs","ivschat","kafka","kendra","kinesis","kinesisanalytics","kinesisvideo","kms","lambda","lex","logs","lookoutmetrics","lookoutvision","m2","managedblockchain","mediapackage","mediapackage-vod","mediatailor","memorydb","mobiletargeting","mq","network-firewall","networkmanager","oam","omics","outposts","panorama","personalize","pipes","proton","qldb","quicksight","ram","rds","redshift","refactor-spaces","rekognition","resiliencehub","resource-explorer-2","resource-groups","route53","route53-recovery-control","route53-recovery-readiness","route53resolver","rum","s3","sagemaker","scheduler","schemas","secretsmanager","servicecatalog","servicediscovery","ses","signer","sns","sqs","ssm","states","storagegateway","synthetics","transfer","wisdom","workspaces","chatbot","config","organizations","payments","securityhub"]'

@misraved misraved requested review from Copilot and ParthaI April 8, 2025 21:20
Contributor

@Copilot Copilot AI left a comment


Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

@thomasklemm thomasklemm force-pushed the feat/resource-tags-filters branch from c7120ff to 8e75ea2 on April 9, 2025 06:56
@thomasklemm thomasklemm requested a review from Copilot April 9, 2025 07:27
Contributor

@Copilot Copilot AI left a comment


Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

@ParthaI
Contributor

ParthaI commented Apr 11, 2025

Hi @thomasklemm, I was reviewing the code changes and had a few suggestions for improvement:

  • Would it make sense to use resource_types as the column name instead of resource_type_filter for better alignment with naming conventions?
  • Also, could we consider defining the column type as JSON instead of string?
  • This would allow us to use d.EqualsQuals["resource_types"].GetJsonbValue() and parse it directly as a slice of strings, rather than relying on comma-separated values.
  • For implementation reference, you might take a look at similar patterns used in the tables aws_cloudwatch_metric_data_point, aws_cloudwatch_metric, and aws_pricing_product.
  • Lastly, could you please include example queries in the table documentation demonstrating how to use the resource_types column as a filter?

Thanks!
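The JSON-qualifier approach suggested above could be sketched roughly like this (a self-contained illustration; the function name and error text are illustrative, not the PR's actual code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseResourceTypes turns the raw JSONB qualifier value (e.g. the string
// `["ec2:instance", "s3:bucket"]`) into a slice of resource type strings.
// An empty array yields an empty slice, which callers can treat as "no filter".
func parseResourceTypes(raw string) ([]string, error) {
	var types []string
	if err := json.Unmarshal([]byte(raw), &types); err != nil {
		return nil, fmt.Errorf("resource_types must be a JSON array of strings: %w", err)
	}
	return types, nil
}

func main() {
	types, err := parseResourceTypes(`["ec2:instance", "rds:db", "s3:bucket"]`)
	fmt.Println(types, err)
}
```

In the plugin itself, `raw` would come from `d.EqualsQuals["resource_types"].GetJsonbValue()` as ParthaI describes.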

@thomasklemm thomasklemm force-pushed the feat/resource-tags-filters branch from 8e75ea2 to 3c2a4eb on April 14, 2025 09:11
@ParthaI
Contributor

ParthaI commented Apr 17, 2025

Hi @thomasklemm, just checking in to see if you had a chance to review the suggestions and comments above.

@thomasklemm
Contributor Author

Hi @ParthaI, thanks for the detailed suggestions! I've made the changes locally, but still need to test them in our cluster; I'll update the PR after that.

@ParthaI
Contributor

ParthaI commented May 7, 2025

Hi @thomasklemm, just checking in to see if you’ve had a chance to test it out and push the changes based on your findings?

@thomasklemm thomasklemm force-pushed the feat/resource-tags-filters branch from 3c2a4eb to 9779b47 on May 8, 2025 09:52
@thomasklemm
Contributor Author

thomasklemm commented May 8, 2025

Hi @ParthaI, made the changes and adjusted the initial query examples to match the resource_types = '[...]' syntax. The changes are working locally when I run the example queries. Also added documentation for the feature, with example queries, in the docs.

Tried these queries and all looks correct:

select arn from aws_tagging_resource where resource_types = '["ec2", "rds", "s3:bucket"]' order by arn;
select count(arn) from aws_tagging_resource where resource_types = '["ec2", "rds", "s3:bucket"]';

select arn from aws_tagging_resource where resource_types = '["ec2:instance", "rds:db", "s3:bucket"]' order by arn;
select count(arn) from aws_tagging_resource where resource_types = '["ec2:instance", "rds:db", "s3:bucket"]';

select arn from aws_tagging_resource where resource_types = '[]' order by arn;
select arn from aws_tagging_resource order by arn;

select count(arn) from aws_tagging_resource where resource_types = '[]';
select count(arn) from aws_tagging_resource;

@thomasklemm thomasklemm requested a review from Copilot May 8, 2025 10:00
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a new filter option for the aws_tagging_resource table that allows filtering resources by specific AWS service or resource types using a JSON array of strings.

  • Added documentation in aws_tagging_resource.md to explain the new filtering option with examples.
  • Modified aws/table_aws_tagging_resource.go to support a new key column "resource_types" and to parse the JSON array of resource types from query qualifiers.
  • Enhanced error handling for invalid JSON input in resource_types.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
docs/tables/aws_tagging_resource.md Updated documentation to describe resource type filters.
aws/table_aws_tagging_resource.go Added key column support, JSON parsing, and error handling for resource_types filter.
Comments suppressed due to low confidence (1)

aws/table_aws_tagging_resource.go:127

  • [nitpick] Consider renaming 'resource_types' to 'rawResourceTypes' to better distinguish it from the parsed slice 'resourceTypes' and to align with Go naming conventions.
resource_types := d.EqualsQuals["resource_types"].GetJsonbValue()

@thomasklemm
Contributor Author

@ParthaI There are two length constraints mentioned in the API docs: 100 array items, and 256 characters in total for the string that gets sent to the API. Should we handle that here and raise an error for the user? If I see it correctly, AWS simply starts to ignore resource types past the character limit (but I need to verify this more thoroughly).


@ParthaI
Contributor

ParthaI commented May 8, 2025

Hello @thomasklemm, thank you for sharing the detailed information.

Approach 1: We can pass the resourceTypes input exactly as provided by the user in the query. The API will then handle it according to its default behavior. If the input exceeds the allowed limits and the API returns an error, we simply propagate that error back to the user—this helps them understand the limit and how the API behaves.

Approach 2: Alternatively, if the resourceTypes array exceeds the documented limits (more than 256 characters or more than 100 items), we can split it into smaller chunks and make multiple API calls accordingly. This ensures that no data is missed, even if the user requests a large number of resource types.

In my opinion, I’d prefer Approach 1, as it aligns with the default behavior of the AWS CLI.

Please let me know your thoughts. Thanks!

@thomasklemm
Contributor Author

thomasklemm commented May 8, 2025

@ParthaI I think the API in this case is not returning an error in either case (array > 100 entries, complete string > 256 characters); based on what I observed earlier, it will just silently drop the additional items. The string limit is actually quite easy to hit if you're querying for more than 20-25 services at the same time; with complete resource types, far fewer are actually usable. I think it might make sense to raise an error in the code to let the user adjust their query, or even better to do the chunking you describe in approach 2. Is there another place in the AWS plugin where this strategy is being used?

Another thing I just noticed: I think the caching isn't working when resource_types is provided right now; do you have an intuition why this might be happening? Based on query times, it's always fetching the data, never returning cached data.

@ParthaI
Contributor

ParthaI commented May 12, 2025

> @ParthaI I think the API in this case is not returning an error in either case (array > 100 entries, complete string > 256 characters); based on what I observed earlier, it will just silently drop the additional items. The string limit is actually quite easy to hit if you're querying for more than 20-25 services at the same time; with complete resource types, far fewer are actually usable. I think it might make sense to raise an error in the code to let the user adjust their query, or even better to do the chunking you describe in approach 2. Is there another place in the AWS plugin where this strategy is being used?

We have implemented a similar pattern (though not exactly the same) in the aws_codecommit_repository table, where we list resources using a batch process.

> Another thing I just noticed: I think the caching isn't working when resource_types is provided right now; do you have an intuition why this might be happening? Based on query times, it's always fetching the data, never returning cached data.

Since we are using CacheMatch: query_cache.CacheMatchExact for the resource_types key column, the cache will only be used if the query parameter exactly matches; otherwise, it will result in a cache miss. And I think this is expected.
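For reference, the key column definition implied by that comment might look like this (a sketch following steampipe-plugin-sdk conventions; the PR's actual definition may differ):

```go
// Fragment of the table definition (not a complete file): the
// resource_types qualifier is optional, and CacheMatchExact means cached
// results are reused only when the qualifier value matches exactly.
ListConfig: &plugin.ListConfig{
	Hydrate: listTaggingResources,
	KeyColumns: []*plugin.KeyColumn{
		{
			Name:       "resource_types",
			Require:    plugin.Optional,
			CacheMatch: query_cache.CacheMatchExact,
		},
	},
},
```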

@ParthaI
Contributor

ParthaI commented May 19, 2025

Hello @thomasklemm, did you get a chance to take a look at the above comment?

@thomasklemm thomasklemm force-pushed the feat/resource-tags-filters branch from 9779b47 to 9fdcdbf on May 24, 2025 20:12
…ource

The AWS Resource Groups Tagging API has a 256-character limit for the
ResourceTypeFilters parameter when resource types are comma-separated.
This change implements automatic batching to handle large lists of
resource types that would exceed this limit.

How the batching works:
- The splitResourceTypes function calculates the total length of resource
  types when joined with commas
- When adding a new resource type would exceed 256 characters, it starts
  a new batch
- Each batch is processed as a separate API call with its own pagination
- Results are deduplicated by ARN to prevent the same resource appearing
  multiple times when it matches multiple resource type filters

This follows a similar pattern to aws_codecommit_repository.go which
batches repository names for the BatchGetRepositories API that has a
25-item limit.

The implementation maintains full backward compatibility and transparency:
users can specify large lists of resource types without manual batching.
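The ARN-based deduplication described in this commit message can be sketched as follows (an illustrative, self-contained version; the PR's actual code streams rows rather than returning a slice):

```go
package main

import "fmt"

// dedupeByARN keeps the first occurrence of each ARN, so a resource that
// matches several resource type filter batches is only emitted once.
func dedupeByARN(arns []string) []string {
	seen := make(map[string]struct{}, len(arns))
	out := make([]string, 0, len(arns))
	for _, arn := range arns {
		if _, ok := seen[arn]; ok {
			continue
		}
		seen[arn] = struct{}{}
		out = append(out, arn)
	}
	return out
}

func main() {
	fmt.Println(dedupeByARN([]string{"arn:a", "arn:b", "arn:a"}))
	// → [arn:a arn:b]
}
```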
…acter limit

After testing, the actual limitation for ResourceTypeFilters in the AWS
Resource Groups Tagging API is 100 items per request, not 256 characters
when comma-separated as suggested in some documentation.

Changes:
- Renamed splitResourceTypes to batchResourceTypes for clarity
- Simplified batching logic to count items instead of character length
- Changed batch size from 256 characters to 100 items
- Removed complex string length calculations in favor of simple item counting

This makes the code much simpler and aligns with the actual API behavior
observed during testing.
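The item-count batching this commit describes can be sketched as follows (a self-contained approximation of the batchResourceTypes idea; the PR's actual function may differ):

```go
package main

import "fmt"

// batchResourceTypes splits the resource type filters into chunks of at
// most batchSize items, matching the API's 100-item limit per request.
func batchResourceTypes(types []string, batchSize int) [][]string {
	var batches [][]string
	for len(types) > batchSize {
		batches = append(batches, types[:batchSize])
		types = types[batchSize:]
	}
	if len(types) > 0 {
		batches = append(batches, types)
	}
	return batches
}

func main() {
	types := make([]string, 250)
	for i := range types {
		types[i] = fmt.Sprintf("svc-%d", i)
	}
	batches := batchResourceTypes(types, 100)
	fmt.Println(len(batches), len(batches[0]), len(batches[2]))
	// → 3 100 50
}
```

Each batch would then be passed as ResourceTypeFilters in its own GetResources call, with the results deduplicated by ARN.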
@thomasklemm thomasklemm requested a review from Copilot May 25, 2025 18:41
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces filtering functionality by resource types for the aws_tagging_resource table. Key changes include:

  • Updating documentation to describe the new JSON array filter for resource types.
  • Adding a "resource_types" column and KeyColumns configuration to support filtering.
  • Implementing JSON parsing, batching of resource type qualifiers, and deduplication of results based on ARN.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
docs/tables/aws_tagging_resource.md Added documentation for filtering resources by resource types.
aws/table_aws_tagging_resource.go Implemented parsing, batching, and deduplication for the new filter.
Comments suppressed due to low confidence (2)

aws/table_aws_tagging_resource.go:122

  • [nitpick] Consider renaming 'resource_types' to 'resourceTypesJSON' to follow Go's camelCase naming conventions and to clearly differentiate the qualifier value from other variables.
resource_types := d.EqualsQuals["resource_types"].GetJsonbValue()

aws/table_aws_tagging_resource.go:222

  • [nitpick] Consider renaming 'currentItems' to 'batchCount' to improve clarity and align with idiomatic Go naming practices.
currentItems := 0

@thomasklemm thomasklemm requested a review from Copilot May 25, 2025 19:50
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

Adds support for filtering the aws_tagging_resource table by AWS service or specific resource types.

  • Introduce a new resource_types qualifier, parsing JSON arrays of strings.
  • Implement batching, pagination, and deduplication in listTaggingResources.
  • Extend documentation with examples and a new column.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
docs/tables/aws_tagging_resource.md Added a "Filter Resources by Resource Types" section with usage examples and a table of common types.
aws/table_aws_tagging_resource.go Implemented parseResourceTypesFilter, batching (createResourceTypeBatches), and processing logic.
Comments suppressed due to low confidence (2)

aws/table_aws_tagging_resource.go:155

  • [nitpick] Expand the example formats in the error message to include both service-only and service:resourceType patterns (e.g., ["ec2", "ec2:instance"]) for clearer guidance.
errors.New("failed to parse 'resource_types' qualifier: value must be a JSON array of strings, e.g. [\"ec2:instance\", \"s3:bucket\", \"rds\"]")

aws/table_aws_tagging_resource.go:121

  • Add unit tests for parseResourceTypesFilter, createResourceTypeBatches, and processResourceBatch to verify behavior with empty, multiple, and invalid resource_types inputs.
resourceTypes, err := parseResourceTypesFilter(d)

@thomasklemm
Contributor Author

@ParthaI I have now confirmed that the 256-character limit mentioned in the API docs doesn't exist in reality; not sure why it made it into the docs. However, the 100-item limit does exist, so I adjusted the implementation to do automatic batching if more than 100 services/resource types are provided to the resource types filter. Locally it's working very well, but I'd like to confirm it in our production environment too, with access to larger AWS organizations, to see if there are any issues.
Will report back and then craft nicer commits :)

Without the batching, this error would get returned: operation error Resource Groups Tagging API: GetResources, https response error StatusCode: 400, RequestID: ca55ac0c-6b4a-49f7-98b9-4334a0f8f8b2, InvalidParameterException: ResourceTypeFilters provided are more than allowed limit 100

@ParthaI
Contributor

ParthaI commented May 28, 2025

Thanks, @thomasklemm, for diving deeper into this. The implementation looks good. Please let me know once you've pushed the changes to this PR, and I'll do a final review.

Thanks again for all your efforts!
