aws-data-analytics

Overview

This plugin brings AWS data engineering expertise into your coding assistant, covering the data lifecycle across AWS Analytics services. It currently provides skills for the following capability areas:

Data Lake Operations — Build and operate a data lake on AWS: create managed Iceberg tables on Amazon S3 Tables, ingest data from diverse sources (S3, JDBC databases, Snowflake, BigQuery, DynamoDB, AWS Glue catalog tables), and query across default and federated catalogs with Amazon Athena.
Data Discovery — Inventory and audit your AWS Glue Data Catalog across S3 Tables, Amazon Redshift-federated, and remote Iceberg catalogs. Resolve data asset references by name, keyword, column, or reverse-lookup from S3 location metadata in the catalog.
Vector Storage — Store and query vector embeddings using Amazon S3 Vectors for cost-effective semantic search and RAG workloads.
External Connectivity — Create and troubleshoot AWS Glue connections to JDBC databases (Oracle, SQL Server, PostgreSQL, MySQL, RDS, Aurora), Amazon Redshift, Snowflake, and BigQuery.

Agent Skills

#	Skill	Description	Documentation
1	`creating-data-lake-table`	Create managed Iceberg tables using Amazon S3 Tables with automatic compaction, AWS Glue catalog registration, and partitioning	SKILL.md
2	`ingesting-into-data-lake`	Import data from S3 files, JDBC databases, Snowflake, BigQuery, DynamoDB, or existing AWS Glue catalog tables into S3 Tables or standard Iceberg	SKILL.md
3	`querying-data-lake`	Execute and manage Athena SQL queries across default and federated catalogs (AWS Glue, S3 Tables, Amazon Redshift)	SKILL.md
4	`finding-data-lake-assets`	Resolve data lake asset references across AWS Glue Data Catalog, S3, S3 Tables, and Amazon Redshift by name, keyword, column, or S3 path	SKILL.md
5	`exploring-data-catalog`	Full inventory and audit of AWS Glue Data Catalog assets across S3 Tables, Amazon Redshift-federated, and remote Iceberg catalogs	SKILL.md
6	`storing-and-querying-vectors`	Store and query vector embeddings using Amazon S3 Vectors for semantic search and RAG workloads	SKILL.md
7	`connecting-to-data-source`	Create and troubleshoot AWS Glue connections to JDBC databases, Amazon Redshift, Snowflake, and BigQuery	SKILL.md

MCP Servers

#	Server	Description
1	`aws-mcp`	AWS API access, documentation search, and SOP retrieval via AWS MCP Server

Installation

See Quick Start.

Data Lake Operations

The data lake skills cover the common tasks involved in building and operating a data lake on AWS. They package AWS best practices as agent-readable instructions, guiding you from table creation through ingestion and querying.

How It Works

Create tables — The creating-data-lake-table skill sets up managed Iceberg tables on Amazon S3 Tables with automatic compaction, snapshot management, AWS Glue catalog registration, partitioning, and IAM access control.
Ingest data — The ingesting-into-data-lake skill moves data from local files, S3, JDBC databases (Oracle, SQL Server, PostgreSQL, MySQL, RDS, Aurora, Amazon Redshift), Snowflake, BigQuery, DynamoDB, or existing AWS Glue catalog tables into your data lake. Supports one-time loads, recurring pipelines, and migrations.
Query data — The querying-data-lake skill executes Athena SQL queries across default and federated catalogs, with workgroup selection, statement classification, cost tracking, and error recovery.
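To make the query step more concrete, here is a minimal sketch of what "statement classification" and "workgroup selection" could look like. The keyword buckets and both function names are assumptions made for illustration; this README does not show the skill's actual rules. The dictionary keys match the parameters of the boto3 Athena `start_query_execution` call, and `primary` and `AwsDataCatalog` are Athena's default workgroup and catalog names.

```python
import re

# Hypothetical keyword buckets; the skill's real classification rules
# are not documented in this README.
READ_KEYWORDS = {"SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN"}
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "MERGE"}
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP"}


def classify_statement(sql: str) -> str:
    """Return 'read', 'write', 'ddl', or 'unknown' from the first keyword."""
    # Skip leading whitespace and any leading "--" line comments.
    stripped = re.sub(r"^\s*(--[^\n]*\n\s*)*", "", sql)
    match = re.match(r"[A-Za-z]+", stripped)
    if not match:
        return "unknown"
    keyword = match.group(0).upper()
    if keyword in READ_KEYWORDS:
        return "read"
    if keyword in WRITE_KEYWORDS:
        return "write"
    if keyword in DDL_KEYWORDS:
        return "ddl"
    return "unknown"


def build_query_request(sql: str, database: str,
                        workgroup: str = "primary",
                        catalog: str = "AwsDataCatalog") -> dict:
    """Build keyword arguments for athena.start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Catalog": catalog, "Database": database},
        "WorkGroup": workgroup,
    }
```

With boto3 installed and credentials configured, the request could be submitted as `boto3.client("athena").start_query_execution(**build_query_request(sql, "sales"))`. Classifying before submitting lets an agent ask for confirmation on anything that is not a read.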

Examples

"Create an Iceberg table for our order events with daily partitioning"
"Import our PostgreSQL sales data into the data lake"
"Query the top 10 customers by revenue from our analytics table"
"Migrate our existing Hive tables to Iceberg on S3 Tables"
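As a rough illustration of where the first prompt might lead, the sketch below renders Athena-style Iceberg DDL with a daily partition transform. The helper name, table name, and columns are hypothetical, and the skill may create tables through a different path entirely. The sketch omits a `LOCATION` clause on the assumption that S3 Tables manages storage; a self-managed Iceberg table in Athena requires one.

```python
def build_iceberg_ddl(table: str, columns: list, partition_column: str) -> str:
    """Render CREATE TABLE DDL for an Iceberg table partitioned by day.

    `columns` is a list of (name, sql_type) pairs. All names here are
    illustrative and not taken from the plugin.
    """
    names = [name for name, _ in columns]
    if partition_column not in names:
        raise ValueError(f"unknown partition column: {partition_column}")
    # Reject anything that is not a plain identifier to keep the DDL safe.
    for identifier in [table, *names]:
        if not identifier.isidentifier():
            raise ValueError(f"invalid identifier: {identifier}")
    column_lines = ",\n".join(f"  {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n{column_lines}\n)\n"
        f"PARTITIONED BY (day({partition_column}))\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


if __name__ == "__main__":
    print(build_iceberg_ddl(
        "order_events",
        [("order_id", "string"), ("event_ts", "timestamp")],
        "event_ts",
    ))
```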

Data Discovery

The discovery skills help you understand what data exists in your AWS account and find specific assets quickly.

exploring-data-catalog — Full inventory and audit across AWS Glue Data Catalog, S3 Tables, Amazon Redshift-federated, and remote Iceberg catalogs. Maps your data landscape, flags stale tables, and suggests improvements.
finding-data-lake-assets — Resolves fuzzy data references ("our orders table", "the sales dataset") to concrete catalog entries using layered search across AWS Glue, S3, S3 Tables, and Amazon Redshift.
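The "layered search" idea can be sketched in a few lines: try the strictest match first and fall back to looser ones only when nothing is found. The `TableEntry` type, the layer order, and the in-memory list are assumptions made for illustration; the actual skill searches live catalog metadata, and its layers are not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """Hypothetical, simplified stand-in for a catalog table record."""
    database: str
    name: str
    columns: list = field(default_factory=list)
    description: str = ""


def find_assets(query: str, entries: list) -> list:
    """Return matches from the first layer that produces any hits."""
    q = query.strip().lower()
    layers = [
        # Layer 1: exact table name.
        lambda e: e.name.lower() == q,
        # Layer 2: keyword contained in the name or description.
        lambda e: q in e.name.lower() or q in e.description.lower(),
        # Layer 3: exact column name.
        lambda e: q in (c.lower() for c in e.columns),
    ]
    for layer in layers:
        hits = [e for e in entries if layer(e)]
        if hits:
            return hits
    return []
```

Stopping at the first layer with results keeps an exact name match from being drowned out by weaker keyword or column matches.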

Examples

"What data do we have in our account?"
"Inventory all catalogs and databases"
"Find the table that has customer_id"
"Where is our quarterly revenue data?"

Vector Storage

The storing-and-querying-vectors skill provides cost-effective vector embedding storage and retrieval using Amazon S3 Vectors, optimized for long-term storage with subsecond query latency.
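For readers new to vector search, the following local sketch shows what a similarity query computes conceptually: score every stored vector against the query and keep the best `k`. It is plain Python for illustration only and does not call Amazon S3 Vectors; the function names and the choice of cosine similarity are assumptions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat zero vectors as matching nothing.
    return dot / (norm_a * norm_b)


def top_k(query: list, vectors: dict, k: int = 3) -> list:
    """Return the k (key, score) pairs most similar to the query."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A managed vector index performs the equivalent lookup on the service side, so the client only sends the query vector and receives the nearest keys.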

Examples

"Create a vector index for our product embeddings"
"Store these document embeddings for RAG"
"Find the most similar items to this query vector"

External Connectivity

The connecting-to-data-source skill creates and troubleshoots AWS Glue connections to external databases. It discovers existing connections and candidate sources in your account, registers credentials securely via Secrets Manager or IAM DB auth, configures VPC networking, and tests end-to-end connectivity.
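The pieces named above (a JDBC URL, a Secrets Manager secret, and VPC networking) come together in the `ConnectionInput` structure accepted by the AWS Glue `create_connection` API. The sketch below only assembles that structure; the helper names, the two supported engines, and every argument value are illustrative assumptions rather than the skill's implementation.

```python
# Illustrative URL templates for two engines; other engines use other formats.
JDBC_TEMPLATES = {
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "mysql": "jdbc:mysql://{host}:{port}/{database}",
}


def build_jdbc_url(engine: str, host: str, port: int, database: str) -> str:
    """Format a JDBC URL for one of the engines in JDBC_TEMPLATES."""
    try:
        template = JDBC_TEMPLATES[engine]
    except KeyError:
        raise ValueError(f"unsupported engine in this sketch: {engine}") from None
    return template.format(host=host, port=port, database=database)


def build_connection_input(name: str, jdbc_url: str, secret_id: str,
                           subnet_id: str, security_group_ids: list,
                           availability_zone: str) -> dict:
    """Assemble the ConnectionInput dict for glue.create_connection."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,
            # Reference a Secrets Manager secret instead of inline credentials.
            "SECRET_ID": secret_id,
        },
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }
```

With boto3 available, the result could be passed as `boto3.client("glue").create_connection(ConnectionInput=...)`. Keeping credentials in a secret means the connection definition itself contains no password.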

Examples

"Connect to our Oracle production database"
"Set up an AWS Glue connection to Snowflake"
"Test my existing BigQuery connection"
"Troubleshoot the connection timeout on my RDS connection"

Supported Environments

Using the plugin in your local environment

In your local environment, configure AWS credentials and set your target region to get started.

Prerequisites

An AWS account with access to AWS Analytics services (AWS Glue, Athena, S3 Tables, S3 Vectors)
Local AWS credentials and config
uv (for MCP server)

Authentication and Authorization

Configure AWS credentials using one of the following methods:

AWS CLI — Run aws configure (IAM credentials) or aws sso login (IAM Identity Center)
Environment variables — Set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN. See Configuring environment variables for details.

Your IAM role needs permissions for the AWS services used by the skills you install. The relevant IAM action namespaces are:

athena - Query execution and workgroup management
glue - Data Catalog operations and ETL jobs
s3 - Object storage operations
s3tables - Managed Iceberg table operations (separate from s3)
s3vectors - Vector storage operations (separate from s3)

Scope permissions to the resources your workload uses.
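As one example of scoping, the sketch below builds an IAM policy document limited to a single Athena workgroup and a single AWS Glue database. The function name and the chosen action list are assumptions for illustration; a real policy for these skills would also need `s3`, `s3tables`, or `s3vectors` statements, depending on which skills you use.

```python
import json


def build_query_policy(region: str, account_id: str,
                       workgroup: str, database: str) -> dict:
    """Build a policy scoped to one Athena workgroup and one Glue database."""
    glue_prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}",
            },
            {
                "Sid": "ReadCatalogMetadata",
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/{database}",
                    f"{glue_prefix}:table/{database}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and names, for demonstration only.
    print(json.dumps(build_query_policy("us-east-1", "111122223333", "analytics", "sales"), indent=2))
```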

Configuration

Set AWS_DEFAULT_REGION to your preferred AWS region (e.g., us-east-1). See Configuring environment variables for details.
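Before starting a session, a small preflight check can report which of the environment variables named in this section are set. The function name and the shape of the returned dictionary are assumptions for illustration. Missing variables are reported rather than treated as errors, because credentials may instead come from `aws configure` or `aws sso login`.

```python
import os
from typing import Mapping, Optional


def describe_aws_environment(env: Optional[Mapping[str, str]] = None) -> dict:
    """Summarize AWS-related environment variables without printing secrets."""
    env = os.environ if env is None else env
    has_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
    return {
        # None means the region must come from elsewhere, such as the AWS config file.
        "region": env.get("AWS_DEFAULT_REGION") or None,
        "has_env_credentials": has_keys,
        "has_session_token": bool(env.get("AWS_SESSION_TOKEN")),
    }


if __name__ == "__main__":
    print(describe_aws_environment())
```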

Customizing Skills for Your Organization

The skills in this plugin follow AWS best practices, but they are fully customizable. You can fork the repository and modify any SKILL.md to reflect your organization's standards, naming conventions, approved data formats, or internal tooling. Workspace-level skills take precedence over global skills, so teams can maintain their own versions without affecting other users.
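The precedence rule can be pictured as a lookup that checks the workspace location first and falls back to the global one. The directory layout (`<root>/<skill-name>/SKILL.md`) and the function name are assumptions for illustration; each coding assistant defines its own actual skill locations.

```python
from pathlib import Path
from typing import Optional


def resolve_skill(name: str, workspace_root: str, global_root: str) -> Optional[Path]:
    """Return the SKILL.md to use, preferring the workspace copy."""
    for root in (workspace_root, global_root):
        candidate = Path(root) / name / "SKILL.md"
        if candidate.is_file():
            return candidate
    return None  # The skill is not installed in either location.
```

Because the workspace copy wins, a team can override one skill such as `querying-data-lake` while continuing to inherit the others unchanged.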

Related Resources

License

This project is licensed under the Apache 2.0 License.
