aws · arnewouters · May 6, 2026
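The hunks below rename each skill to its gerund form and update every cross-reference to match. As a rough aid for reviewing a rename like this, the following sketch scans text for leftover old names. The mapping is copied from the renames visible in this diff; the helper name `find_stale_names` and the idea of running it over the skill files are assumptions for illustration, not part of the change itself.

```python
import re

# Old -> new skill names, taken from the renames visible in this diff.
# Both the old `name:` value (creating-data-lake-tables) and the old
# cross-reference spelling (create-data-lake-table) map to the same new name.
RENAMES = {
    "connect-to-data-source": "connecting-to-data-source",
    "ingest-into-data-lake": "ingesting-into-data-lake",
    "create-data-lake-table": "creating-data-lake-table",
    "creating-data-lake-tables": "creating-data-lake-table",
    "query-data-lake": "querying-data-lake",
    "find-data-lake-assets": "finding-data-lake-assets",
    "store-and-query-vectors": "storing-and-querying-vectors",
}

# Match an old name only when it is not part of a longer hyphenated token,
# so the new name "creating-data-lake-table" is never reported as stale.
_STALE = re.compile(
    r"(?<![\w-])(?:"
    + "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r")(?![\w-])"
)


def find_stale_names(text):
    """Return (line_number, old_name, new_name) for each leftover old name.

    Hypothetical helper: line numbers are 1-based within `text`.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in _STALE.finditer(line):
            old = match.group(0)
            hits.append((line_no, old, RENAMES[old]))
    return hits
```

Running it over the post-change contents of each skill file should, under these assumptions, return an empty list. Any hit points at a reference the rename missed.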
@@ -1,14 +1,15 @@
 ---
-name: connect-to-data-source
-description: 'Create and troubleshoot AWS Glue connections to JDBC databases (Oracle,
-  SQL Server, PostgreSQL, MySQL, RDS), Redshift, Snowflake, and BigQuery. Gathers
-  connection hints from user, discovers existing connections and RDS/Redshift candidates,
-  registers credentials in Secrets Manager or IAM DB auth, configures VPC, and tests.
-  Triggers on: connect to database, set up Glue connection, register data source,
-  connect to Snowflake/BigQuery/RDS, connection timeout, test connection, troubleshoot
-  connection. Do NOT use for moving data (use ingest-into-data-lake), creating tables
-  (use create-data-lake-table), queries (use query-data-lake), catalog exploration
-  (use exploring-data-catalog), or SaaS (Salesforce, ServiceNow, SAP, MongoDB, Kafka).'
+name: connecting-to-data-source
+description: >-
+  Create and troubleshoot AWS Glue connections to JDBC databases (Oracle, SQL Server,
+  PostgreSQL, MySQL, RDS), Redshift, Snowflake, and BigQuery. Gathers connection hints
+  from user, discovers existing connections and RDS/Redshift candidates, registers
+  credentials in Secrets Manager or IAM DB auth, configures VPC, and tests. Triggers
+  on: connect to database, set up Glue connection, register data source, connect to
+  Snowflake/BigQuery/RDS, connection timeout, test connection, troubleshoot connection.
+  Do NOT use for moving data (use ingesting-into-data-lake), creating tables (use
+  creating-data-lake-table), queries (use querying-data-lake), catalog exploration
+  (use exploring-data-catalog), or SaaS (Salesforce, ServiceNow, SAP, MongoDB, Kafka).
 version: 1
 metadata:
   service: [glue, secretsmanager, rds, redshift]
@@ -20,7 +21,7 @@ argument-hint: '[source-type|connection-name|hostname]'
 
 # Connect to Data Source
 
-Register an external data source with AWS Glue so downstream skills (ingest-into-data-lake) can move data from it. A Glue connection stores the network config, driver, and credential reference for one source. Create once per source, reuse across jobs.
+Register an external data source with AWS Glue so downstream skills (ingesting-into-data-lake) can move data from it. A Glue connection stores the network config, driver, and credential reference for one source. Create once per source, reuse across jobs.
 
 ## Philosophy
 
@@ -48,7 +49,7 @@ Ask the user which source type they want to connect to, or infer from hints:
 | "Snowflake" | Snowflake | `SNOWFLAKE` | [snowflake-setup.md](references/snowflake-setup.md) |
 | "BigQuery", "Google analytics warehouse" | BigQuery | `BIGQUERY` | [bigquery-setup.md](references/bigquery-setup.md) |
 
-If the user names DynamoDB or a local file, stop and tell them: DynamoDB is read directly by Glue without a connection, and local files belong in the ingest-into-data-lake skill's local-upload workflow.
+If the user names DynamoDB or a local file, stop and tell them: DynamoDB is read directly by Glue without a connection, and local files belong in the ingesting-into-data-lake skill's local-upload workflow.
 
 ### 3. Gather Connection Hints from the User
 
@@ -126,7 +127,7 @@ After TestConnection passes, verify the connection works with the user's intende
 
 Phase B catches issues that TestConnection misses: driver compatibility at job runtime, catalog configuration, Spark-level serialization, and engine-specific auth flows (e.g., Snowflake SNOWFLAKE type works in ETL but not via JDBC crawlers).
 
-On success in both phases, tell user the connection name is ready for `ingest-into-data-lake`. On failure in either phase, Step 8.
+On success in both phases, tell user the connection name is ready for `ingesting-into-data-lake`. On failure in either phase, Step 8.
 
 ### 8. Troubleshoot (only if test failed)
 

@@ -1,21 +1,21 @@
 ---
-name: creating-data-lake-tables
-description: >
-  Create managed Iceberg tables using Amazon S3 Tables (s3tables API namespace)
-  with automatic compaction and snapshot management. Sets up table bucket,
-  namespace, table, schema, Glue catalog registration, partitioning, IAM access control.
-  Triggers on: create table, data lake table, analytics table, structured data storage,
-  S3 Tables, Iceberg, Athena table, partitioning strategy, access permissions. Do NOT use
-  for: importing files (use ingest-into-data-lake), vector storage (use store-and-query-vectors),
-  querying existing tables (use query-data-lake), or locating existing table
-  (use find-data-lake-assets).
-argument-hint: "[table-description|schema-spec]"
+name: creating-data-lake-table
+description: >-
+  Create managed Iceberg tables using Amazon S3 Tables (s3tables API namespace) with
+  automatic compaction and snapshot management. Sets up table bucket, namespace, table,
+  schema, Glue catalog registration, partitioning, IAM access control. Triggers on:
+  create table, data lake table, analytics table, structured data storage, S3 Tables,
+  Iceberg, Athena table, partitioning strategy, access permissions. Do NOT use for:
+  importing files (use ingesting-into-data-lake), vector storage (use storing-and-querying-vectors),
+  querying existing tables (use querying-data-lake), or locating existing table (use
+  finding-data-lake-assets).
 version: 1
 metadata:
   service: [s3tables, glue, athena]
   task: [deploy, debug]
   persona: [developer, data-engineer]
   workload: [data-analytics]
+argument-hint: '[table-description|schema-spec]'
 ---
 
 # Create Data Lake Tables with Amazon S3 Tables
@@ -36,15 +36,15 @@ You MUST run `aws glue get-tables --database-name <NAME>` when user mentions a d
 
 | What you find | Action |
 |---------------|--------|
-| Fuzzy database name ("our analytics db") | You MUST STOP. Delegate to `find-data-lake-assets` to resolve. |
-| Non-S3-Tables table with matching name | You MUST STOP. Delegate to `find-data-lake-assets`. You MUST NOT create until user confirms. |
+| Fuzzy database name ("our analytics db") | You MUST STOP. Delegate to `finding-data-lake-assets` to resolve. |
+| Non-S3-Tables table with matching name | You MUST STOP. Delegate to `finding-data-lake-assets`. You MUST NOT create until user confirms. |
 | Existing S3 Tables table with matching name | You MUST check schema match. Reuse if compatible, recreate only if user confirms. |
 | No matching tables | Proceed with creation (Steps 1-8). |
 | User explicitly requests new S3 Tables table | Skip checks, proceed with creation. |
 
 **Creation paths:**
 
-- **Existing data in S3**: Create empty table (Steps 1-8), then use `ingest-into-data-lake` skill.
+- **Existing data in S3**: Create empty table (Steps 1-8), then use `ingesting-into-data-lake` skill.
 - **Glue ETL pipeline**: Read `references/table-creation-glue-etl.md` first, then Steps 1-6.
 - **Lake Formation access control**: Search AWS docs for `"S3 Tables integration with Lake Formation"`.
 
@@ -59,7 +59,7 @@ You MUST run `aws glue get-tables --database-name <NAME>` when user mentions a d
 
 - **Explicit schema**: Validate Iceberg types.
 - **Loose description**: Ask columns, types, grain. Propose and confirm.
-- **Existing S3 data**: Infer schema from file headers only. Create empty table first, then use `ingest-into-data-lake` skill.
+- **Existing S3 data**: Infer schema from file headers only. Create empty table first, then use `ingesting-into-data-lake` skill.
 
 **Constraints:**
 
@@ -195,4 +195,4 @@ You MUST verify with `aws s3tables get-table` and confirm queryability with `DES
 - [best-practices.md](references/best-practices.md) -- Iceberg types, partitions, naming, common errors
 - [athena-ddl-path.md](references/athena-ddl-path.md) -- Athena DDL, schema evolution
 - [table-creation-glue-etl.md](references/table-creation-glue-etl.md) -- Spark DDL via Glue ETL
-- Loading data: `ingest-into-data-lake` skill
+- Loading data: `ingesting-into-data-lake` skill
@@ -1,10 +1,11 @@
 ---
 name: exploring-data-catalog
-description: 'Full inventory and audit of AWS Glue Data Catalog assets across S3 Tables,
-  Redshift-federated, and remote Iceberg catalogs. Triggers on: inventory the catalog,
-  audit databases, list all tables, catalog overview, data landscape, enumerate catalogs,
-  data inventory, search the catalog. Do NOT use for finding specific data (use find-data-lake-assets),
-  running queries (use query-data-lake), or creating tables (use create-data-lake-table).'
+description: >-
+  Full inventory and audit of AWS Glue Data Catalog assets across S3 Tables, Redshift-federated,
+  and remote Iceberg catalogs. Triggers on: inventory the catalog, audit databases,
+  list all tables, catalog overview, data landscape, enumerate catalogs, data inventory,
+  search the catalog. Do NOT use for finding specific data (use finding-data-lake-assets),
+  running queries (use querying-data-lake), or creating tables (use creating-data-lake-table).
 version: 1
 metadata:
   service: [glue, s3, s3tables]
@@ -115,7 +116,7 @@ Resolve the argument in this order; stop at the first match:
 - Flag stale tables and missing descriptions
 - Suggest partitioning for large unpartitioned tables
 - Summary first, details on request
-- You MUST NOT execute Athena queries (`start-query-execution`) during discovery; query execution belongs to `query-data-lake`
+- You MUST NOT execute Athena queries (`start-query-execution`) during discovery; query execution belongs to `querying-data-lake`
 
 ## Troubleshooting
 

@@ -1,11 +1,12 @@
 ---
-name: find-data-lake-assets
-description: 'Resolve data lake and lakehouse asset references across Glue Data Catalog,
-  S3, S3 Tables, and Redshift. Triggers on: find the table, where is our data, which
-  table has, locate dataset, find data for, search catalog, what tables match, Redshift
+name: finding-data-lake-assets
+description: >-
+  Resolve data lake and lakehouse asset references across Glue Data Catalog, S3, S3
+  Tables, and Redshift. Triggers on: find the table, where is our data, which table
+  has, locate dataset, find data for, search catalog, what tables match, Redshift
   table, lakehouse table, data lake table, warehouse table, reverse lookup S3 path.
   Do NOT use for: full catalog audits (use exploring-data-catalog), running queries
-  (use query-data-lake), creating tables (use create-data-lake-table).'
+  (use querying-data-lake), creating tables (use creating-data-lake-table).
 version: 1
 metadata:
   service: [glue, s3, s3tables, redshift]

@@ -1,17 +1,18 @@
 ---
-name: ingest-into-data-lake
-description: 'Import data into the AWS data lake from S3 files, local uploads, JDBC
-  databases (Oracle, SQL Server, PostgreSQL, MySQL, RDS, Aurora), Amazon Redshift,
-  Snowflake, BigQuery, DynamoDB, or existing Glue catalog tables (migration). Default
-  target is S3 Tables; standard Iceberg on a general purpose bucket is supported where
-  S3 Tables is not adopted. Handles one-time loads, recurring pipelines, migrations.
+name: ingesting-into-data-lake
+description: >-
+  Import data into the AWS data lake from S3 files, local uploads, JDBC databases
+  (Oracle, SQL Server, PostgreSQL, MySQL, RDS, Aurora), Amazon Redshift, Snowflake,
+  BigQuery, DynamoDB, or existing Glue catalog tables (migration). Default target
+  is S3 Tables; standard Iceberg on a general purpose bucket is supported where S3
+  Tables is not adopted. Handles one-time loads, recurring pipelines, migrations.
   Triggers on: import data, load data, ingest, sync database, migrate table, move
   data to AWS, set up pipeline, ETL, pull from Snowflake, query BigQuery into S3,
   export DynamoDB, CTAS, convert to Iceberg. Do NOT use for setting up or troubleshooting
-  Glue connections (use connect-to-data-source), creating empty tables (use create-data-lake-table),
-  running queries (use query-data-lake), finding tables by fuzzy name (use find-data-lake-assets),
+  Glue connections (use connecting-to-data-source), creating empty tables (use creating-data-lake-table),
+  running queries (use querying-data-lake), finding tables by fuzzy name (use finding-data-lake-assets),
   catalog audit (use exploring-data-catalog), or SaaS platforms like Salesforce, ServiceNow,
-  SAP, MongoDB, Kafka.'
+  SAP, MongoDB, Kafka.
 version: 1
 metadata:
   service: [glue, s3, s3tables, athena, dynamodb]
@@ -23,7 +24,7 @@ argument-hint: '[source-path|connection-name|table-name] [--target s3-tables|ice
 
 # Ingest into Data Lake
 
-Move data from a source into a queryable table in the data lake. This skill assumes the source connection (if one is needed) already exists. For Glue connection setup or troubleshooting, delegate to `connect-to-data-source`.
+Move data from a source into a queryable table in the data lake. This skill assumes the source connection (if one is needed) already exists. For Glue connection setup or troubleshooting, delegate to `connecting-to-data-source`.
 
 ## Philosophy
 
@@ -39,7 +40,7 @@ You MUST execute commands using AWS MCP server tools when connected -- they prov
 
 - You MUST check whether AWS MCP tools or AWS CLI are available and inform the user if missing
 - You MUST confirm target AWS region and verify credentials with `aws sts get-caller-identity`
-- For SageMaker Unified Studio project roles, note that target tables and connections may be scoped to the project. See the caller ARN detection pattern in `query-data-lake`.
+- For SageMaker Unified Studio project roles, note that target tables and connections may be scoped to the project. See the caller ARN detection pattern in `querying-data-lake`.
 
 ### 2. Classify the Source
 
@@ -55,7 +56,7 @@ You MUST execute commands using AWS MCP server tools when connected -- they prov
 
 If the user names Salesforce, ServiceNow, SAP, MongoDB, Kafka, or another SaaS/streaming source, decline -- these are not supported in this release.
 
-If the source table is referenced by a fuzzy or business name ("migrate our orders table", "pull from the sales warehouse"), delegate to `find-data-lake-assets` to resolve before proceeding.
+If the source table is referenced by a fuzzy or business name ("migrate our orders table", "pull from the sales warehouse"), delegate to `finding-data-lake-assets` to resolve before proceeding.
 
 ### 3. Confirm Connection Exists (if applicable)
 
@@ -65,7 +66,7 @@ For JDBC, Snowflake, and BigQuery sources, a Glue connection is required. Check:
 aws glue get-connection --name <CONNECTION_NAME> --region <REGION>
 ```
 
-If the connection does not exist, stop and delegate to `connect-to-data-source` to create and test it. Do not proceed with ingest until the connection is verified.
+If the connection does not exist, stop and delegate to `connecting-to-data-source` to create and test it. Do not proceed with ingest until the connection is verified.
 
 Local files, S3 files, DynamoDB, and catalog migration do not need a Glue connection.
 
@@ -74,7 +75,7 @@ Local files, S3 files, DynamoDB, and catalog migration do not need a Glue connec
 You MUST ask the user (or suggest based on catalog inventory) before creating or writing to any table:
 
 - **Database/namespace**: Does a specific target database exist? Or should one be created?
-- **Table**: Existing table (append/merge) or new table (delegate to `create-data-lake-table`)?
+- **Table**: Existing table (append/merge) or new table (delegate to `creating-data-lake-table`)?
 - **Format**: S3 Tables (default), standard Iceberg, or raw Parquet?
 
 **Inventory-aware defaults:**
@@ -89,8 +90,8 @@ Do not force S3 Tables on customers who haven't adopted it. See [iceberg-catalog
 
 **Delegations from this step:**
 
-- Target table doesn't exist -> `create-data-lake-table`
-- Target database named by fuzzy term -> `find-data-lake-assets`
+- Target table doesn't exist -> `creating-data-lake-table`
+- Target database named by fuzzy term -> `finding-data-lake-assets`
 - User doesn't know what exists -> `exploring-data-catalog`
 
 ### 5. Execute Source Workflow
@@ -132,7 +133,7 @@ For recurring pipelines, create a Glue Trigger with a cron schedule. See [testin
 - `overwritePartitions()` only replaces partitions present in the DataFrame -- for full refresh with deletes, use `createOrReplace()`
 - Standard Iceberg targets MUST include a LOCATION clause; S3 Tables MUST NOT
 - DynamoDB does not need a Glue connection -- do not attempt to create one
-- Connection failures during ingest delegate back to `connect-to-data-source`; do not debug network/credentials in this skill
+- Connection failures during ingest delegate back to `connecting-to-data-source`; do not debug network/credentials in this skill
 - For target tables in SageMaker Unified Studio projects, ensure the project role has write access to the target namespace before the Glue job runs
 
 ## Troubleshooting
@@ -142,7 +143,7 @@ For recurring pipelines, create a Glue Trigger with a cron schedule. See [testin
 | Access Denied on S3 | Missing IAM permissions | Check Glue role has s3:GetObject, s3:PutObject |
 | Access Denied on S3 Tables | Missing s3tables:* permissions | Add S3 Tables inline policy to Glue role |
 | CTAS timeout | Dataset too large for Athena | Switch to Glue ETL or batch with WHERE filters |
-| JDBC connection timeout/auth failure | Connection-level issue | Delegate to `connect-to-data-source` |
+| JDBC connection timeout/auth failure | Connection-level issue | Delegate to `connecting-to-data-source` |
 | Throughput exceeded (DynamoDB) | Read percent too high | Lower `read.percent` or use native export |
 
 See [error-handling.md](references/error-handling.md) for the full catalog.
@@ -171,7 +172,7 @@ See [error-handling.md](references/error-handling.md) for the full catalog.
 - [type-transformations.md](references/type-transformations.md) -- Type conflict resolution
 - [format-specific-loading.md](references/format-specific-loading.md) -- CSV/JSON/Parquet/Avro/ORC specifics
 - [athena-loading.md](references/athena-loading.md) -- Athena INSERT INTO as simple-load fallback
-- [error-handling.md](references/error-handling.md) -- Ingest errors (connection errors delegate to connect-to-data-source)
+- [error-handling.md](references/error-handling.md) -- Ingest errors (connection errors delegate to connecting-to-data-source)
 - [upload-options.md](references/upload-options.md) -- aws s3 cp vs sync, multipart
 
 ### Migration-specific

@@ -1,6 +1,6 @@
 # BigQuery Ingest
 
-Move data from Google BigQuery into the data lake. Assumes a Glue `BIGQUERY` connection exists. If not, delegate to `connect-to-data-source`.
+Move data from Google BigQuery into the data lake. Assumes a Glue `BIGQUERY` connection exists. If not, delegate to `connecting-to-data-source`.
 
 ## Contents