@@ -12,6 +12,7 @@ Creates a complete scheduled setup:
 - Storage bucket with lifecycle management
 - Secret Manager IAM bindings
 - Source code change detection
+- **Slack alerting** for job failures (optional)
 
 ## Quick Start
 
@@ -85,6 +86,10 @@ module "my_data_processor" {
       version = "latest"
     }
   ]
+
+  # Slack alerting for job failures is enabled by default; provide the webhook URL
+  slack_webhook_url = "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
+  slack_channel     = "#1s-and-0s"
 }
 ```
 
@@ -241,6 +246,13 @@ module "data_processor" {
 - `job_args` - Command arguments ([])
 - `job_image` - Container image URL (required)
 
+### Alerting (optional)
+- `enable_alerting` - Whether to enable alerting for job failures (true)
+- `slack_webhook_url` - Slack webhook URL for sending failure notifications (null)
+- `slack_channel` - Slack channel to send notifications to ("#1s-and-0s")
+- `alert_project_id` - GCP project ID where monitoring and alerting resources are created; falls back to `project_id` when unset (null)
+- `notification_email` - Email address for additional failure notifications (null)
+
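As an illustration of how these inputs might be combined, the sketch below shows one module that opts out of alerting and one that overrides the channel and adds an email recipient. The module names, channel, and email address are hypothetical placeholders, and required inputs not related to alerting are elided.

```hcl
# Sketch only: module names and values are hypothetical placeholders.
module "job_without_alerts" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/scheduled-job?ref=v1.0.0"

  # ... other configuration ...

  # Opt out of alerting (the documented default is true)
  enable_alerting = false
}

module "job_with_custom_alerts" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/scheduled-job?ref=v1.0.0"

  # ... other configuration ...

  slack_webhook_url  = "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
  slack_channel      = "#data-alerts"       # hypothetical channel
  notification_email = "oncall@example.com" # hypothetical address
}
```

Whether `slack_webhook_url` may stay `null` while `enable_alerting` is `true` is not stated in this section, so that case may be worth documenting alongside the defaults.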
 ## Outputs
 
 - `resource_name` - Name of deployed function or job
@@ -251,6 +263,10 @@ module "data_processor" {
 - `storage_bucket_name` - Storage bucket name
 - `execution_type` - The execution type used
 
+### Alerting Outputs (when `enable_alerting = true`)
+- `monitoring_notification_channel_name` - Name of the monitoring notification channel
+- `alert_policy_names` - Names of the monitoring alert policies
+
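A minimal sketch of how a root module might surface these outputs, assuming a module block named `data_processor` (as in the hunk context) with alerting enabled. The output names on the left-hand side are hypothetical.

```hcl
# Sketch: re-export the alerting outputs from a root module.
# Assumes module "data_processor" is declared with enable_alerting = true.
output "job_alert_policy_names" {
  description = "Alert policies created for the scheduled job"
  value       = module.data_processor.alert_policy_names
}

output "job_notification_channel_name" {
  description = "Monitoring notification channel used for failure alerts"
  value       = module.data_processor.monitoring_notification_channel_name
}
```

This section does not say what these outputs return when `enable_alerting = false`, so consumers may want to guard against empty or null values.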
 ## Repository Structure
 
 ```
@@ -389,6 +405,65 @@ Or use Cloud Build directly:
 gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/YOUR_JOB_NAME:latest ./jobs/your-job
 ```
 
+## Alerting
+
+The module supports optional Slack alerting for job failures. When enabled, it creates:
+
+- **Monitoring policies**: Cloud Monitoring alert policies for different failure scenarios
+- **Slack notification channel**: Direct integration with Slack using webhooks
+
+### Enabling Alerting
+
+```hcl
+module "my_job_with_alerts" {
+  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/scheduled-job?ref=v1.0.0"
+
+  # ... other configuration ...
+
+  # Alerting is enabled by default; just provide the webhook URL
+  slack_webhook_url = "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
+  slack_channel     = "#1s-and-0s" # Default channel
+
+  # Optional: use a different project for alerting resources
+  alert_project_id = "my-monitoring-project"
+}
+```
+
+### What Gets Monitored
+
+When alerting is enabled, the module creates monitoring policies for:
+
+1. **Cloud Function failures** (when `execution_type = "function"`)
+   - Function execution failures
+   - Error rates and timeouts
+
+2. **Cloud Run Job failures** (when `execution_type = "job"`)
+   - Job execution failures
+   - Task completion failures
+   - Timeout violations
+
+3. **Cloud Scheduler failures**
+   - Scheduler job execution failures
+   - Missed scheduled runs
+
+4. **Excessive retries**
+   - Jobs that retry more than 3 times within 10 minutes
+   - Indicates persistent issues that need investigation
+
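To make the "more than 3 in 10 minutes" threshold concrete, here is a rough sketch of what such a policy could look like with the `google_monitoring_alert_policy` resource. This is not the module's actual definition: the resource name, both variables, and the choice of failed Cloud Run job executions as a proxy for retries are all assumptions made for illustration.

```hcl
# Illustrative sketch only; NOT the module's actual policy definition.
variable "alert_project_id" {
  type        = string
  description = "Project that hosts the alert policy (hypothetical input)"
}

variable "notification_channel_ids" {
  type        = list(string)
  description = "Existing Cloud Monitoring notification channel IDs (hypothetical input)"
}

resource "google_monitoring_alert_policy" "repeated_failures_sketch" {
  project      = var.alert_project_id
  display_name = "Scheduled job: repeated failures (sketch)"
  combiner     = "OR"

  conditions {
    display_name = "More than 3 failed executions in 10 minutes"

    condition_threshold {
      # Assumed metric: failed Cloud Run job executions, used as a proxy for retries.
      filter = join(" AND ", [
        "resource.type = \"cloud_run_job\"",
        "metric.type = \"run.googleapis.com/job/completed_execution_count\"",
        "metric.labels.result = \"failed\"",
      ])
      comparison      = "COMPARISON_GT"
      threshold_value = 3
      duration        = "0s"

      aggregations {
        # Sum failures over a 10-minute window before comparing to the threshold.
        alignment_period   = "600s"
        per_series_aligner = "ALIGN_SUM"
      }
    }
  }

  notification_channels = var.notification_channel_ids
}
```

If the module tracks retries through a different metric (for example a log-based metric or Cloud Scheduler attempt logs), the filter would differ; naming the actual metrics in this README would help readers tune the thresholds.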
+### Slack Message Format
+
+The Slack notifications include:
+- Job name and status
+- Resource information
+- Failure condition details
+- Timestamp and incident ID
+- Color-coded messages (red for failures, green for recovery)
+
+### Security
+
+- The Slack webhook URL is stored securely in the monitoring notification channel
+- All alerting resources are created in the specified project (or the same project as the job)
+
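Since the webhook URL is effectively a credential, callers may prefer not to hard-code it in the module block. One possible pattern, sketched below, passes it through a sensitive input variable; the variable name is an assumption chosen to mirror the module input.

```hcl
# Sketch: keep the webhook URL out of version control.
variable "slack_webhook_url" {
  type        = string
  sensitive   = true # redacts the value in plan/apply output
  description = "Slack webhook URL, supplied at plan/apply time"
}

module "my_job_with_alerts" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/scheduled-job?ref=v1.0.0"

  # ... other configuration ...

  slack_webhook_url = var.slack_webhook_url
}
```

The value can then be supplied through the `TF_VAR_slack_webhook_url` environment variable. Note that marking a variable `sensitive` only hides it from CLI output; the value is still written to Terraform state, so the state backend needs appropriate access controls.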
 ## Common Cron Patterns
 
 | Schedule | Description |