AWS Glue Pipeline Observability

Full visibility into every Glue job: run history, DPU usage, failure root causes, execution timing, and catalog health metrics.

Glue failures are silent by default

AWS Glue does not alert on job failures, duration anomalies, or DPU overruns out of the box. CloudWatch alarms have to be set up manually for each job. Without observability, a pipeline can fail at 2 a.m. and go unnoticed until an analyst reports a broken dashboard.

Job Run History & Failure Detection

Track every Glue job execution: start time, duration, exit status, and error message. Failures trigger immediate alerts to your team.
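As a rough illustration of what failure detection involves, the sketch below pulls run history through the public Glue `GetJobRuns` API and filters for unsuccessful runs. It is a minimal example under stated assumptions, not this product's implementation: the function names `fetch_job_runs` and `failed_runs` are hypothetical, and `boto3` with valid AWS credentials is needed only for the fetch step.

```python
# Terminal JobRunState values that Glue reports for unsuccessful runs.
FAILED_STATES = {"FAILED", "TIMEOUT", "ERROR"}


def fetch_job_runs(job_name: str) -> list:
    """Fetch every run of one job via GetJobRuns (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the filtering logic below runs without AWS access

    glue = boto3.client("glue")
    runs = []
    for page in glue.get_paginator("get_job_runs").paginate(JobName=job_name):
        runs.extend(page["JobRuns"])
    return runs


def failed_runs(runs: list) -> list:
    """Reduce raw JobRun records to the fields an alert would need."""
    return [
        {
            "job": run["JobName"],
            "run_id": run["Id"],
            "state": run["JobRunState"],
            "error": run.get("ErrorMessage", ""),
        }
        for run in runs
        if run["JobRunState"] in FAILED_STATES
    ]
```

Calling `failed_runs(fetch_job_runs("my_job"))` would return one record per failed run; where those records are routed (chat, paging, email) is left out of the sketch.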

DPU Consumption Tracking

Monitor DPU-hours per job over time. Identify jobs that consume more capacity than expected or cost more than budgeted.
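DPU-hours can be estimated from fields on each Glue `JobRun` record. The sketch below assumes `DPUSeconds` is used when Glue reports it and otherwise falls back to `MaxCapacity` multiplied by `ExecutionTime` (seconds); the `budgets` mapping and both function names are hypothetical.

```python
def dpu_hours(run: dict) -> float:
    """Estimate DPU-hours for a single JobRun record."""
    if "DPUSeconds" in run:
        # Reported by Glue for some runs (for example, auto scaling jobs).
        return run["DPUSeconds"] / 3600
    # Fallback estimate: allocated DPUs times execution time in seconds.
    return run.get("MaxCapacity", 0.0) * run.get("ExecutionTime", 0) / 3600


def over_budget(runs: list, budgets: dict) -> dict:
    """Return {job_name: total_dpu_hours} for jobs that exceed their budget."""
    totals = {}
    for run in runs:
        name = run["JobName"]
        totals[name] = totals.get(name, 0.0) + dpu_hours(run)
    return {
        name: total
        for name, total in totals.items()
        if name in budgets and total > budgets[name]
    }
```

For instance, a run with `MaxCapacity` of 10 DPUs and an `ExecutionTime` of 1800 seconds comes to 5 DPU-hours under the fallback estimate.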

Duration Anomaly Detection

Baseline normal execution time per job. Get alerted when a job runs significantly longer than its historical average, a common early sign of data skew or upstream issues.
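One simple way to express "significantly longer than its historical average" is a z-score against past run durations. The sketch below is an illustrative baseline, not necessarily the detection method this product uses; the `threshold` and `min_runs` defaults are arbitrary assumptions.

```python
from statistics import mean, stdev


def is_duration_anomaly(history, current, threshold=3.0, min_runs=5):
    """Flag `current` (seconds) if it sits more than `threshold` standard
    deviations above the mean of `history` (past durations in seconds)."""
    if len(history) < min_runs:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly stable history: any longer run counts as a deviation.
        return current > baseline
    return (current - baseline) / spread > threshold
```

Only slow runs are flagged here; a job that finishes unusually fast (often a sign of empty input) would need a separate check.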

Catalog Table Health

Monitor the Glue Data Catalog for stale table definitions, broken partition projections, and schema mismatches between the catalog and the actual data.
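A schema mismatch check can be sketched as a comparison between the columns in a catalog table definition and the columns observed in the data. The example assumes `catalog_table` is the `Table` object returned by `glue:GetTable`, and that `observed` is a hypothetical name-to-type mapping obtained separately (for example, from Parquet file metadata).

```python
def schema_mismatches(catalog_table: dict, observed: dict) -> dict:
    """Compare catalog columns against observed {column_name: type} pairs."""
    catalog = {
        col["Name"]: col["Type"]
        for col in catalog_table["StorageDescriptor"]["Columns"]
    }
    shared = catalog.keys() & observed.keys()
    return {
        # Columns the catalog declares but the data no longer contains.
        "missing_in_data": sorted(catalog.keys() - observed.keys()),
        # Columns present in the data but unknown to the catalog.
        "missing_in_catalog": sorted(observed.keys() - catalog.keys()),
        # Columns present on both sides with different declared types.
        "type_changed": sorted(n for n in shared if catalog[n] != observed[n]),
    }
```

Partition columns live under `PartitionKeys` in the same `Table` object and are not covered by this sketch.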

Job Dependency Mapping

Visualize which jobs depend on which tables and S3 paths. Understand blast radius when an upstream job fails.
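Blast radius can be modeled as a graph traversal: a job is affected if it reads anything a failed or already-affected job writes. The sketch below assumes a hypothetical `job_io` mapping of each job to the tables and S3 paths it reads and writes; how that mapping is discovered is outside the example.

```python
from collections import deque


def blast_radius(job_io: dict, failed_job: str) -> set:
    """Return every job downstream of `failed_job`.

    job_io maps job name -> {"reads": set, "writes": set}, where the sets
    hold table names or S3 paths.
    """
    affected = set()
    queue = deque([failed_job])
    while queue:
        upstream = queue.popleft()
        outputs = job_io[upstream]["writes"]
        for job, io in job_io.items():
            if job == failed_job or job in affected:
                continue
            if outputs & io["reads"]:  # job consumes something upstream produces
                affected.add(job)
                queue.append(job)
    return affected
```

With a chain `ingest -> enrich -> report`, a failure in `ingest` returns both downstream jobs, while a failure in `report` returns an empty set.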

Retry & Failure Patterns

Identify jobs with recurring failures or high retry rates. Surface the most unstable jobs in your pipeline fleet.
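Ranking jobs by instability can be sketched as a per-job aggregation over `JobRun` records. The example assumes a run with `Attempt` greater than 0 is a retry, and the ordering (failure rate first, then retry rate) is one arbitrary choice among several.

```python
from collections import defaultdict

UNSUCCESSFUL = {"FAILED", "TIMEOUT", "ERROR"}


def instability_ranking(runs: list) -> list:
    """Return (job, failure_rate, retry_rate) tuples, least stable first."""
    stats = defaultdict(lambda: {"runs": 0, "failures": 0, "retries": 0})
    for run in runs:
        s = stats[run["JobName"]]
        s["runs"] += 1
        s["failures"] += int(run["JobRunState"] in UNSUCCESSFUL)
        s["retries"] += int(run.get("Attempt", 0) > 0)  # assumed retry marker
    ranked = [
        (job, s["failures"] / s["runs"], s["retries"] / s["runs"])
        for job, s in stats.items()
    ]
    return sorted(ranked, key=lambda row: (row[1], row[2]), reverse=True)
```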

How to connect Glue

1. Deploy read-only IAM role

A CloudFormation template creates an IAM role with scoped read access to the Glue job APIs, Glue crawlers, and the Glue Data Catalog.

2. Select jobs and crawlers to monitor

Register the Glue jobs and crawlers you want to observe. Wildcards are supported, so a single pattern can cover every job in a namespace.
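The exact wildcard syntax is not specified here. As an illustration only, the sketch below assumes shell-style patterns such as `sales_*` and matches them against job names; `select_monitored` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def select_monitored(names, patterns):
    """Return the names matching at least one shell-style pattern (case-sensitive)."""
    return sorted(
        name for name in names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )
```

Under that assumption, the pattern `sales_*` would select `sales_daily_load` and `sales_hourly_load` but not `hr_export`.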

3. Configure alert policies

Define failure alert routing, duration thresholds, and DPU budget limits per job or job group.

4. Pipeline dashboards live within 24h

Job run timelines, DPU usage charts, failure rate trends, and dependency maps are available within 24 hours of connecting.

Required IAM permissions

glue:GetJob
glue:GetJobRun
glue:GetJobRuns
glue:ListJobs
glue:GetCrawler
glue:GetDatabase
glue:GetTable
glue:GetTableVersions

All permissions are read-only. No write, start, or stop actions are included.
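The permissions listed above can be expressed as an IAM policy document. The sketch below builds one as a Python dictionary and includes a small check that every action is a read-style call. It is not the actual CloudFormation template: the `Sid` value is hypothetical, and `Resource: "*"` is an assumption that would normally be narrowed to specific Glue ARNs.

```python
import json

READ_ONLY_ACTIONS = [
    "glue:GetJob",
    "glue:GetJobRun",
    "glue:GetJobRuns",
    "glue:ListJobs",
    "glue:GetCrawler",
    "glue:GetDatabase",
    "glue:GetTable",
    "glue:GetTableVersions",
]


def build_policy(actions, resource="*"):
    """Build an IAM policy document that allows only the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueReadOnlyObservability",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,  # assumption: narrow this in practice
            }
        ],
    }


def non_read_actions(actions):
    """Return any action whose name does not start with Get, List, or BatchGet."""
    read_prefixes = ("Get", "List", "BatchGet")
    return [a for a in actions if not a.split(":", 1)[1].startswith(read_prefixes)]


print(json.dumps(build_policy(READ_ONLY_ACTIONS), indent=2))
```

Running `non_read_actions` over a role's action list is a quick way to confirm that nothing such as `glue:StartJobRun` has been added.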

See exactly what's happening in your Glue pipelines

Works with your existing AWS setup. Read-only access. No agents. No data exposure.

Book a Demo