CASE STUDIES

$9,470+ monthly waste found.
Under 5 minutes to first finding.

How data lake and platform engineering teams use reCost for S3 monitoring, surfacing Delta Lake issues, pipeline failures, and security gaps their existing tools missed.

5 case studies · $9,470+ monthly waste found · < 5 min time to first finding
FEATURED
Data Lake Health
Fintech · Data Lake Engineering

15.6 TB of Orphaned Files in S3, Invisible for 8 Months Without Object-Level Monitoring

42,015 orphaned snapshots · 15.6 TB recovered · 8 months undetected

A fintech company running Apache Iceberg on S3 had never set a snapshot expiry policy on any of its 134 production tables. Snapshots had been accumulating silently for over eight months, invisible to the company's existing cost tooling, which reported only at the bucket level.

WHAT RECOST FOUND
  • 42,015 orphaned snapshots across 134 Iceberg tables
  • 15.6 TB of recoverable storage in STANDARD class
  • Expiry policy had never been configured on any table
  • Manifest bloat adding ~18% overhead to every Athena query
"We had no idea this was happening. Our cost dashboards showed S3 spend going up but nothing told us why. reCost pointed directly at the tables."
Staff Data Engineer
15.6 TB recovered · $1,870/mo saved · detected in < 5 minutes after connect
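
For readers unfamiliar with snapshot expiry, the sketch below shows the kind of policy that was missing: expire snapshots older than a cutoff while always keeping the most recent few. It is an illustration under stated assumptions, not reCost's detection logic or Iceberg's implementation. The function name, the 7-day age limit, and the retain count of 5 are hypothetical; Iceberg's own snapshot-expiration maintenance accepts comparable age and retain-count inputs.

```python
from datetime import datetime, timedelta, timezone

def snapshots_to_expire(snapshots, now, max_age_days=7, retain_last=5):
    """Return snapshot IDs that a simple age-plus-minimum-count policy would expire.

    `snapshots` is a list of (snapshot_id, committed_at) tuples for one table.
    The policy parameters are illustrative defaults, not recommendations.
    """
    cutoff = now - timedelta(days=max_age_days)
    # Newest first, so the first `retain_last` entries are always kept.
    ordered = sorted(snapshots, key=lambda s: s[1], reverse=True)
    candidates = ordered[retain_last:]
    return [sid for sid, committed_at in candidates if committed_at < cutoff]

# Hypothetical table history: one commit per day for 30 days.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
history = [(i, now - timedelta(days=i)) for i in range(30)]
expired = snapshots_to_expire(history, now)
print(len(expired), "of", len(history), "snapshots eligible for expiry")
```

With no policy at all, as in this case study, nothing is ever eligible and every commit adds to the backlog.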
ALL CASE STUDIES
Pipeline Observability
Media & Streaming · Platform Engineering

4 Dead ETL Pipelines Caught by S3 Write Pattern Monitoring, No Instrumentation Needed

4 silent pipelines · 23 days longest gap · 0 instrumentation needed

A media company's platform team had no visibility into whether their S3-backed ingestion pipelines were actually writing data. An upstream schema change had silently broken four pipelines (two Glue jobs, a Spark streaming job, and a custom ETL process), and no alerts fired because none of them were instrumented.

WHAT RECOST FOUND
  • 4 pipelines with no S3 writes in 7-23 days
  • Write cadence deviation detected from access log patterns
  • 2 AWS Glue jobs, 1 Spark streaming job, 1 custom ETL affected
  • Downstream dashboards had been serving stale data for 3 weeks
"We found out our reporting dashboards were running on 3-week-old data. No alarm had fired. reCost saw it from write patterns alone."
Principal Platform Engineer
Caught before SLA breach · 3-week stale data issue surfaced · zero instrumentation required
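
One way to picture write-cadence deviation is to compare how long a prefix has been silent against its usual gap between writes. The sketch below does that with timestamps that would, in practice, be extracted from S3 access logs. It is an assumption-laden illustration rather than reCost's algorithm: the function name, the prefixes, and the 3x tolerance are all hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median

def silent_prefixes(writes_by_prefix, now, tolerance=3.0):
    """Flag prefixes whose time since last write exceeds `tolerance` x the usual gap."""
    flagged = {}
    for prefix, times in writes_by_prefix.items():
        times = sorted(times)
        if len(times) < 3:
            continue  # too little history to estimate a cadence
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        typical = median(gaps)
        silence = (now - times[-1]).total_seconds()
        if typical > 0 and silence > tolerance * typical:
            flagged[prefix] = timedelta(seconds=silence)
    return flagged

# Hypothetical data: a daily pipeline that stopped on 1 March, and an hourly one still running.
now = datetime(2024, 3, 24)
daily = [datetime(2024, 3, 1) - timedelta(days=d) for d in range(10)]
hourly = [now - timedelta(hours=h) for h in range(1, 48)]
result = silent_prefixes({"ingest/events/": daily, "ingest/clicks/": hourly}, now)
print(result)
```

Using the median gap keeps a single late run from distorting the baseline, which matters for pipelines with an occasional slow day.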
IAM & Security
Enterprise SaaS · Security & Infrastructure

EOL SDK With Active CVE Detected via IAM Monitoring, 214 Days Undetected by Existing Tools

214 days CVE exposure · 520K req/mo at risk · 48h to remediate

An enterprise SaaS company had a large, distributed engineering org with dozens of IAM roles accessing S3. A compliance audit had found nothing, but reCost surfaced three roles still running boto3 1.9.x with a known CVE and making 520K requests/month against production data.

WHAT RECOST FOUND
  • 3 IAM roles using boto3 1.9.x, EOL for 2+ years
  • Active CVE-2018-15869 present (botocore < 1.12.63), never patched
  • 520K requests/month against prod-data-lake bucket
  • First-time access from a browser session on a prod prefix
"Our compliance tooling scans config. reCost watches actual behavior. That's a completely different signal and it caught something we'd missed for over 200 days."
Head of Infrastructure Security
CVE exposure eliminated · IAM role remediation in 48h · browser session access blocked
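
Behavioral signals like these can be read from the user-agent field that S3 records for each request. The sketch below classifies a user-agent string as an outdated SDK, a browser, or something else. It is a simplified illustration, not reCost's detection logic: the sample strings, the principal names, and the `min_boto3` floor are hypothetical, and real user-agent formats vary by SDK version.

```python
import re

SDK_PATTERN = re.compile(r"\bBoto3/(\d+)\.(\d+)\.(\d+)")

def classify_user_agent(user_agent, min_boto3=(1, 26, 0)):
    """Label a user-agent string taken from an access-log record.

    `min_boto3` is a hypothetical policy floor, not an official support boundary.
    """
    match = SDK_PATTERN.search(user_agent)
    if match:
        version = tuple(int(part) for part in match.groups())
        return "outdated-sdk" if version < min_boto3 else "ok"
    if user_agent.startswith("Mozilla/"):
        return "browser"
    return "other"

# Illustrative user-agent strings keyed by hypothetical principals.
samples = {
    "role/etl-legacy": "Boto3/1.9.42 Python/3.6.8 Linux/4.14.1 Botocore/1.12.42",
    "role/etl-current": "Boto3/1.34.10 Python/3.11.4 Linux/5.10.0 Botocore/1.34.10",
    "user/analyst": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
}
labels = {principal: classify_user_agent(ua) for principal, ua in samples.items()}
print(labels)
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.9" sorts after "1.26".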
Query Engine Monitoring
ML Infrastructure · ML Platform

4.2× Athena Query Overhead From Cold Partitions, Found by S3 Access Log Analysis

4.2× unnecessary scan overhead · 28,400 queries affected/mo · $3,200 monthly savings

An ML infrastructure team was seeing Athena query costs creep up month over month with no obvious cause. Their tables had grown substantially, but partition hygiene had never been enforced: 214 cold partitions were still being scanned on every query even though the underlying data hadn't been accessed in months.

WHAT RECOST FOUND
  • 214 cold partitions included in every full table scan
  • 3.1 TB of wasted query results cached but never reused
  • 28,400 Athena queries affected in a single month
  • Average scan overhead: 4.2× necessary bytes per query
"We knew something was off with our Athena costs but had no way to attribute it. reCost showed us the exact partitions driving the waste."
ML Platform Lead
$3,200/mo Athena cost reduction · query time cut by 60% · partition pruning implemented same week
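
Because Athena bills by bytes scanned, scan overhead can be expressed as total bytes read divided by the bytes a query actually needs. The sketch below computes that ratio for a made-up table. The partition names and sizes are hypothetical and do not reproduce the 4.2× figure above, and the function is an illustration rather than reCost's method.

```python
def scan_overhead(partition_bytes, needed):
    """Ratio of bytes scanned without pruning to bytes the query actually needs."""
    total = sum(partition_bytes.values())
    useful = sum(size for name, size in partition_bytes.items() if name in needed)
    if useful == 0:
        raise ValueError("query needs no listed partition")
    return total / useful

# Hypothetical table: five equal-sized daily partitions, query filters on one day.
partitions = {f"dt=2024-01-0{d}": 10 * 1024**3 for d in range(1, 6)}
overhead = scan_overhead(partitions, needed={"dt=2024-01-05"})
print(f"{overhead:.1f}x")
```

A ratio near 1.0 means pruning is working; anything well above it points at partitions being read that the query never uses.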
Storage Intelligence
E-commerce · Data Infrastructure

186 TB in Wrong S3 Storage Class, Found by Object-Level S3 Monitoring

186 TB in wrong storage class · $2,400 monthly waste · >90 days since last access

An e-commerce data team had grown their S3 footprint significantly over three years with no automated tiering in place. Their cost tool showed total S3 spend but had no per-prefix temperature data. reCost revealed that 186 TB of data across three buckets hadn't been touched in over 90 days but was still being billed at STANDARD pricing.

WHAT RECOST FOUND
  • 186 TB in STANDARD storage class, last access >90 days
  • $2,400/mo in avoidable storage costs
  • 3 buckets affected; none had lifecycle rules configured
  • Auto-tiering had never been enabled on any prefix
"Three years of data and nobody had ever looked at access temperature by prefix. reCost gave us that view in minutes."
Data Infrastructure Manager
$2,400/mo eliminated · GLACIER transition configured same day · lifecycle rules applied to all 3 buckets
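
A GLACIER transition of this kind is typically expressed as an S3 lifecycle rule per prefix. The sketch below builds such a configuration as a plain dictionary. The prefixes, rule IDs, and helper name are hypothetical, and this is not the customer's actual configuration. Note that lifecycle transitions key off object age rather than last access, so a rule like this assumes the per-prefix access data has already confirmed the objects are cold.

```python
import json

def glacier_transition_rule(prefix, days=90):
    """Build one lifecycle rule that moves objects under `prefix` to GLACIER after `days`."""
    return {
        "ID": f"tier-{prefix.strip('/').replace('/', '-') or 'all'}-after-{days}d",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }

# Hypothetical prefixes; real ones would come from per-prefix access data.
config = {"Rules": [glacier_transition_rule(p) for p in ("raw/2021/", "exports/")]}
print(json.dumps(config, indent=2))
# With boto3 installed and credentials configured, this dict is the shape expected by
# s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config).
```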

See what's hiding in your data layer

Connect in 5 minutes. No agents, no code changes. Just your S3 access logs and a clear picture of what's happening inside your data lake.
