DATA ENGINEERING

Your lakehouse, at object level

Delta Lake and Iceberg table health, Athena and Spark query patterns, pipeline behavior, and orphaned data - derived from the S3 access logs your lakehouse already produces. Agentless and read-only.

Book a Demo
WHAT RECOST SEES

Everything your engines do to your tables is in the logs

Delta Lake & Iceberg table health

Compaction lag, snapshot sprawl, and metadata churn are visible in how engines touch your table files. reCost derives table health from object-level access - for every table, continuously.

Small files & orphaned data

Small-file problems and orphaned snapshots quietly degrade query performance and inflate storage. reCost finds data that is written but never read, and file patterns that slow every scan.

Athena & Spark query patterns

See which tables are scanned hardest, which queries read far more than they return, and where partitioning or clustering changes would have the most impact.

Pipeline read/write behavior

Every pipeline leaves a signature in the access logs. reCost shows what each writer produces and each reader consumes - so broken or silently degraded pipelines surface as behavior changes.

Databricks workload data flows

Object-level visibility into what your Databricks jobs actually touch in S3: inputs, outputs, and the tables nobody realized a job depends on.

HOW IT CONNECTS

No agent in Spark. No listener in Databricks. Nothing in the query path.

reCost works from S3 server access logs and S3 Inventory - the request-level record AWS writes once logging is enabled on a bucket. A read-only IAM role connects it in about five minutes, and it never reads the contents of your data files.
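To make "request-level record" concrete, here is a minimal sketch of parsing the leading fields of one S3 server access log line. It is an illustration only, not reCost's parser: the `AccessRecord` type, the `parse_line` helper, and the sample line (bucket, key, requester) are all hypothetical, and fields after the object size are ignored.

```python
import re
from typing import NamedTuple, Optional

# Leading fields of an S3 server access log record, in their documented order.
# Everything after the object-size field is ignored in this sketch.
LOG_LINE = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<remote_ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?:"(?P<request_uri>[^"]*)"|-) (?P<status>\S+) (?P<error_code>\S+) '
    r'(?P<bytes_sent>\S+) (?P<object_size>\S+)'
)


class AccessRecord(NamedTuple):
    bucket: str
    operation: str
    key: str
    status: int
    bytes_sent: int
    object_size: int


def _to_int(field: str) -> int:
    # The log format writes "-" for fields that do not apply to a request.
    return 0 if field == "-" else int(field)


def parse_line(line: str) -> Optional[AccessRecord]:
    """Return the parsed record, or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return AccessRecord(
        bucket=m["bucket"],
        operation=m["operation"],
        key=m["key"],
        status=_to_int(m["status"]),
        bytes_sent=_to_int(m["bytes_sent"]),
        object_size=_to_int(m["object_size"]),
    )


# Hypothetical sample line, shaped like a real record but with made-up values.
SAMPLE = (
    'ownerid example-lake [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    'arn:aws:iam::123456789012:user/etl REQ1EXAMPLE REST.GET.OBJECT '
    'warehouse/orders/part-0001.parquet '
    '"GET /warehouse/orders/part-0001.parquet HTTP/1.1" 200 - 1048576 1048576 '
    '35 12 "-" "aws-sdk-java" -'
)

record = parse_line(SAMPLE)
```

Note that the record carries the operation, key, and byte counts, but nothing from inside the object itself.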

Read-only IAM role · No object content access · Metadata only · Works with any engine
FAQ

Lakehouse observability, answered

How do I monitor Delta Lake table health on S3?

reCost reads the S3 access patterns around each table's _delta_log and data files. Compaction lag, small-file accumulation, abnormal writer behavior, and reader hotspots are all visible from object-level access - no agent in Spark or Databricks required.
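As a rough illustration of how write traffic alone hints at table health, the sketch below classifies written keys under one table prefix into Delta commit files, checkpoints, and data files, then reports the share of small data files. It is not reCost's implementation. The `summarize_writes` helper, the 16 MiB threshold, and the example prefix are assumptions, and the input is taken to be `(key, object_size)` pairs already extracted from successful PUT records.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Tuple

# Delta Lake names commit files with a 20-digit, zero-padded version number.
IN_LOG = re.compile(r"(?:^|/)_delta_log/")
COMMIT = re.compile(r"(?:^|/)_delta_log/\d{20}\.json$")
CHECKPOINT = re.compile(r"(?:^|/)_delta_log/\d{20}\.checkpoint.*\.parquet$")

# Hypothetical cutoff for "small"; tune it to your engine and file format.
SMALL_FILE_BYTES = 16 * 1024 * 1024


@dataclass
class WriteSummary:
    commits: int = 0
    checkpoints: int = 0
    data_files: int = 0
    small_data_files: int = 0

    @property
    def small_file_share(self) -> float:
        if self.data_files == 0:
            return 0.0
        return self.small_data_files / self.data_files


def summarize_writes(
    table_prefix: str, writes: Iterable[Tuple[str, int]]
) -> WriteSummary:
    """Count commits, checkpoints, and (small) data files under one table."""
    summary = WriteSummary()
    for key, size in writes:
        if not key.startswith(table_prefix):
            continue
        if COMMIT.search(key):
            summary.commits += 1
        elif CHECKPOINT.search(key):
            summary.checkpoints += 1
        elif key.endswith(".parquet") and not IN_LOG.search(key):
            summary.data_files += 1
            if size < SMALL_FILE_BYTES:
                summary.small_data_files += 1
    return summary
```

Many commits with a high small-file share and few large rewritten files is one pattern that can suggest compaction is falling behind; the thresholds that matter depend on the table.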

How do I find orphaned Iceberg snapshots?

Orphaned snapshots and unreferenced data files show up as objects that are written once and never read again by any query engine. reCost correlates Iceberg metadata access with data file access to surface snapshot sprawl and dead storage.
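The "written but never read" idea can be sketched in a few lines. This is a simplified illustration, not reCost's method: it assumes records are already parsed into `(operation, key, status)` tuples, looks only at `REST.PUT.OBJECT` and `REST.GET.OBJECT` (multipart uploads and other operations would need adding), and treats the supplied records as the whole observation window. The `never_read` helper is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

WRITE_OPS = {"REST.PUT.OBJECT"}
READ_OPS = {"REST.GET.OBJECT"}


def never_read(records: Iterable[Tuple[str, str, int]]) -> List[str]:
    """Return keys that were written successfully but never read successfully."""
    written: Set[str] = set()
    read: Set[str] = set()
    for operation, key, status in records:
        if not 200 <= status < 300:
            continue  # failed requests neither create nor read an object
        if operation in WRITE_OPS:
            written.add(key)
        elif operation in READ_OPS:
            read.add(key)  # includes 206 range reads, common for Parquet
    return sorted(written - read)
```

A key in the result is only a candidate: a short log window, or a reader whose requests were not logged, can make live data look dead, so candidates should be checked against the table's metadata before anything is deleted.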

Can I analyze Athena query patterns from S3 access logs?

Yes. Every Athena scan is a sequence of S3 GETs. reCost attributes scan traffic to tables and prefixes, showing which tables are queried hardest, which queries scan far more than they need to, and where partitioning would pay off.
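Attributing scan traffic to tables can be pictured as a longest-prefix match over GET records. The sketch below is an assumption-laden illustration rather than reCost's attribution logic: the `scan_bytes_by_table` helper, the table prefixes, and the `(unattributed)` bucket are hypothetical, and the input is `(key, bytes_sent)` pairs taken from successful GET records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

UNATTRIBUTED = "(unattributed)"


def scan_bytes_by_table(
    table_prefixes: Sequence[str], reads: Iterable[Tuple[str, int]]
) -> Dict[str, int]:
    """Sum bytes sent per table prefix, matching the longest prefix first."""
    # Longest prefixes first, so a nested table wins over its parent path.
    ordered = sorted(table_prefixes, key=len, reverse=True)
    totals: Dict[str, int] = defaultdict(int)
    for key, bytes_sent in reads:
        for prefix in ordered:
            if key.startswith(prefix):
                totals[prefix] += bytes_sent
                break
        else:
            totals[UNATTRIBUTED] += bytes_sent
    return dict(totals)
```

Comparing these totals with the size of the results each query returns is one way to spot tables where scans read far more than they need.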

Does this work with Databricks workloads?

Yes. Databricks clusters read and write S3 like any other engine, and their requests are recorded in access logs. reCost shows per-workload data flows - what each job actually reads and writes - without installing anything in your workspace.

See exactly what's happening in your S3 data layer

Works with your existing AWS setup. Read-only access. No agents. No data exposure.
