Question 1

How to monitor Delta Lake table health on S3?

Accepted Answer

reCost reads the S3 access patterns around each table's _delta_log and data files. Compaction lag, small-file accumulation, abnormal writer behavior, and reader hotspots are all visible from object-level access, so no agent in Spark or Databricks is required.
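As a rough illustration of the idea (not reCost's implementation), the sketch below counts commits and small data files per Delta table from a list of write records. It assumes the access logs have already been parsed into hypothetical `(key, size_in_bytes)` pairs for PUT requests, that table roots can be inferred from keys containing `/_delta_log/`, and that 16 MiB is a reasonable "small file" threshold; all three are assumptions for the example.

```python
from collections import defaultdict

# Assumed threshold for a "small" data file; tune for your workload.
SMALL_FILE_BYTES = 16 * 1024 * 1024


def small_file_report(puts, threshold=SMALL_FILE_BYTES):
    """Summarize writes per Delta table.

    puts: iterable of (key, size_in_bytes) pairs, one per object written.
    Table roots are inferred from keys that contain "/_delta_log/".
    """
    puts = list(puts)
    # A key such as "lake/orders/_delta_log/000...0.json" implies root "lake/orders".
    roots = {key.split("/_delta_log/")[0] for key, _ in puts if "/_delta_log/" in key}

    report = defaultdict(lambda: {"commits": 0, "data_files": 0, "small_files": 0})
    for key, size in puts:
        if "/_delta_log/" in key:
            root = key.split("/_delta_log/")[0]
            if key.endswith(".json"):  # JSON commit files, not checkpoints
                report[root]["commits"] += 1
            continue
        # Assign a data file to the first known table root that prefixes it.
        root = next((r for r in roots if key.startswith(r + "/")), None)
        if root is None or not key.endswith(".parquet"):
            continue
        report[root]["data_files"] += 1
        if size < threshold:
            report[root]["small_files"] += 1
    return dict(report)
```

A table whose `small_files` count keeps growing relative to `commits` is a plausible candidate for compaction.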

Question 2

How to find orphaned Iceberg snapshots?

Accepted Answer

Orphaned snapshots and unreferenced data files show up as objects that are written once and never read again by any query engine. reCost correlates Iceberg metadata access with data file access to surface snapshot sprawl and dead storage.
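The "written once, never read" signal can be sketched in a few lines. This is a minimal example under stated assumptions, not reCost's method: the input is a hypothetical list of `(operation, key)` pairs already parsed from S3 server access logs, and only the `REST.PUT.OBJECT` and `REST.GET.OBJECT` operations are considered (multipart uploads and other request types are ignored here).

```python
from collections import defaultdict


def written_never_read(records):
    """Return keys that were PUT exactly once and never fetched with GET.

    records: iterable of (operation, key) pairs from parsed access logs.
    """
    writes = defaultdict(int)
    reads = set()
    for operation, key in records:
        if operation == "REST.PUT.OBJECT":
            writes[key] += 1
        elif operation == "REST.GET.OBJECT":
            reads.add(key)
    # Candidates for dead storage: one write, zero reads in the log window.
    return sorted(k for k, n in writes.items() if n == 1 and k not in reads)
```

The result is only a list of candidates: an object may simply not have been read within the log window, so it should be checked against the table's current metadata before anything is deleted.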

Question 3

Can I analyze Athena query patterns from S3 access logs?

Accepted Answer

Yes. Every Athena scan is a sequence of S3 GETs. reCost attributes scan traffic to tables and prefixes, showing which tables are queried hardest, which queries scan far more than they need to, and where partitioning would pay off.
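To make the attribution step concrete, here is a small sketch that totals bytes read per table prefix. It assumes GET records have already been parsed into hypothetical `(key, bytes_sent)` pairs and that a table is identified by the first `depth` path segments of the key (for example `warehouse/orders`); real layouts vary, so treat the prefix rule as an assumption.

```python
from collections import Counter


def bytes_by_table(get_records, depth=2):
    """Sum bytes read per table prefix.

    get_records: iterable of (key, bytes_sent) pairs for GET requests.
    depth: number of leading path segments that identify a table (assumed).
    """
    totals = Counter()
    for key, bytes_sent in get_records:
        table = "/".join(key.split("/")[:depth])
        totals[table] += bytes_sent
    return totals
```

Calling `bytes_by_table(records).most_common(5)` would list the five most heavily scanned prefixes. Comparing those totals with the data each query actually needs is one way to spot tables where partitioning might help.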

Question 4

Does this work with Databricks workloads?

Accepted Answer

Yes. Databricks clusters read and write S3 like any other engine, so their requests appear in the access logs. reCost shows per-workload data flows (what each job actually reads and writes) without installing anything in your workspace.

Your lakehouse, at object level

Everything your engines do to your tables is in the logs

Delta Lake & Iceberg table health

Small files & orphaned data

Athena & Spark query patterns

Pipeline read/write behavior

Databricks workload data flows

No agent in Spark. No listener in Databricks. Nothing in the query path.

See exactly what's happening in your S3 data layer