3 posts tagged with "serverless"

The Serverless Black Box: What You Lose on Databricks Serverless Compute

· 13 min read
Cazpian Engineering
Platform Engineering Team

Databricks serverless compute promises a simple deal: stop managing clusters and just run your workloads. No instance selection. No autoscaling policies. No driver sizing. Just submit your query or job and let Databricks handle the rest.

The pitch is compelling. The reality is a black box that removes not just infrastructure management, but also your ability to observe what is happening, tune how it runs, and control what it costs.

This is Part 3 of our Databricks observability series. In the previous post, we documented how system tables leave critical metrics gaps. Serverless makes those gaps dramatically worse — because on serverless, you lose even the tools that classic compute provides.

Databricks System Tables: The Observability Gap — What They Expose vs What You Actually Need for Cost Control

· 17 min read
Cazpian Engineering
Platform Engineering Team

Databricks system tables look comprehensive on paper. Sixteen tables across ten schemas. Billing, compute, jobs, queries, lineage, audit. When Databricks deprecated Overwatch and pointed everyone to system tables, the message was clear: this is the future of observability.

But if you have ever tried to answer these questions using only system tables, you already know the gap:

  • Why is my job's GC time at 15%? (System tables do not track GC time.)
  • Which stage is spilling to disk? (System tables do not track per-stage spill.)
  • Which executor is the memory bottleneck? (System tables do not track executor-level JVM metrics.)
  • How many files does my Delta table have? Is it healthy? (System tables do not track Delta table physical metrics.)
  • What did my serverless job's infrastructure look like? (System tables do not populate node_timeline for serverless.)
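The first gap above, GC time, shows why teams end up falling back to the Spark driver's REST API on classic compute: the `/api/v1/applications/{app-id}/executors` endpoint reports `totalGCTime` and `totalDuration` per executor. A minimal sketch of computing the GC fraction from that response (the payload is inlined and the values are illustrative; only the two fields we need are kept):

```python
import json

# Payload shaped like Spark's /api/v1/applications/{app-id}/executors
# response, trimmed to the fields used below. Values are illustrative.
executors_json = """
[
  {"id": "driver", "totalDuration": 0,      "totalGCTime": 0},
  {"id": "1",      "totalDuration": 600000, "totalGCTime": 90000},
  {"id": "2",      "totalDuration": 400000, "totalGCTime": 60000}
]
"""

def gc_fraction(executors):
    """Fraction of total executor task time spent in JVM garbage collection."""
    total_ms = sum(e["totalDuration"] for e in executors)
    gc_ms = sum(e["totalGCTime"] for e in executors)
    return gc_ms / total_ms if total_ms else 0.0

executors = json.loads(executors_json)
print(f"GC time: {gc_fraction(executors):.0%}")  # 150000 / 1000000 -> 15%
```

None of this is derivable from system tables alone, which is the gap the post documents.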

This post documents exactly what Databricks system tables contain — column by column — and exactly what is missing. Every claim is verifiable against Databricks' own documentation.

Zero Cold Starts: How Cazpian Compute Pools Cut Your Spark Bills in Half

· 11 min read
Cazpian Engineering
Platform Engineering Team

In Part 1 of this series, we exposed the Small Job Tax — the hidden cost of cold starts, overprovisioned clusters, and per-job infrastructure overhead that silently drains data budgets. We showed that for many teams, more than half of their Spark compute spend goes to infrastructure bootstrapping, not actual data processing.
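To make that arithmetic concrete, consider a hypothetical schedule of short jobs (the numbers here are illustrative, not taken from the post): if each run spends five minutes on cluster spin-up, init scripts, and library installs but only four minutes processing data, bootstrapping consumes over half of the billed time.

```python
# Illustrative small-job-tax arithmetic (hypothetical numbers).
bootstrap_min = 5.0   # cold start + init scripts + library installs per run
processing_min = 4.0  # actual data processing per run
runs_per_day = 50

# Share of billed time that does no data processing.
overhead_fraction = bootstrap_min / (bootstrap_min + processing_min)
wasted_min_per_day = bootstrap_min * runs_per_day

print(f"overhead fraction: {overhead_fraction:.0%}")            # ~56%
print(f"bootstrap minutes billed per day: {wasted_min_per_day:.0f}")  # 250
```

The shorter and more frequent the jobs, the larger `overhead_fraction` grows, which is why the tax hits scheduled small workloads hardest.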

The natural follow-up question: what if you could eliminate that overhead entirely?

That is exactly what Cazpian Compute Pools are built to do.