One Engine, Two Access Paths: How Arrow Flight SQL Makes a Single-Engine Lakehouse Possible
In our previous post, we broke down the five hidden costs of running two compute engines in your lakehouse — the infrastructure duplication, the cost opacity, the metadata sync bugs, the skills fragmentation, and the governance headaches. We showed that this dual-engine tax can run $40,000+ per year for a mid-size data team.
The obvious question: why not just use Spark for everything?
The honest answer has always been: because Spark cannot deliver query results to BI tools fast enough. The problem is not execution, since Spark can usually run the query. The problem is the last mile: delivering results to the client over traditional JDBC/ODBC protocols is painfully slow.
Arrow Flight SQL eliminates that bottleneck. Once it is gone, so is the primary architectural reason for running a second query engine.