Databricks Apps: Bridging the Gap Between Data and Operations

The data industry has been focused on optimizing backend infrastructure, centralizing information into Data Lakes and Warehouses to break down data silos. While the ability to report on this data has improved significantly, we’ve encountered a “last mile” problem: although data is easily accessible for analysis, delivering it into actionable, interactive user experiences has proven far more complex. This is especially true when the use case extends beyond simple dashboards, where users need the flexibility to engage with data in ways dashboards alone can’t provide.

Many organizations have relied on traditional BI tools like Power BI, Tableau, or Looker, which excel at reporting but often shoehorn end users into a limited set of interactions. These tools are built around predefined visual components, or “Lego blocks,” which work well for basic bar charts, tables, and KPIs but fall short when the business requires more complex, interactive, or domain-specific applications.

Databricks Apps represents a paradigm shift in this architecture. By embedding a serverless application runtime directly within the data platform, it allows data teams to build and deploy secure applications that offer full flexibility, without the overhead of managing external cloud infrastructure like EC2 or App Services. This removes the need to compromise on user experience, allowing for dynamic, customized, and interactive interfaces that can handle a wide range of use cases, from transactional workflows to highly specialized visualizations.

This shift also creates a new decision point for data teams: when to reach for lightweight frameworks like Streamlit or Dash to build interactive apps quickly, and when to invest in a full-stack architecture for critical, high-performance applications. The key is to choose the approach that fits the specific needs of the project.

Unified Governance and Security

One of the strongest arguments for moving applications “inside” the data platform is security. Traditionally, building an internal tool meant replicating access controls across separate systems. You might define row-level security in your Data Warehouse, but when you spin up a custom web application on an external service, you have to rebuild those same rules in the new system, often introducing “governance drift.”

Databricks Apps solves this problem with its seamless Unity Catalog integration. The application runs as a Service Principal (or as the end user, via the newly added on-behalf-of-user authorization), governed by the same access policies as any other principal on the platform. If a row-level security or column-masking policy exists on a table, the app automatically inherits it. This removes the need to replicate security rules and makes your application a true “native citizen” of the Lakehouse rather than an external satellite.
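Concretely, each app ships with an `app.yaml` describing how it runs; every query the app issues is made by its Service Principal, so Unity Catalog policies apply automatically. A minimal sketch (the resource key name is an assumption, check your app’s configured resources):

```yaml
# app.yaml — the app's runtime definition. All queries the app issues run as
# its Service Principal and are filtered by Unity Catalog policies.
command: ["streamlit", "run", "app.py"]
env:
  - name: "DATABRICKS_WAREHOUSE_ID"
    valueFrom: "sql-warehouse"   # app resource granting access to a SQL warehouse
```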

Authentication

Security is often the hardest part of internal tool development. Databricks Apps integrates with your workspace’s identity provider out-of-the-box. Users authenticate through corporate SSO before accessing the app, eliminating the need for custom login logic and ensuring consistent security across all tools.
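After SSO, the platform forwards the authenticated user’s identity to the app as HTTP request headers (the `X-Forwarded-*` names below follow the documented convention; treat the exact set as an assumption to verify for your workspace). A minimal helper to read them:

```python
# Databricks Apps forwards the SSO-authenticated user's identity as HTTP
# headers; the app never implements its own login flow.
def current_user(headers: dict) -> dict:
    """Extract the authenticated user from forwarded request headers."""
    return {
        "email": headers.get("X-Forwarded-Email"),
        # Present when on-behalf-of-user authorization is enabled for the app.
        "token": headers.get("X-Forwarded-Access-Token"),
    }

# Example: headers as a Streamlit or FastAPI app would receive them.
user = current_user({"X-Forwarded-Email": "ana@example.com"})
print(user["email"])  # ana@example.com
```

In a FastAPI backend these headers arrive on the `Request` object; in Streamlit they are available from the request context. Either way, the app simply trusts the platform’s authentication.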

Serverless

Databricks Apps also takes away the operational burden of managing compute resources. The platform handles provisioning, patching, and scaling, supporting “scale-to-zero”. This means that internal tools only use resources when they are actively in use. Tools used periodically, like month-end reporting applications, will cost nothing when idle.

Choosing the Right App Architecture

Databricks Apps supports two distinct architectural pathways, each suited to different types of applications. Choosing the wrong approach can result in unnecessary complexity and technical debt.

Path 1: The Rapid Prototyping Path (Streamlit)

This pathway leverages “UI-as-code” frameworks, enabling data engineers to rapidly convert analysis scripts into interactive web applications using Python alone. No knowledge of HTML, CSS, or JavaScript is required.

  • How it works:
    The app re-executes the script from top to bottom on every user interaction, relying on the serverless compute plane to render the UI.
  • Ideal Use Cases:
    • GenAI Interfaces: Building quick interactive chatbot interfaces for RAG (retrieval-augmented generation) applications.
    • Exploratory Data Analysis: Allowing analysts to slice and dice datasets interactively.
    • Monitoring Dashboards: Custom views for tracking model drift or data quality issues.
  • Limitations:
    While development is fast, the app can struggle with high concurrency, and managing state can be difficult. Additionally, the “rerun-on-interaction” model can create challenges with responsiveness for large datasets.

Alternative frameworks like Dash address many of these concurrency and state limitations by using a different execution model. We will revisit these advanced options and how they differ in a future post.
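The cost of the rerun-on-interaction model comes from repeating expensive work on every widget click. Streamlit mitigates this with its `st.cache_data` decorator, which memoizes results across reruns; the pure-Python sketch below uses stdlib memoization as a stand-in to show why this matters (function and table names are illustrative):

```python
from functools import lru_cache

CALLS = 0  # counts how often the "expensive query" actually runs

@lru_cache(maxsize=None)  # st.cache_data plays this role in a real Streamlit app
def load_gold_table(name: str) -> list:
    """Stand-in for an expensive warehouse query."""
    global CALLS
    CALLS += 1
    return [f"{name}-row-{i}" for i in range(3)]

# Simulate three user interactions, each of which reruns the whole script:
for _ in range(3):
    rows = load_gold_table("sales_gold")

print(CALLS)  # 1 — the query ran once, not three times
```

Without caching, every slider drag or button press would re-execute the query, which is exactly where uncached Streamlit apps become slow on large datasets.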

Path 2: The Full-Stack Path (React, FastAPI, & Lakebase)

For mission-critical applications, Databricks Apps also supports a decoupled architecture: React (or another frontend JavaScript framework) paired with FastAPI or Flask on the backend. This approach offers more granular control over the user experience and performance, enabling the creation of interactive and transactional applications.

  • How it works: The frontend runs client-side in the user’s browser, offering instant UI updates. The backend handles business logic asynchronously, allowing high concurrency.
  • Ideal Use Cases:
    • Operational Workflows: Building applications for inventory management, supply chain adjustments, or ticket bookings.
    • Human-in-the-Loop Systems: Applications where user input is integral to the process, such as data annotation or approval workflows.
    • Bespoke Visualizations: Use cases requiring more complex interactions or visualizations than what BI tools can offer (e.g., 3D models, interactive Gantt charts).
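The asynchronous backend half of this architecture can be sketched with plain asyncio: a hypothetical approval handler validates input, awaits its write, and returns, while the event loop keeps serving other users concurrently. (FastAPI route decorators are omitted; all names here are illustrative, not a real API.)

```python
import asyncio

APPROVALS: dict = {}  # stand-in for a writable table the backend updates

async def approve_request(request_id: str, approver: str) -> dict:
    """Illustrative handler body for a POST /approvals endpoint (FastAPI-style)."""
    if not request_id:
        raise ValueError("request_id is required")
    await asyncio.sleep(0)  # yield point: a real handler would await a DB write here
    APPROVALS[request_id] = approver
    return {"request_id": request_id, "status": "approved"}

async def main():
    # Many users are served concurrently on one event loop:
    return await asyncio.gather(
        approve_request("r-1", "ana@example.com"),
        approve_request("r-2", "bob@example.com"),
    )

results = asyncio.run(main())
print(results[0]["status"])  # approved
```

Because handlers yield while waiting on I/O, one backend process can interleave many requests, which is what gives this path its concurrency advantage over the rerun-per-interaction model.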

Solving the OLAP vs. OLTP Friction with Lakebase

One of the most transformative companions to Databricks Apps is Lakebase, a managed PostgreSQL database built into the platform. Data Lakes (OLAP) are optimized for large-scale analytics and high-throughput scanning. However, they are not well-suited for transactional workloads (OLTP), which require sub-millisecond lookups and low-latency updates.

Lakebase bridges this gap by enabling the following workflow:

  1. Heavy Processing in Delta Lake: Complex ETL jobs process data in Delta Lake, producing “Gold” tables.
  2. Synced Tables in Lakebase: These Gold tables are automatically synced to read-only PostgreSQL tables in Lakebase.
  3. Low-Latency Queries: Applications can query these synced tables for millisecond-level response times.
  4. Write-Back: User actions (e.g., approving a request) are written to a writable table in Lakebase and re-ingested back into the Data Lake for auditing.
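The read and write-back steps above can be sketched end-to-end. Lakebase is managed PostgreSQL, but this illustration uses `sqlite3` from the standard library purely to stay self-contained; table and column names are hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Steps 1-2: a "Gold" table synced from Delta Lake into the OLTP store (read-only
# from the app's point of view).
con.execute("CREATE TABLE gold_requests (id TEXT PRIMARY KEY, amount REAL)")
con.executemany("INSERT INTO gold_requests VALUES (?, ?)",
                [("r-1", 120.0), ("r-2", 80.5)])

# Step 3: the app serves low-latency point lookups against the synced table.
row = con.execute("SELECT amount FROM gold_requests WHERE id = ?", ("r-1",)).fetchone()

# Step 4: user actions land in a writable table, later re-ingested into the
# Data Lake for auditing.
con.execute("CREATE TABLE approvals (request_id TEXT, approver TEXT)")
con.execute("INSERT INTO approvals VALUES (?, ?)", ("r-1", "ana@example.com"))

print(row[0])  # 120.0
```

In the real architecture the point lookup hits a Postgres index rather than scanning Parquet files, which is the source of the millisecond-level response times.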

This architecture eliminates the need for external Reverse ETL tools, keeping everything within Databricks’ secure, governed environment. It also significantly improves application performance without sacrificing governance or security.

Deployment and Lifecycle

Building and deploying Databricks Apps is streamlined with Databricks Asset Bundles (DABs). These bundles define the infrastructure, permissions, and code in a YAML configuration, which ensures consistency across environments.
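A minimal bundle definition might look like the following sketch (names, paths, and the exact `apps` resource schema are assumptions to verify against the DABs reference):

```yaml
# databricks.yml — defines the app and its environments as code
bundle:
  name: approvals-app

resources:
  apps:
    approvals_app:
      name: approvals-app
      source_code_path: ./app   # folder containing app.yaml and the app code

targets:
  dev:
    mode: development
    workspace:
      host: https://<your-workspace>.cloud.databricks.com
```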

  • Development: Developers build locally using VS Code and the Databricks CLI.
  • CI/CD: Continuous integration and deployment are automated through Git-based workflows.
  • Production: Consistent, version-controlled deployments ensure that schema migrations and permission grants are handled smoothly.

By defining the entire stack as code, teams can streamline the deployment process and ensure a smooth transition from prototype to production.

Conclusion

Databricks Apps is transforming the way organizations build and deploy data applications. By embedding a serverless runtime within the data platform, it eliminates the need for external infrastructure and allows teams to build interactive, user-facing applications that are governed by the same policies as their underlying data.

For architects and engineers, the key to success is understanding which architectural pattern to apply to your use case. Use Streamlit and similar frameworks to rapidly prototype analytical tools and dashboards. When you need a more robust, transactional application, leverage React, FastAPI, and Lakebase to build scalable, enterprise-grade solutions. By aligning your architecture with the needs of the application, teams can seamlessly close the gap between data analytics and operational systems, without compromising on security or governance.
