As in any area of technological development and design, agile has made its way into data warehouse design. Over the past few decades, the agile mindset has gained momentum, becoming the standard in software and technology development. The well-known Agile Manifesto has twelve principles that can be summarized as: Individuals and interactions over processes and [...]
Blogs
On June 10th, 2024, Apache Iceberg tables became generally available on Snowflake. This blog will focus on creating Iceberg tables on Snowflake and illustrate how this is integrated with dbt. What are Iceberg Tables? Iceberg tables utilize the Apache Iceberg open table format, which acts as an abstraction layer over data files stored in open [...]
In a previous DataTalks, Wouter Pardon talked about how to implement CI/CD for Azure Data Factory and SQL Server, focusing on making deployments easier and improving the quality of data pipelines. In this post, he dives deeper into unit testing for Azure Data Factory (ADF). It explains how Azure DevOps can help you run [...]
Managing and extracting actionable insights from documents can be a time-consuming and error-prone task, especially when it comes to complex technical documents. These documents are often filled with critical information that needs to be efficiently analyzed and organized for operational decisions. That’s where Snowflake’s Document AI comes in. Snowflake’s advanced AI-powered platform is revolutionizing the [...]
Managing infrastructure manually can be both time-consuming and error-prone. Setting up components like servers and databases through a graphical user interface often leads to inefficiencies, especially at the enterprise level. This becomes even more challenging when dealing with multiple environments for development, user acceptance testing, validation, and production. What is Infrastructure-as-Code? Let’s introduce Infrastructure-as-Code, a [...]
dbt, the Data Build Tool, has become the foundation for organizations managing data transformations. In our previous blog about dbt Mesh, we unraveled the potential of dbt Mesh and its ability to converge different dbt functions, which enables scalability and security within and across projects. In this blog, we will discuss its evolution [...]
In today’s data-driven world, the ability to harness advanced language models is crucial for organizations aiming to leverage their data for actionable insights. While many organizations excel at managing and analyzing structured data, they often overlook a treasure trove of information contained in unstructured data. This unstructured data—ranging from emails and social media posts to [...]
We’ve already established the synergy of dbt & Databricks. However, adding data warehouse automation to the mix might be the missing link in the story. The combination of dbt & Databricks provides a robust solution for streamlining data workflows and enhancing data lakehouse architectures. It leverages Databricks’ versatility in data processing and machine learning with [...]
Organizations are increasingly reliant on data warehouses to store, manage, and analyze vast amounts of information. However, the success of a data warehouse hinges not only on the quality of the data but also on the effectiveness of its underlying data model. Effective data modeling lays the foundation for a robust data warehouse architecture, enabling [...]
What is dbt Power User? The dbt Power User extension accelerates dbt development within VS Code by seamlessly integrating dbt functionalities with the editor. After installation, only VS Code is required for dbt development. With the extension, users can run model SQL code or segments of it, compile Jinja into SQL and take advantage of [...]