
Managing Snowflake Infrastructure-as-Code

  • Writer: Digital Hive
  • Apr 10
  • 4 min read


Modern data engineering teams typically rely on tools like dbt to manage data transformations through version-controlled, CI/CD-integrated pipelines. However, the underlying Snowflake infrastructure (databases, schemas, virtual warehouses, and role-based access control, or RBAC) is frequently provisioned using imperative SQL scripts or manual UI configuration.


While many engineering teams use third-party Infrastructure-as-Code (IaC) solutions such as Terraform or Pulumi to manage these objects, these tools introduce their own operational overhead: external state file management, cross-platform authentication, and proficiency in domain-specific languages.


Snowflake Declarative Configuration Management (DCM) addresses this gap by providing a native, integrated IaC framework. This guide details how to implement DCM to manage a Medallion Architecture and outlines the architectural boundary between DCM and dbt.

Declarative State Management vs. Imperative Scripting

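The traditional approach to Snowflake management relies on imperative execution (CREATE OR REPLACE, ALTER). Imperative scripts are inherently brittle because they require the developer to explicitly define the sequence of state changes, so a script written against one starting state can fail or produce unintended changes when run against another.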


DCM shifts this to a declarative model. You define the desired end state of the Snowflake environment in local project files. During execution, the Snowflake CLI evaluates the delta between your local definitions and the current state of the target Snowflake account. It then generates and executes the necessary Data Definition Language (DDL) operations (CREATE, ALTER, or DROP) to reach the target state.


This ensures idempotency. Repeated deployments of the same DCM project will yield no changes if the target environment is already synchronized with the codebase.
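
The diff-then-apply idea can be illustrated with a small in-memory model. This is only a sketch and not how DCM is implemented: the functions plan and apply are hypothetical names, the state is reduced to a set of schema names, and only CREATE and DROP are modelled.

```python
def plan(desired: set[str], current: set[str]) -> list[str]:
    """Return the DDL needed to move `current` to `desired` (schemas only)."""
    creates = [f"CREATE SCHEMA {name};" for name in sorted(desired - current)]
    drops = [f"DROP SCHEMA {name};" for name in sorted(current - desired)]
    return creates + drops


def apply(statements: list[str], current: set[str]) -> set[str]:
    """Apply planned statements to an in-memory copy of the current state."""
    state = set(current)
    for stmt in statements:
        # Each statement has the shape "<VERB> SCHEMA <name>;"
        verb, _, name = stmt.rstrip(";").split(" ", 2)
        if verb == "CREATE":
            state.add(name)
        else:
            state.discard(name)
    return state


# Hypothetical example state: one schema is missing, one is unmanaged.
desired = {"ANALYTICS_DEV.L0_RAW", "ANALYTICS_DEV.L1_BRONZE"}
current = {"ANALYTICS_DEV.L0_RAW", "ANALYTICS_DEV.SCRATCH"}

first_plan = plan(desired, current)
synced = apply(first_plan, current)

print(first_plan)
print(plan(desired, synced))  # prints [] because the states already match
```

The second call to plan returns an empty list, which is the idempotency property described above: once the target matches the definitions, a repeated deployment has nothing to do.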


Architectural Boundary: DCM and dbt

Effective data platform management requires a strict separation between infrastructure provisioning and data modelling:


Infrastructure (DCM): DCM provisions the databases, schema structures (L0_RAW through L3_GOLD), virtual warehouses, roles, and grants.


Logic (dbt): dbt executes DML to materialize business logic into the tables and views within the DCM-established schemas.


This separation ensures your transformation layer operates within a secure, audited, and version-controlled environment without infrastructure and logic competing for the same state.


Implementing a Medallion Architecture with DCM

The following steps demonstrate how to initialize a DCM project and provision a multi-environment Medallion Architecture.


Prerequisites

  • Python environment.

  • Snowflake CLI installed (pip install snowflake-cli; snowflake-cli-labs is the older, deprecated package name).

  • CLI connection configured in ~/.snowflake/config.toml (e.g., utilizing externalbrowser for SSO).


Step 1: Workspace Initialization

DCM enforces a standardized directory structure. The CLI specifically targets the sources/definitions/ directory for SQL definitions.


# Initialize project directories
mkdir snowflake_dcm && cd snowflake_dcm
mkdir -p dcm/sources/definitions

# Initialize virtual environment and install the Snowflake CLI
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install snowflake-cli

# Enable the DCM preview feature flag
export SNOWFLAKE_CLI_FEATURES_ENABLE_SNOWFLAKE_PROJECTS=true

Step 2: Manifest Configuration

The manifest.yml file dictates project metadata, deployment targets, and templating variables. It maps local execution contexts to persistent DCM Project objects within Snowflake.


File: dcm/manifest.yml


manifest_version: 2 
type: DCM_PROJECT 
default_target: dev 
  
targets: 
  dev: 
    project_name: DCM_ADMIN.PROJECTS.DCM_DEV 
    templating_config: dev_config 
  prod: 
    project_name: DCM_ADMIN.PROJECTS.DCM_PROD 
    templating_config: prod_config 

templating:
  configurations: 
    dev_config: 
      db_name: ANALYTICS_DEV 
    prod_config: 
      db_name: ANALYTICS_PROD 

Note: when using VS Code, the integrated YAML language server may incorrectly validate the manifest file against the Snowflake Native App schema. To resolve this, change the file's Language Mode in the status bar from 'Snowflake Application Package Manifest' to 'YAML'.


Step 3: Declarative Object Definition

Infrastructure is defined using the DEFINE keyword rather than CREATE. Jinja2 templating injects the environment-specific variables defined in the manifest's templating configurations.


File: dcm/sources/definitions/medallion.sql


-- Provision target database 
DEFINE DATABASE {{ db_name }}; 

-- Provision standard Medallion schemas 
DEFINE SCHEMA {{ db_name }}.L0_RAW; 
DEFINE SCHEMA {{ db_name }}.L1_BRONZE; 
DEFINE SCHEMA {{ db_name }}.L2_SILVER; 
DEFINE SCHEMA {{ db_name }}.L3_GOLD; 

-- Provision base ingestion table 
DEFINE TABLE {{ db_name }}.L0_RAW.HEARTBEAT ( 
    device_id VARCHAR, 
    payload VARIANT, 
    ingested_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP() 
); 

Step 4: Plan and Deploy Lifecycle

The DCM deployment lifecycle consists of two distinct operations: validation and execution.


State Diff Generation (Plan)

The plan command performs a dry run, outputting the computed DDL required to align the target Snowflake environment with your local definitions.

cd dcm 
snow dcm plan --target dev 

Execution (Deploy)

Once the deployment plan is verified, the deploy command applies the computed DDL statements to the Snowflake account.

snow dcm deploy --target dev 

Environment Promotion

To promote the infrastructure to production, run the deployment against the prod target. The Snowflake CLI resolves the prod_config templating variables (e.g., db_name becomes ANALYTICS_PROD) and provisions the isolated environment.

snow dcm deploy --target prod 
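
In a CI pipeline, the plan and deploy commands above could be wrapped in a small script so that deploy never runs when plan fails. The following is a sketch under assumptions: dcm_command and promote are hypothetical helper names, the project directory is assumed to be dcm as in Step 1, and only the snow dcm plan and snow dcm deploy invocations with --target shown in this guide are used.

```python
import subprocess
from typing import Callable


def dcm_command(action: str, target: str) -> list[str]:
    """Build the CLI invocation for a DCM action against a manifest target."""
    if action not in ("plan", "deploy"):
        raise ValueError(f"unsupported DCM action: {action}")
    return ["snow", "dcm", action, "--target", target]


def promote(
    target: str,
    run: Callable[..., object] = subprocess.run,
    project_dir: str = "dcm",
) -> list[list[str]]:
    """Run plan, then deploy, for one target; stop at the first failure."""
    executed = []
    for action in ("plan", "deploy"):
        cmd = dcm_command(action, target)
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so a failed plan prevents the deploy.
        run(cmd, cwd=project_dir, check=True)
        executed.append(cmd)
    return executed
```

Calling promote("prod") from the project root would run both commands inside the dcm directory. The run parameter exists so the sequence can be exercised with a fake runner in tests. In practice, a human review of the plan output would usually sit between the two steps.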

Strict State Enforcement and Idempotency

It is vital to understand the operational impact of declarative IaC. The local repository serves as the absolute source of truth.


If an engineer manually executes a CREATE TABLE statement within the targeted Snowflake database (out-of-band change), that table is fundamentally orphaned from the IaC state. Upon the next execution of snow dcm deploy, the DCM engine will identify the unmanaged object and automatically issue a DROP statement to enforce the exact state defined in the codebase.


This behavior eliminates environment drift but strictly mandates that all infrastructure changes proceed through the established GitOps workflow.
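
Because a deployment may drop unmanaged objects, one possible safeguard is to scan the plan output for DROP statements before deploying. The sketch below assumes the plan output can be captured as text with one DDL statement per line. The real output format of snow dcm plan may differ, and find_drops and require_no_unapproved_drops are hypothetical helpers, not part of the Snowflake CLI.

```python
import re

# Matches lines such as "DROP TABLE db.schema.name;" (assumed plan format).
DROP_PATTERN = re.compile(
    r"^\s*DROP\s+(\w+)\s+(?:IF\s+EXISTS\s+)?([\w.]+)", re.IGNORECASE
)


def find_drops(plan_text: str) -> list[tuple[str, str]]:
    """Return (object_type, object_name) for every DROP line in the plan."""
    drops = []
    for line in plan_text.splitlines():
        match = DROP_PATTERN.match(line)
        if match:
            drops.append((match.group(1).upper(), match.group(2)))
    return drops


def require_no_unapproved_drops(
    plan_text: str, approved: frozenset[str] = frozenset()
) -> None:
    """Raise if the plan would drop any object not explicitly approved."""
    unapproved = [name for _, name in find_drops(plan_text) if name not in approved]
    if unapproved:
        raise RuntimeError(f"plan drops unapproved objects: {unapproved}")


# Hypothetical captured plan output, for illustration only.
sample_plan = (
    "CREATE SCHEMA ANALYTICS_DEV.L1_BRONZE;\n"
    "DROP TABLE ANALYTICS_DEV.L0_RAW.SCRATCH;\n"
)
print(find_drops(sample_plan))
```

A pipeline could call require_no_unapproved_drops on the plan output and fail the build when an engineer's out-of-band table is about to be removed, which surfaces the drift for review instead of resolving it silently.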


Future DataOps

Looking forward, DCM could serve as the foundation for a unified Snowflake DataOps platform, potentially integrating with Snowflake's native Git integration for automated deployments and with Dynamic Tables for declarative orchestration. Adopting DCM now positions your stack for a code-centric Snowflake ecosystem.


Daan Vandenreyt, Data Engineer at Digital Hive.


Written by Aslan Hattukai

Data Engineer




 
 
 
