Series Overview
This is Part 4 of our series on building a hybrid data platform. If you’re joining mid-series, here’s an overview of the full series:
- Part 1: From Something-with-Data to Data-as-a-Product - Medallion architecture and business transformation
- Part 2: Infrastructure as Code Foundation with Terraform - IaC patterns and module design
- Part 3: Domain-Driven Design for Data Engineering - Source system separation and Conway’s Law
- Part 4: Hybrid Connectivity Architecture - Integration runtimes and Azure Relay Bridge
- Part 5: Extract and Load Pipeline Evolution - Four-pipeline pattern and deletion detection
- Part 6: Data Transformation Architecture - Dual-track approach with dbt and analyst SQL
- Part 7: CI/CD as Organizational Strategy - Selective deployment and complexity placement
- Part 8: DATEV Integration Patterns - Hardcoding Clients and Embracing Failure
- Part 9: Integrating Product Telemetry - Integrating OpenTelemetry Into Unified Analytics
- Part 10: RevOps Funnel Analytics - Building Bowtie GTM Metrics
Introduction
In my previous articles, I walked you through our hybrid data platform architecture, our Infrastructure as Code foundation, and our domain-driven pipeline organization. Today, I want to dive into what makes this hybrid architecture actually work: the connectivity layer that securely bridges our on-premises databases with cloud-based processing.
One of the biggest surprises in implementing this architecture was discovering what the ADF self-hosted integration runtime actually does. It’s not just a connectivity proxy but a full execution engine that processes data locally. This fundamental insight shapes everything about how we approach hybrid connectivity and has significant implications for the domain separation strategies I discussed in Part 3.
The other major technical challenge was getting dbt containers running in Azure Container Groups to connect to on-premises databases. Unlike Azure Data Factory, containers can’t use integration runtimes, so we had to implement a different approach using Azure Relay Bridge in a sidecar pattern with some interesting coordination challenges.
The Self-Hosted Integration Runtime: More Than Connectivity
Let me start with what the self-hosted integration runtime actually does, because this was the biggest revelation during our implementation, and it fundamentally changes how you think about hybrid data architectures built on the Azure Data Factory service.
Local Execution Engine, Not Just a Proxy
When Azure Data Factory executes a copy activity through a self-hosted integration runtime, the actual data processing happens on the integration runtime machine, not in Azure. The copy activity you see in your pipeline (including all the field mappings, type conversions, and upsert logic) executes locally on-premises.
Here’s what actually happens when our Salesforce to SQL Server copy activity runs:
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {"name": "Id", "type": "String", "physicalType": "id"},
"sink": {"name": "AC_SFID", "type": "String", "physicalType": "nvarchar"}
},
{
"source": {"name": "BillingCountry", "type": "String", "physicalType": "string"},
"sink": {"name": "AC_BillingCountry", "type": "String", "physicalType": "nvarchar"}
}
],
"typeConversion": true,
"typeConversionSettings": {
"allowDataTruncation": true,
"treatBooleanAsNumber": false
}
}
This entire transformation executes on our on-premises integration runtime machine. The data flow is:
Salesforce → Integration Runtime (local processing) → On-premises SQL Server
NOT: Salesforce → Azure → Integration Runtime → SQL Server
This means all the complex processing we see in our copy activities happens locally with on-premises network speeds.
Domain Separation Implications
This local execution model has important implications for the domain separation strategies I discussed in Part 3. Since the integration runtime is the physical execution engine, each copy activity can use exactly one integration runtime. This creates natural boundaries for domain organization.
In our architecture, each subsidiary gets its own integration runtime because they’re legally separate entities. But here’s the key constraint: if you need to copy data between databases served by different integration runtimes, you can’t do it in a single copy activity.
This isn’t just a theoretical limitation. In holding-company scenarios where you need to aggregate subsidiary data, you have to introduce an intermediate layer. The pattern becomes:
- Copy from Subsidiary A database to Azure Blob Storage
- Copy from Azure Blob Storage to Subsidiary B database
This constraint actually reinforces good domain separation – it forces explicit data contracts between domains rather than allowing implicit cross-database joins that could create hidden dependencies.
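The two-hop pattern can be sketched end to end. The following is a minimal local illustration, using temporary directories as stand-ins for the subsidiary databases and the Azure Blob Storage staging layer; the directory names and the `extract.csv` file are hypothetical. In ADF, each hop is a separate copy activity bound to its own integration runtime.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins for the three storage layers involved in the two-hop copy.
subsidiary_a="$(mktemp -d)"   # database served by runtime A
staging="$(mktemp -d)"        # Azure Blob Storage, reachable from both runtimes
subsidiary_b="$(mktemp -d)"   # database served by runtime B

echo "id,billing_country" >  "$subsidiary_a/extract.csv"
echo "001,DE"             >> "$subsidiary_a/extract.csv"

# Hop 1: runtime A copies from subsidiary A out to the staging layer.
cp "$subsidiary_a/extract.csv" "$staging/extract.csv"

# Hop 2: runtime B copies from the staging layer into subsidiary B.
# No single copy activity could span both runtimes directly.
cp "$staging/extract.csv" "$subsidiary_b/extract.csv"

cat "$subsidiary_b/extract.csv"
```

The staging file doubles as the explicit data contract: subsidiary B consumes whatever schema lands in the staging layer, never subsidiary A’s database directly.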
Shared Runtime Architecture: One Engine, Multiple Domains
Our implementation uses a shared integration runtime model where multiple source system ADFs can use the same physical runtime while maintaining their domain separation.
The Hub Model
Each environment has one “infrastructure” Data Factory that owns the self-hosted integration runtime:
resource "azurerm_data_factory_integration_runtime_self_hosted" "integration_runtime" {
name = var.SELF_HOSTED_SHARED_RUNTIME_NAME
data_factory_id = azurerm_data_factory.adf_infra_shared.id
}
Then, each source system Data Factory creates a linked integration runtime that references the shared one through RBAC authorization:
resource "azurerm_data_factory_integration_runtime_self_hosted" "linked_self_hosted_runtime" {
name = local.linked_self_hosted_runtime_name
data_factory_id = azurerm_data_factory.adf_elt_pipeline.id
rbac_authorization {
resource_id = data.azapi_resource.self_hosted_runtime_infra_shared.id
}
}
This sharing model gives us operational efficiency (one runtime to maintain per environment) while preserving domain boundaries (separate Data Factories and deployment pipelines for each source system).
Environment Strategy: Pragmatic Sharing
We made a pragmatic decision about environment isolation. Each subsidiary has:
- Production: Dedicated integration runtime for complete isolation
- Development: Shared runtime between play and test environments for cost efficiency
This balances cost and operational overhead against isolation requirements. Legal entities need hard boundaries in production, but development environments can share infrastructure for efficiency.
The Container Challenge: Why Integration Runtimes Aren’t Enough
While Azure Data Factory can use integration runtimes, dbt running in containers cannot. This forced us to implement a different connectivity approach for our transformation layer that works within the constraints of containerized execution.
Container Networking Foundation
Before diving into the solution, it’s important to understand that containers within an Azure Container Instance group share localhost networking. This means containers can communicate with each other via 127.0.0.1 without any additional network configuration – a key enabler for our sidecar pattern.
Azure Relay Bridge: TCP Tunneling Through the Cloud
Our solution uses Azure Relay Bridge, an open-source tool from Microsoft that creates secure TCP tunnels through Azure Relay Service. The architecture implements a sidecar pattern where both dbt and Azure Relay Bridge run in the same container group.
Azure Relay Bridge leverages Azure Relay’s Hybrid Connections to create secure TCP tunnels:
On-premises side: An Azure Relay Bridge instance runs as a “listener” that establishes an outbound WebSocket connection to Azure Relay Service.
Container side: Another Azure Relay Bridge instance runs as a “sender” that connects through Azure Relay to reach the on-premises listener.
Data flow: dbt container → Azure Relay Bridge (sidecar, localhost) → Azure Relay Service → Azure Relay Bridge (on-prem) → SQL Server
The beautiful aspect of this approach is that it requires only outbound HTTPS connections from both sides. No inbound firewall rules, no VPN configuration, no network infrastructure changes.
Container Coordination: Solving the Job vs. Daemon Problem
The trickiest part of our implementation was coordinating between two fundamentally different container patterns: dbt (job-oriented, runs and exits) and Azure Relay Bridge (daemon-oriented, runs indefinitely with no shutdown mechanism).
The Coordination Challenge
We had two problems to solve:
- Startup coordination: How to ensure Azure Relay Bridge is ready before dbt tries to connect
- Shutdown coordination: How to terminate the daemon when the job completes (since Azure Relay Bridge has no shutdown mechanism)
Shared Storage Foundation
Our solution uses shared Azure Storage as a coordination mechanism between containers:
volume {
name = var.DBT_CONTAINER_NAME
mount_path = "/aci/groups"
read_only = false
share_name = data.azurerm_storage_share.containergroups_share.name
storage_account_name = data.azurerm_storage_account.stacc_infra_shared.name
storage_account_key = data.azurerm_storage_account.stacc_infra_shared.primary_access_key
}
Both containers mount the same Azure Storage share and use a simple file as a semaphore for coordination.
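The protocol is easy to reproduce outside Azure. Here is a minimal local sketch of the semaphore handshake, with a temporary file standing in for the mounted storage share path and a background subshell playing the azbridge role; the timings and file name are illustrative only.

```shell
#!/usr/bin/env bash
set -euo pipefail

semaphore="$(mktemp -u)"   # stands in for /aci/groups/$ENV_GROUP_NAME

# "azbridge" side: simulate establishing the relay tunnel, then signal
# readiness by creating the semaphore file.
(
  sleep 1                  # stand-in for the WebSocket connection setup
  touch "$semaphore"
) &

# "dbt" side: poll for the semaphore before starting work.
until [ -f "$semaphore" ]; do
  echo "Waiting for azbridge at $semaphore"
  sleep 0.2
done
echo "Found azbridge, starting dbt"

# ... dbt run would happen here ...

# Signal completion by removing the semaphore; in Azure, this is what
# makes the azbridge container's liveness probe start failing.
rm "$semaphore"
wait
```

Both sides only ever touch the shared file, which is why the same mechanism works unchanged when the file lives on an Azure Storage share mounted into both containers.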
Complete Coordination Sequence
Here’s the full timeline of how container coordination works:
T0: Container Group Starts Both containers start simultaneously, but dbt waits while Azure Bridge establishes connectivity.
T1: Azure Bridge Configuration On startup the Azure Bridge container runs a script to rewrite its configuration from templates, substituting environment variables for the specific relay connection:
---
GatewayPorts : no
LocalForward:
- RelayName: ${ENV_INFRA_SHARED_RELAY_HYCO_NAME}
BindAddress: 127.0.10.1
BindPort: 1433
HostName: ${ENV_DWH_DATABASE_HOST}
ConnectionString: ${ENV_AZBRIDGE_LOCAL_CSTRING}
PortName: db
# Inspired by https://starkandwayne.com/blog/bashing-your-yaml/
source /dev/stdin <<<"$(
echo 'cat <<EOF >profile.yml'
cat _profile.$1.yml
echo EOF
)"
This creates the Azure Relay Bridge configuration file with the connection string and local forwarding rules.
./rewrite_profile.sh $ENV_AZBRIDGE_TYPE
T2: Azure Bridge Establishes Connection Azure Bridge starts and establishes the outbound WebSocket connection to Azure Relay Service:
/usr/share/azbridge/azbridge -f profile.yml &
T3: Azure Bridge Signals Readiness Once the relay connection is established, Azure Bridge creates the semaphore file:
echo "Signaling start to dbt at /aci/groups/$ENV_GROUP_NAME"
touch /aci/groups/$ENV_GROUP_NAME
T4: dbt Waits for Signal Meanwhile, dbt polls for the semaphore file:
until [ -f /aci/groups/$ENV_GROUP_NAME ]
do
echo "Waiting for azbridge at /aci/groups/$ENV_GROUP_NAME"
sleep 5
done
echo "Found azbridge, starting dbt"
T5: dbt Host File Redirection Before connecting to the database, dbt redirects the database hostname to localhost where Azure Bridge is listening:
echo "127.0.10.1 $ENV_DWH_DATABASE_HOST" >> /etc/hosts
T6: dbt Execution dbt runs its models, connecting to what it thinks is the database hostname but actually reaches Azure Bridge on localhost:
dbt run --select $ENV_DBT_MODEL
T7: dbt Signals Completion When dbt finishes (success or failure), it removes the semaphore file to signal completion:
status=$?
echo "Signaling done to azbridge"
rm /aci/groups/$ENV_GROUP_NAME
exit $status
T8: Liveness Probe Detects Completion Azure Container Instance continuously monitors the semaphore file through a liveness probe:
liveness_probe {
exec = ["cat","/aci/groups/${local.dbt_container_instance_name}"]
initial_delay_seconds = 10
period_seconds = 5
}
When dbt removes the file, the liveness probe fails.
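The probe’s behavior comes down to `cat`’s exit code: it succeeds while the semaphore file exists and fails once the file is removed. A quick local check of that contract (the file path is a stand-in for the real `/aci/groups/...` path):

```shell
#!/usr/bin/env bash

probe_file="$(mktemp)"    # stands in for /aci/groups/<group name>

# The probe command ACI runs: exit 0 while the file exists.
probe() { cat "$probe_file" > /dev/null 2>&1; }

probe && echo "probe healthy"   # file present -> container kept alive

rm "$probe_file"                # dbt signals completion
probe || echo "probe failed"    # file gone -> ACI terminates azbridge
```

Because the probe runs every 5 seconds (`period_seconds = 5`), Azure notices the removed file within seconds of dbt finishing.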
T9: Container Group Termination The failed liveness probe causes Azure to terminate the Azure Bridge container. Since both containers have now exited and the restart policy is “Never”, the entire container group stops cleanly.
Error Handling
The coordination mechanism handles error scenarios gracefully:
# check exit status from dbt
[ $status -eq 0 ] && echo "dbt was successful" || echo "dbt failed"
# on error sleep to allow azbridge to terminate
[ $status -ne 0 ] && sleep 25
echo "Exiting with $status"
exit $status
If dbt fails, it waits 25 seconds before exiting to give Azure Bridge time to terminate cleanly, ensuring the container group shuts down properly regardless of job success or failure.
Security Through Hybrid Architecture Design
Both connectivity approaches prioritize security through architectural design rather than complex network configurations, which reduces operational overhead while improving security posture.
Outbound-Only Security Model
Neither integration runtimes nor Azure Relay Bridge require inbound firewall rules. Both establish outbound HTTPS connections to Azure services, then relay traffic through those established channels.
For the integration runtime:
- Outbound connection to Azure Data Factory for orchestration and job queuing
- Local data processing keeps sensitive data on-premises
- No inbound ports required on corporate firewall
For Azure Relay Bridge:
- Outbound WebSocket connection to Azure Relay Service
- End-to-end encryption through the relay tunnel
- No direct exposure of on-premises endpoints
Hybrid-Specific Security Considerations
The hybrid nature of our architecture creates specific security concerns that we address through design:
Data Locality: Integration runtime processing ensures that sensitive data transformations happen on-premises with only metadata and orchestration information flowing to Azure.
Tunnel Encryption: Azure Relay Bridge provides end-to-end encryption through the relay tunnel, so data in transit is protected even though it passes through Azure infrastructure.
Credential Isolation: Database credentials never leave the on-premises environment. Azure Relay Bridge on-premises side has direct access to databases, while the cloud side only has tunnel access.
Identity and Access Management
We use Azure managed identities wherever possible to eliminate credential management overhead:
identity {
type = "SystemAssigned"
}
Secrets that must be shared (like relay connection strings) are stored in Azure Key Vault and accessed through managed identity authentication:
secure_environment_variables = {
"ENV_DWH_DATABASE_PASSWD" = "${data.azurerm_key_vault_secret.dwh_database_secret.value}"
"ENV_AZBRIDGE_LOCAL_CSTRING" = "${data.azurerm_key_vault_secret.arhc_local_secret.value}"
}
This approach minimizes the attack surface by avoiding long-lived credentials in configuration files and reducing the number of secrets that require manual rotation.
Container Group Architecture in Practice
Each source system’s container group implements the sidecar pattern with domain-specific configuration:
resource "azurerm_container_group" "container_instance" {
name = local.dbt_container_instance_name
location = var.LOCATION
resource_group_name = azurerm_resource_group.rg_elt_pipeline.name
restart_policy = "Never" # Critical for proper shutdown coordination
container {
name = var.DBT_CONTAINER_NAME
image = "${data.azurerm_container_registry.cr_infra_shared.login_server}/${var.DBT_CONTAINER_NAME}:${var.DBT_CONTAINER_TAG}"
environment_variables = {
"ENV_GROUP_NAME" = "${local.dbt_container_instance_name}"
"ENV_DBT_MODEL" = "${var.NAME}" # Source system identifier
"ENV_DWH_DATABASE_HOST" = "${local.dbt_db_host}"
"ENV_DWH_DATABASE_NAME" = "${local.dbt_db_database}"
}
}
container {
name = var.AZBRIDGE_CONTAINER_NAME
image = "${data.azurerm_container_registry.cr_infra_shared.login_server}/${var.AZBRIDGE_CONTAINER_NAME}:${var.AZBRIDGE_CONTAINER_TAG}"
environment_variables = {
"ENV_GROUP_NAME" = "${local.dbt_container_instance_name}"
"ENV_AZBRIDGE_TYPE" = "LocalForward"
"ENV_DWH_DATABASE_HOST" = "${var.DWH_DATABASE_HOST}"
"ENV_INFRA_SHARED_RELAY_HYCO_NAME" = "${local.infra_shared_relay_hyco_name}"
}
liveness_probe {
exec = ["cat","/aci/groups/${local.dbt_container_instance_name}"]
initial_delay_seconds = 10
period_seconds = 5
}
}
}
The key architectural decisions here are:
- restart_policy = “Never”: Ensures clean shutdown after job completion
- Shared environment variables: Both containers use the same group name for coordination
- Liveness probe: Monitors the coordination file to detect job completion
- Domain-specific model selection: Each container group runs models for its source system
Operational Lessons and Performance Characteristics
Running this hybrid architecture in production has taught us several important lessons about operational considerations and performance implications.
Integration Runtime Performance
The local execution model of integration runtimes provides excellent performance for data movement between on-premises systems. We see near-native network speeds for database-to-database copies, since the data never leaves our local network for processing.
Container Coordination Reliability
The file-based coordination mechanism works reliably, but we’ve learned to monitor one edge case: if Azure Relay Bridge can’t establish its connection (network issues, configuration problems), the semaphore file is never created and dbt waits indefinitely.
Shared Runtime Maintenance
The shared integration runtime creates operational efficiency but requires coordinated maintenance. When we need to update the runtime or the underlying machine, we schedule maintenance windows during low-activity periods.
Evolution and Future Considerations
This connectivity architecture serves our current needs well, but we’ve designed it to evolve with organizational and technical growth.
Scaling Beyond Single Runtimes
As data volumes and domain complexity grow, we may need to move beyond the shared runtime model. The template-based approach makes it straightforward to deploy dedicated runtimes for specific domains when performance or isolation requirements change.
The decision point will likely be performance rather than organizational – when shared runtime capacity becomes a bottleneck, we can easily instantiate dedicated runtimes using the same Terraform modules.
Container Orchestration Evolution
Our current file-based coordination works well for the two-container sidecar pattern, but more complex scenarios might benefit from proper container orchestration platforms like Azure Container Apps.
If we need to add additional containers (monitoring, specialized connectors) or implement more sophisticated coordination patterns, we could evolve toward using more advanced orchestration while keeping the same basic architectural patterns.
Lessons Learned: Pragmatic Hybrid Connectivity
Implementing this connectivity architecture taught us several important lessons about building pragmatic hybrid systems.
Embrace Constraints as Design Features
The integration runtime’s local execution model initially seemed like a limitation for domain separation. Instead of fighting it, we embraced it as a design constraint that actually reinforces good domain boundaries and provides excellent performance characteristics.
Similarly, the container coordination challenge led us to a solution that’s simpler and more reliable than complex orchestration would have been.
Shared Infrastructure Enables Domain Autonomy
The shared integration runtime approach demonstrates that you can share infrastructure while maintaining domain independence. The key is ensuring that sharing happens at the infrastructure layer (physical resources) while maintaining separation at the operational layer (deployments, ownership, governance).
Security Through Architecture Beats Configuration Complexity
Both connectivity approaches achieve security through architectural design (outbound-only connections, data locality, managed identities) rather than complex network configurations. This makes the system more secure by default and reduces operational overhead.
Simple Coordination Mechanisms Work at Scale
The file-based semaphore approach might seem crude compared to sophisticated orchestration systems, but it’s proven reliable, debuggable, and maintainable. Sometimes the simplest solution that works is the best solution.
Conclusion and Looking Forward
Hybrid connectivity is often seen as a necessary complexity for organizations that can’t move everything to the cloud immediately. Our experience suggests it can be an architectural strength when implemented thoughtfully.
The self-hosted integration runtime’s local execution model provides excellent performance and natural domain boundaries. The Azure Relay Bridge sidecar pattern solves the container connectivity challenge elegantly without requiring complex network infrastructure. Both approaches prioritize security and simplicity over configuration complexity.
Most importantly, this connectivity architecture enables the domain-driven organizational evolution I discussed throughout this blog series. Source system boundaries are clear but not rigid. Infrastructure is shared but governance is distributed. Teams can operate independently while benefiting from shared operational expertise.
In my next article, I’ll dive into our extract and load strategies. I’ll share how we solved the deletion detection problem with our four-pipeline pattern, explore different API pagination approaches, and discuss the trade-offs between data freshness and consistency that drive our pipeline design decisions.
The goal isn’t to create the perfect hybrid connectivity solution from day one, but to establish patterns that work well now and can evolve naturally as organizational and technical needs change.
