# Azure Integration
This guide provides step-by-step instructions for integrating Haltian IoT with Azure: an Azure Function automatically retrieves Parquet files from the Haltian AWS S3 bucket and uploads them to your Azure infrastructure.
## Overview
The integration uses an Azure Function App to:
- List and retrieve new Parquet files from the Haltian IoT S3 bucket
- Authenticate using Microsoft Entra ID (Azure AD)
- Upload files to your chosen destination
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#73F9C1', 'primaryTextColor': '#143633', 'primaryBorderColor': '#143633', 'lineColor': '#143633', 'secondaryColor': '#F6FAFA', 'tertiaryColor': '#ffffff', 'background': '#ffffff', 'mainBkg': '#73F9C1', 'secondBkg': '#F6FAFA' }}}%%
flowchart TB
    subgraph "Haltian IoT"
        S3["AWS S3 Bucket<br/><i>Parquet Files</i>"]
    end
    subgraph "Customer Azure"
        FUNC["Azure Function App<br/><i>Scheduled Transfer</i>"]
        subgraph "Destination Options"
            ONELAKE["Microsoft Fabric<br/><i>OneLake</i>"]
            STORAGE["Azure Storage Account<br/><i>Blob Storage</i>"]
        end
    end
    S3 -->|"List & Get Objects"| FUNC
    FUNC -->|"DFS API"| ONELAKE
    FUNC -->|"Blob Service"| STORAGE
```

## Destination Options
Choose the destination that fits your analytics infrastructure:
| Option | Best For | Features |
|---|---|---|
| Microsoft Fabric OneLake | Enterprise analytics, Power BI | Unified lakehouse, built-in analytics |
| Azure Storage Account | Custom pipelines, flexibility | Lower cost, broad compatibility |
## Prerequisites
Before starting, ensure you have:
### Required
- Terraform (≥ 1.5.0)
- Azure CLI (`az`) installed and authenticated
- Azure subscription with permissions to create resource groups
- IAM credentials from Haltian for S3 bucket access
### For OneLake Destination
- Microsoft Fabric capacity (existing or new)
- Global Administrator access for AAD permissions
### For Storage Account Destination
- Storage Account Contributor permissions
### Optional (for development)
- Python 3.10+
- Azure Functions Core Tools
## S3 Access Credentials
Haltian provides two options for S3 bucket access:
| Method | Setup | Best For |
|---|---|---|
| Access Key + Secret | Haltian provides credentials | Quick setup |
| Bring-your-own IAM Role | You provide ARN to Haltian | Enterprise security policies |
If using your own IAM role, you’ll need to modify the Python code in the Azure Function to use AWS IAM role authentication, and provide the role ARN to Haltian for bucket policy configuration.
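For orientation, here is a minimal sketch of what the two credential paths could look like in the function's Python code. The role ARN, file names, and placeholder values are illustrative assumptions, not the shipped implementation, and assuming a role from Azure additionally requires some base AWS identity (for example, web identity federation), which is out of scope here:

```python
import boto3

# Option 1: static access key and secret provided by Haltian.
s3 = boto3.client(
    "s3",
    aws_access_key_id="<s3_access_key>",
    aws_secret_access_key="<s3_secret_key>",
    region_name="<s3_region>",
)

# Option 2: bring-your-own IAM role. Exchange a base identity for temporary
# credentials via STS, then build the client from them. The role ARN below is
# a placeholder; the same ARN is shared with Haltian for the bucket policy.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/haltian-s3-reader",
    RoleSessionName="haltian-transfer",
)["Credentials"]
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
    region_name="<s3_region>",
)
```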
## Infrastructure Setup
All costs for Azure resources (resource groups, storage accounts, Function Apps, Fabric capacity) are your responsibility. Review potential charges before proceeding.
### Terraform Modules
The integration uses these Terraform modules:
| Module | Purpose |
|---|---|
| `azure-function/terraform` | Azure Function App with dual upload modes |
| `infra/onelake` | Fabric Capacity, Workspace, Lakehouse, AAD app |
| `infra/storageaccount` | Storage Account with data containers |
## Deployment Steps
### Option A: OneLake Destination
1. **Deploy infrastructure** - Navigate to the `infra/onelake` module and configure:
   - Azure subscription and tenant
   - Resource group naming
   - Fabric capacity (create new or use existing)

   ```bash
   cd infra/onelake
   terraform init
   terraform plan
   terraform apply
   ```

2. **Deploy Function App** - Configure the `azure-function/terraform` module with:
   - S3 credentials (from Haltian)
   - Fabric/OneLake credentials
   - Upload mode: `onelake`

   ```bash
   cd azure-function/terraform
   terraform init
   terraform plan
   terraform apply
   ```
### Option B: Storage Account Destination
1. **Deploy infrastructure** - Navigate to the `infra/storageaccount` module:

   ```bash
   cd infra/storageaccount
   terraform init
   terraform plan
   terraform apply
   ```

2. **Deploy Function App** - Configure the `azure-function/terraform` module with upload mode `storage_account`:

   ```bash
   cd azure-function/terraform
   terraform init
   terraform plan
   terraform apply
   ```
## Configuration Reference
### Azure Function Variables
| Variable | Description | Required |
|---|---|---|
| `s3_bucket_name` | Haltian S3 bucket name | ✓ |
| `s3_access_key` | AWS access key (from Haltian) | ✓ |
| `s3_secret_key` | AWS secret key (from Haltian) | ✓ |
| `s3_region` | S3 bucket region | ✓ |
| `upload_mode` | `onelake` or `storage_account` | ✓ |
| `organization_id` | Your Haltian organization UUID | ✓ |
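Inside the deployed Function App, these Terraform variables typically surface as application settings, i.e. environment variables. The setting names below are assumptions mirroring the variable names; the actual module may use different casing:

```python
import os

# Application settings become environment variables inside the Function App.
# The names here are assumptions that mirror the Terraform variables above.
upload_mode = os.environ["UPLOAD_MODE"]  # "onelake" or "storage_account"
bucket = os.environ["S3_BUCKET_NAME"]
org_id = os.environ["ORGANIZATION_ID"]

if upload_mode not in ("onelake", "storage_account"):
    raise ValueError(f"Unsupported upload_mode: {upload_mode}")
```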
### OneLake-Specific Variables
| Variable | Description |
|---|---|
| `fabric_workspace_id` | Fabric workspace GUID |
| `fabric_lakehouse_id` | Lakehouse GUID |
| `aad_client_id` | Azure AD app client ID |
| `aad_client_secret` | Azure AD app secret |
| `aad_tenant_id` | Azure AD tenant ID |
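To illustrate how these values fit together, here is a hedged sketch of the MSAL client-credentials flow followed by a OneLake upload over the ADLS Gen2 (DFS) REST API. The endpoint and path layout follow documented OneLake conventions; the file name and placeholder IDs are assumptions:

```python
import msal
import requests

# Acquire a token for Azure Storage via the AAD app's client-credentials
# flow; the placeholders map to the variables in the table above.
app = msal.ConfidentialClientApplication(
    client_id="<aad_client_id>",
    client_credential="<aad_client_secret>",
    authority="https://login.microsoftonline.com/<aad_tenant_id>",
)
token = app.acquire_token_for_client(scopes=["https://storage.azure.com/.default"])
headers = {"Authorization": f"Bearer {token['access_token']}"}

# OneLake speaks the ADLS Gen2 (DFS) REST API; with GUIDs the path is
# <workspace-guid>/<lakehouse-guid>/Files/...
base = "https://onelake.dfs.fabric.microsoft.com"
url = f"{base}/<fabric_workspace_id>/<fabric_lakehouse_id>/Files/example.parquet"

# Create the file, append the bytes, then flush to commit.
data = open("example.parquet", "rb").read()
requests.put(f"{url}?resource=file", headers=headers).raise_for_status()
requests.patch(f"{url}?action=append&position=0", headers=headers, data=data).raise_for_status()
requests.patch(f"{url}?action=flush&position={len(data)}", headers=headers).raise_for_status()
```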
### Storage Account Variables
| Variable | Description |
|---|---|
| `storage_account_name` | Target storage account |
| `storage_container_name` | Container for Parquet files |
| `storage_connection_string` | Storage account connection string |
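As a minimal sketch of the Blob path, assuming the `azure-storage-blob` SDK and placeholder names:

```python
from azure.storage.blob import BlobServiceClient

# Connect with the connection string from the table above and upload one
# Parquet file; container and blob names are placeholders.
service = BlobServiceClient.from_connection_string("<storage_connection_string>")
container = service.get_container_client("<storage_container_name>")

with open("example.parquet", "rb") as f:
    container.upload_blob(name="telemetry/example.parquet", data=f, overwrite=True)
```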
## Data Flow
Once deployed, the Azure Function runs on a schedule (configurable, default: hourly) and performs the following steps (sketched in code after this list):

1. **List Objects** - Queries S3 for new Parquet files since the last run
2. **Download** - Retrieves each new file from S3
3. **Authenticate** - Uses MSAL client credentials for Azure auth
4. **Upload** - Transfers to OneLake (DFS API) or Storage Account (Blob API)
5. **Track Progress** - Records the last processed timestamp
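The incremental part of this flow (listing since the last run and recording progress) can be sketched with `boto3` roughly as follows; the prefix layout, state handling, and placeholder names are assumptions for illustration:

```python
from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3")  # credentials configured as shown earlier
BUCKET = "<s3_bucket_name>"
PREFIX = "<organization_id>/"  # assumed layout: files keyed by organization

def new_parquet_files(last_run: datetime):
    """Yield S3 objects modified after the previous run."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet") and obj["LastModified"] > last_run:
                yield obj

last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)  # loaded from state in practice
newest = last_run
for obj in new_parquet_files(last_run):
    body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
    # ... upload `body` to OneLake or Blob Storage as sketched above ...
    newest = max(newest, obj["LastModified"])
# Persist `newest` so the next run only picks up later files.
```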
## Verification
After deployment, verify the integration:
```bash
# Check Function App status
az functionapp show --name <function-app-name> --resource-group <rg-name>

# View recent executions
az monitor activity-log list --resource-group <rg-name> --offset 1h

# Check uploaded files (Storage Account)
az storage blob list --container-name <container> --account-name <account>
```
For OneLake, verify files in the Fabric workspace Lakehouse.
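Because OneLake exposes an ADLS Gen2-compatible endpoint, this check can also be scripted. The sketch below assumes the `azure-storage-file-datalake` and `azure-identity` packages and placeholder workspace/lakehouse names:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# List uploaded files through OneLake's ADLS Gen2-compatible endpoint.
# Workspace and lakehouse names are placeholders.
service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("<workspace-name>")
for path in fs.get_paths(path="<lakehouse-name>.Lakehouse/Files"):
    print(path.name)
```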
## Cleanup
To remove all deployed resources:
```bash
cd azure-function/terraform
terraform destroy

cd ../../infra/onelake   # or infra/storageaccount
terraform destroy
```
## Next Steps
Once data is flowing to Azure:
- OneLake: Create Power BI reports, run Spark notebooks, use SQL analytics
- Storage Account: Configure Azure Data Factory, Synapse, or Databricks pipelines
## Troubleshooting
| Issue | Solution |
|---|---|
| Function not triggering | Check timer trigger configuration and Function App logs |
| S3 access denied | Verify IAM credentials with Haltian |
| OneLake upload fails | Check AAD app permissions and Fabric workspace access |
| Missing files | Verify organization ID matches your Haltian organization |