The Problem: Azure’s Pricing Page Is a Trap
You type “cheapest Azure VM” into Google, click the first result, and land on the Azure pricing calculator. Forty-seven dropdowns. Filters for region, OS, tier, reservation term, currency, and a category called “workload type” that doesn’t map to anything you actually run. Twenty minutes later you’ve closed the tab and gone back to whatever you were doing. I’ve done this more times than I want to admit.
The pricing page is designed to let Azure sell you the right VM for enterprise workloads. It is not designed to help a DevOps engineer quickly figure out what to spin up for a self-hosted GitHub Actions runner or a throwaway dev box. The B-series and D-series VMs both look “cheap” until you realize one of them only guarantees a fraction of a core once its burst credits run out, and the other costs $0.02/hour more than you expected because you accidentally left “Windows” selected in the OS dropdown.
Here’s the framing that actually matters: the cheapest VM is not the one with the lowest hourly rate. It’s the one that won’t choke on your pipeline halfway through a Docker build or an npm install on a monorepo. A B1s at roughly $0.01/hour sounds great until your pipeline starts hitting CPU throttling every 15 minutes because its burst credits ran out. You didn’t save money; you just added latency and flaky builds to your week.
This article is specifically about the workloads DevOps engineers actually run on small VMs:
- CI runners — self-hosted agents for Azure DevOps or GitHub Actions
- Dev/test machines — boxes you SSH into, run experiments on, and nuke when you’re done
- Self-hosted agents — long-running processes that pull jobs from a queue
- Small infra boxes — Prometheus scrapers, Nginx proxies, Tailscale exit nodes, that kind of thing
This is not a guide for choosing a VM to run your production API or your PostgreSQL database under real traffic. The failure modes are totally different and the sizing logic doesn’t transfer. Also, a hard disclaimer: Azure changes prices, introduces new SKUs, and quietly retires old ones. Every number in this article was pulled from the Azure pricing page at the time of writing, so re-check it there before you commit to anything, especially if you’re planning reserved instances or spot pricing.
The Short Answer: B-Series Burstable VMs Are Your Starting Point
The B-series exists because Microsoft noticed that most VMs sit idle 80% of the time. For DevOps tooling — GitLab runners, ArgoCD, small Ansible control nodes, monitoring agents — that’s completely true. Your Standard_B1s at $0.0104/hour (~$7.59/month) or Standard_B2s at $0.0416/hour (~$30/month) will handle the vast majority of CI/CD workflows without breaking a sweat, because the workload pattern matches exactly how burstable credits work.
The credit mechanism is genuinely clever once you internalize it. Every minute your VM runs below its baseline CPU, it banks credits. Standard_B1s earns 6 credits/hour at idle and has a max bank of 144 credits — that’s 24 hours of full accumulation. Run a 10-minute Terraform plan that pegs CPU at 100%? You burn maybe 8-10 credits, then the VM spends the next hour quietly replenishing. That maps perfectly to the “pipeline fires, completes, idles” pattern of most DevOps agents.
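If the credit arithmetic feels abstract, it fits in a few lines of shell. This is a rough sketch rather than Azure’s exact accounting: it assumes one credit equals one vCPU at 100% for one minute, and that credits keep accruing at the hourly rate while the burst is running. The credits_burned function and its argument order are made up for illustration.

```shell
# Sketch: net credits a burst costs on a B-series VM.
# Assumption: 1 credit = 1 vCPU at 100% for 1 minute.
# Args: vCPUs busy, utilisation % per vCPU, burst minutes, credits earned per hour.
credits_burned() {
  awk -v v="$1" -v u="$2" -v m="$3" -v earn="$4" 'BEGIN {
    spent  = v * (u / 100) * m   # credits consumed while bursting
    earned = earn / 60 * m       # credits still accruing during the burst
    printf "%.1f\n", spent - earned
  }'
}

# Standard_B1s figures from the text: 1 vCPU, 6 credits/hour, 10-minute burst at 100%
credits_burned 1 100 10 6   # prints 9.0
```

That lands inside the 8-10 credit range quoted above. Swap in your own burst length and accrual rate to see how much of the bank a single job eats.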
# Check your current CPU credit balance on a running B-series VM
# This is the number people forget to monitor until things slow down
# The instance view only reports power state and extensions, not credits,
# so pull the platform metric from Azure Monitor instead
az monitor metrics list \
  --resource "/subscriptions/{sub-id}/resourceGroups/my-rg/providers/Microsoft.Compute/virtualMachines/my-b1s-runner" \
  --metric "CPU Credits Remaining" \
  --interval PT1M \
  --output table
The thing that will catch you off guard: the baseline for Standard_B1s is 10% of one vCPU. Not 10% utilization in a soft sense: if you exhaust your credit bank and your pipeline is still running, you’re capped at a tenth of a core. I’ve seen a terraform apply against a moderately complex state file go from 45 seconds to over 8 minutes under credit starvation. The fix isn’t complicated (monitor CPU Credits Remaining in Azure Monitor and set an alert at 20 credits), but nobody does this until they’ve been burned once.
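Setting that alert is one CLI call. A minimal sketch, assuming the VM already exists: create_credit_alert is a hypothetical wrapper name, the alert name it builds is my own convention, and the 20-credit default simply mirrors the threshold suggested above.

```shell
# Sketch: alert when the average credit balance drops below a threshold.
# create_credit_alert is a hypothetical helper, not part of the Azure CLI.
# Args: resource group, VM name, threshold (defaults to 20 credits).
create_credit_alert() {
  local rg="$1" vm="$2" threshold="${3:-20}"
  local vm_id
  # Resolve the full resource ID so the alert can be scoped to this VM
  vm_id="$(az vm show --resource-group "$rg" --name "$vm" --query id --output tsv)" || return 1
  az monitor metrics alert create \
    --name "${vm}-cpu-credits-low" \
    --resource-group "$rg" \
    --scopes "$vm_id" \
    --condition "avg CPU Credits Remaining < ${threshold}" \
    --window-size 5m \
    --evaluation-frequency 1m \
    --description "CPU credit bank below ${threshold} on ${vm}"
}

# Usage (needs az login first):
#   create_credit_alert my-rg my-b1s-runner 20
```

As written the alert fires but notifies nobody. Attach an action group with --action if you want an email or webhook when it trips.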
Where B-series holds up fine vs. where it collapses:
- Works great: GitLab/GitHub runners for jobs under 5 minutes, ArgoCD controller on small clusters, Ansible control nodes running scheduled playbooks, Prometheus with light scrape loads (<500 targets), HashiCorp Vault in dev/staging
- Marginal: Jenkins controller if you have more than 3-4 concurrent jobs triggering, Terraform Cloud agent running multiple workspaces back-to-back without cool-down, Docker builds of images with heavy compilation steps
- Will ruin your day: Any sustained workload — SonarQube analysis, large Packer builds, running a Nexus or Artifactory repository that serves frequent pulls, K3s node that actually handles real traffic
Standard_B2ms (2 vCPU, 8GB RAM, ~$60/month pay-as-you-go) is the sweet spot I keep landing on for a general-purpose DevOps control plane. The memory matters more than the CPU ceiling for most tooling: GitLab Runner with Docker executor, a small k3s single-node, or a Vault+Consul pair all appreciate 8GB. The baseline is 60% of one vCPU (30% of each of the two cores), so you’d have to be running something genuinely CPU-intensive continuously to hit credit exhaustion. For most DevOps workloads, you never will.
The Actual Lineup: What Each Size Gets You
The B-series ladder is close to linear: each step up roughly doubles the RAM and roughly doubles the price, so going from Standard_B1s to Standard_B2ms gets you 8x the RAM for about 8x the cost. What matters when you’re choosing is what can run on your cheapest viable machine versus what actually needs more headroom.
Standard_B1s (1 vCPU, 1GB RAM) is legitimately useful for a narrow set of things: a cron-triggered Azure Function alternative, a webhook receiver that forwards to a queue, a Prometheus exporter sidecar, or a tiny Ansible runner that SSHs somewhere else and does the heavy lifting there. I’ve run Gitea on one of these and it works fine for a solo dev. What doesn’t work: anything that loads a Docker daemon, any build that pulls Node modules, anything with a JVM. You’ll OOM before the build finishes. Baseline CPU is 10%, which means you burst off a small credit pool — run it pegged for more than a few minutes and you’ll get throttled.
Standard_B1ms (1 vCPU, 2GB RAM) gives you enough room to run a small Go or Python service with some actual heap, or host a lightweight CI webhook listener that does real parsing. Still single-core, so parallelism is zero. The extra gig of RAM means you stop fighting the OOM killer constantly, but you’re still one npm install away from trouble. Monthly cost in East US is around $15 pay-as-you-go, and less with a 1-year reserved instance.
Standard_B2s (2 vCPU, 4GB RAM) is where self-hosted Azure DevOps agents become actually viable. I run lightweight pipeline agents here, the kind that clone a repo, run unit tests, and publish an artifact. Two cores means you’re not fully serialized. The 4GB ceiling still bites you if your pipeline does docker build on anything with a multi-stage Dockerfile that pulls a heavy base image. Baseline CPU is 40% (shared across 2 vCPUs), so you have a reasonable burst budget for pipeline spikes. Around $30/month pay-as-you-go in East US.
Standard_B2ms (2 vCPU, 8GB RAM) is the honest minimum if your pipelines do any combination of Helm templating, Terraform plan/apply, or container image builds. The RAM doubling over B2s is what unlocks this: Terraform with a complex state file and provider cache will eat 3-4GB easily. Helm with a large values file and several dependencies will do the same. I switched our agent pool from B2s to B2ms after watching pipeline agents get OOM-killed during terraform init on a module-heavy repo. Around $60/month pay-as-you-go in East US, and noticeably less with a 1-year reservation.
Standard_B4ms (4 vCPU, 16GB RAM) is a different category. This isn’t an agent; this is where you run Jenkins controllers, small GitLab Runner managers, or a single-node k3s cluster that actually does something. Four cores means you can handle concurrent builds without completely serializing. 16GB means a Jenkins controller with a dozen plugins and a handful of concurrent builds won’t keel over. Around $120/month pay-as-you-go in East US. If you’re running this just as an Azure DevOps agent, you’re over-provisioned, but if it’s running a controller plus agents, the math works.
| Size          | vCPU | RAM   | Baseline CPU% (of one vCPU) | ~Monthly (East US PAYG) | Best DevOps Use Case                         |
|---------------|------|-------|-----------------------------|-------------------------|----------------------------------------------|
| Standard_B1s  | 1    | 1 GB  | 10%                         | ~$8                     | Cron jobs, webhook forwarders, tiny agents   |
| Standard_B1ms | 1    | 2 GB  | 20%                         | ~$15                    | Lightweight runners, small Go/Python daemons |
| Standard_B2s  | 2    | 4 GB  | 40%                         | ~$30                    | ADO agent for test-only pipelines            |
| Standard_B2ms | 2    | 8 GB  | 60%                         | ~$60                    | ADO/GitHub Actions agent with real builds    |
| Standard_B4ms | 4    | 16 GB | 90%                         | ~$120                   | Jenkins controller, k3s node, runner manager |
The baseline CPU percentage column is what people ignore and then wonder why their B1s VM feels slow all day. That 10% means the VM is only guaranteed 10% of a physical core continuously; anything above that is paid for with burst credits. If you’re running a persistent service that needs consistent CPU (not just occasional spikes), the burstable B-series will disappoint you. For bursty pipeline work where the VM banks credits while it idles between jobs and spends them during runs, it’s a perfect fit.
Spinning One Up With Azure CLI (Not the Portal)
Skip the portal. Every click you make in the Azure web UI is a step you can’t reproduce, review in a PR, or automate later. I provision every dev/test VM from the CLI now, and the whole thing takes under three minutes once your account is set up.
First, get the CLI installed and pointed at the right subscription. On Ubuntu/Debian it’s a one-liner via Microsoft’s repo, or brew install azure-cli on Mac. Then:
# Authenticate — opens a browser tab, handles MFA automatically
az login
# If you have multiple subscriptions, pin the right one explicitly
az account set --subscription "your-subscription-id-or-name"
# Verify you're pointing at the right place before spending money
az account show --query "{name:name, id:id, state:state}"
That last command has saved me from provisioning into the wrong subscription more than once. Takes two seconds. Do it every time.
Now create a resource group and the VM itself. I put everything in the same region to avoid cross-region egress charges, and I name resource groups after their purpose, not their contents — you’ll thank yourself when you have six of them.
# Logical container for everything in this lab
az group create --name devops-lab-rg --location eastus
# Provision the VM — this takes ~60-90 seconds
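# Pass the public key path explicitly (see the gotcha below).
# Keep comments off the continuation lines: text after a trailing backslash ends the command early.
az vm create \
  --resource-group devops-lab-rg \
  --name devops-agent-01 \
  --image Ubuntu2204 \
  --size Standard_B2ms \
  --admin-username azureuser \
  --ssh-key-values ~/.ssh/id_rsa_azure.pub \
  --public-ip-sku Standard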
# Output will include publicIpAddress — copy that immediately
The gotcha that will ruin your morning: --generate-ssh-keys works on the default key pair in ~/.ssh. If ~/.ssh/id_rsa already exists the CLI reuses it, and if it doesn’t the CLI creates it, so the VM ends up tied to whichever default key happens to live on the machine you ran the command from, which is probably the same key you use for other servers. Always generate a dedicated key first (ssh-keygen -t ed25519 -f ~/.ssh/id_rsa_azure) and pass the public key path explicitly with --ssh-key-values.
Once the VM is up, open port 22 — or don’t, if your org mandates Bastion:
# If you're allowed to expose SSH directly (fine for personal dev work):
az vm open-port --port 22 --resource-group devops-lab-rg --name devops-agent-01
# If your org uses Azure Bastion, skip the above entirely.
# Bastion connects over HTTPS/443 through the portal or CLI:
az network bastion ssh \
--name your-bastion-name \
--resource-group devops-lab-rg \
--target-resource-id $(az vm show -g devops-lab-rg -n devops-agent-01 --query id -o tsv) \
--auth-type ssh-key \
--username azureuser \
--ssh-key ~/.ssh/id_rsa_azure
After provisioning, run an instance view check to confirm the VM is actually running and that it came up with the size you asked for:
az vm get-instance-view \
--resource-group devops-lab-rg \
--name devops-agent-01 \
--query "{status:instanceView.statuses[1].displayStatus, vmSize:hardwareProfile.vmSize}"
# Expected output:
# {
# "status": "VM running",
# "vmSize": "Standard_B2ms"
# }
One thing that surprises people: this command doesn’t show your credit balance or current cost. Cost lives in Azure Cost Management and the credit balance in Azure Monitor, not in the VM instance view. What it does tell you is whether the VM is in a running state (and therefore billing you) versus deallocated. A deallocated B2ms stops the compute charge, although you keep paying for its disks. A stopped VM (powered off but not deallocated) still bills you for compute. Run az vm deallocate --resource-group devops-lab-rg --name devops-agent-01 at the end of the day if you’re on a tight budget.
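The stopped-versus-deallocated distinction is easy to check from a script before you walk away from a lab VM. A small sketch, assuming the power state is the status entry whose code starts with PowerState/; vm_power_state and compute_is_billed are hypothetical helper names, not CLI commands.

```shell
# Sketch: report the power state and whether compute is still being billed.
# Both function names are made up for this example.
vm_power_state() {
  # Filter by status code instead of relying on the position in the array
  az vm get-instance-view --resource-group "$1" --name "$2" \
    --query "instanceView.statuses[?starts_with(code, 'PowerState/')].displayStatus | [0]" \
    --output tsv
}

compute_is_billed() {
  # Only a deallocated VM stops accruing compute charges;
  # "VM stopped" (powered off from inside the OS) still bills
  [ "$(vm_power_state "$1" "$2")" != "VM deallocated" ]
}

# Usage (needs az login first):
#   compute_is_billed devops-lab-rg devops-agent-01 && echo "still paying for compute"
```

Filtering on the status code is a little more robust than indexing statuses[1], since the order of entries in the array isn’t something I’d want to depend on.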
Setting It Up as a Self-Hosted Azure DevOps Agent
Skip the Azure portal wizard. SSH into your VM and do this directly — it’s faster and you end up with a setup you actually understand rather than one you clicked through.
# Create a working directory for the agent
mkdir myagent && cd myagent
# Download the agent. Check the latest version at aka.ms/azdo-agent-release
# As of writing, 3.236.1 is current for Linux x64
curl -LsS https://vstsagentpackage.azureedge.net/agent/3.236.1/vsts-agent-linux-x64-3.236.1.tar.gz \
  | tar -xz
# Configure unattended: replace the org, PAT, and agent name
# (the token is an Azure DevOps PAT, not a GitHub ghp_ token)
./config.sh \
  --unattended \
  --url https://dev.azure.com/your-org \
  --auth pat \
  --token "<your-azure-devops-pat>" \
  --pool Default \
  --agent devops-agent-01
The --unattended flag is the one most guides skip. Without it, config.sh drops you into an interactive prompt mid-script, which kills any automation. Generate the PAT in Azure DevOps under User Settings → Personal Access Tokens with Agent Pools (read & manage) scope — nothing broader than that.
Once configured, register it as a systemd service so it survives reboots:
sudo ./svc.sh install
sudo ./svc.sh start
# Confirm it's actually running
sudo ./svc.sh status
The advice to just use the Default pool sounds fine on paper. I burned time on this: if you have any other agents registered — even old ones from a previous project — your pipeline jobs can land on the wrong machine. Create a named pool in Azure DevOps under Project Settings → Agent Pools, then target it explicitly in your YAML:
# In your azure-pipelines.yml
pool:
  name: my-cheap-pool # not 'Default'
steps:
  - script: echo "Running on the right box"
If your pipelines build Docker images, install Docker on the agent and add the service user to the docker group:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker azureuser
Here’s the gotcha nobody documents clearly: the usermod command updates the group membership, but the agent service process is already running with its old group list and doesn’t pick up the new group. Your jobs will fail with permission denied while trying to connect to the Docker daemon socket until you restart the agent service. I bounce it through svc.sh, the same script that installed it:
sudo ./svc.sh stop
sudo ./svc.sh start
# Verify the agent user can actually reach the socket now
sudo -u azureuser docker run --rm hello-world
That last verify step matters. Run it before you push any pipeline changes; otherwise you’ll be debugging a failed CI job instead of doing a 10-second local check. At this point your VM is fully operational as a self-hosted agent at a flat monthly price, with no per-minute billing eating into your budget every time a pipeline fires.
When B-Series Starts Hurting You (and What to Switch To)
The thing that catches most people off guard with B-series VMs is that the degradation isn’t a cliff — it’s a slow bleed. Your pipeline goes from 8 minutes to 12, then 18, then 35, and you spend two days convinced it’s a flaky test or a network issue before checking CPU credits. The moment I started running az monitor metrics list as part of my triage process, everything clicked.
# Check CPU credits remaining on your B-series VM
az monitor metrics list \
--resource "/subscriptions/YOUR_SUB_ID/resourceGroups/devops-lab-rg/providers/Microsoft.Compute/virtualMachines/devops-agent-01" \
--metric "CPU Credits Remaining" \
--interval PT1M \
--output table
# Watch it drain in real-time during a build — if you see it hitting single digits, you're starved
If you’re running multi-stage Docker builds with layer caching on a Standard_B2ms, the credit drain is brutal. Each build stage hits the CPU hard, burns credits fast, and if the bank was already low, by stage 3 of 5 you’re running at baseline performance (60% of one vCPU, shared across both cores). The fix here isn’t clever tuning; it’s moving to Standard_D2s_v5, which gives you consistent 2 vCPUs at ~$0.096/hr in East US. No credit system, no surprises. I switched one of my Docker build agents to D2s_v5 and the same pipeline went from 34 minutes back to 9.
JVM tooling is a different problem entirely. SonarQube, Nexus Repository Manager, and anything running inside a JVM needs memory headroom for the GC to breathe. SonarQube’s official docs recommend 4GB RAM minimum for the Elasticsearch node alone — you haven’t even started the web server yet. Running any of that on B-series is the wrong call from day one. Standard_D4s_v3 gives you 4 vCPUs, 16GB RAM, and sits around $0.192/hr. That sounds like a lot until you compare it to the ops time you’ll burn debugging mystery slowdowns on an under-provisioned box.
The spot instance angle is worth taking seriously for ephemeral CI agents — Azure Spot VMs on D-series routinely price out at 60–80% below pay-as-you-go, which means a Standard_D2s_v5 spot agent can drop to around $0.02–0.04/hr. That math works when your pipeline spins agents up on demand and shuts them down after the job. It absolutely does not work for a persistent Jenkins controller, Nexus, or anything with state you can’t afford to lose on 30 seconds notice. I’ve seen people run a Jenkins controller on spot and lose their build history mid-release. Not fun.
# Auto-shutdown a dev VM at 7pm (--time is interpreted as UTC).
# This one change saves real money on forgotten lab VMs.
# you@example.com is a placeholder address
az vm auto-shutdown \
  -g devops-lab-rg \
  -n devops-agent-01 \
  --time 1900 \
  --email you@example.com
# Confirm it's set: the schedule is a separate resource named shutdown-computevm-<vm-name>
az resource show \
  --resource-group devops-lab-rg \
  --resource-type Microsoft.DevTestLab/schedules \
  --name shutdown-computevm-devops-agent-01 \
  --query "properties.{status:status, time:dailyRecurrence.time, timeZone:timeZoneId}"
Auto-shutdown is the unglamorous cost control that actually works for dev/test agents that don’t need to run overnight. Set it once, forget it. The email notification fires before shutdown so if you’re mid-build at 6:55pm you can defer it. Combine that with spot pricing on D-series for ephemeral agents and you’ve got a setup where you’re paying B-series money for D-series hardware — which is ultimately the whole game when you’re optimizing Azure costs for DevOps workloads.
Honest Cost Comparison: B2ms vs. Alternatives
The number that surprises most teams: a Standard_B2ms running 24/7 pay-as-you-go in East US costs roughly $60/month as of mid-2025. That’s 2 vCPUs, 8 GB RAM, and burstable CPU credit behavior. Compare that to GitHub Actions’ per-minute billing and the math shifts fast once you’re past the free tier.
| VM / Runner | vCPU | RAM | ~Monthly (PAYG East US) | Best For | Biggest Gotcha |
|---|---|---|---|---|---|
| Standard_B2ms | 2 | 8 GB | ~$60 | Self-hosted runners, light CI agents, dev tooling | CPU credits drain under sustained load, which you’ll notice on Docker builds |
| Standard_D2s_v5 | 2 | 8 GB | ~$70–75 | Consistent CPU workloads, production agents | No burstable benefit — you’re paying for baseline you may not always need |
| Standard_F2s_v2 | 2 | 4 GB | ~$62–67 | CPU-bound tasks where RAM isn’t the constraint | 4 GB is tight once you’re running Docker + a test suite simultaneously |
| GitHub-hosted runner | 2 | 7 GB | $0 (first 2,000 min) → $0.008/min after | Infrequent CI, open source projects, small teams | Zero persistent state — every run reinstalls your entire toolchain |
The GitHub-hosted runner math is where it gets concrete. You get 2,000 free minutes/month on the free tier — that sounds like a lot until you realize a mid-sized Node.js or Go project with Docker builds can burn 6–10 minutes per pipeline run. At 50 pipelines/day across a small team, you’re at roughly 10,000–15,000 minutes/month. Past the free tier, that’s $0.008/min on Linux runners, so the overage alone hits $64–$104/month before you’ve paid for a single VM. A B2ms running 24/7 beats that inside the first billing cycle, and you keep persistent Docker layer caches, pre-warmed toolchains, and a runner that doesn’t cold-start.
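The overage arithmetic is worth keeping as a throwaway function so you can plug in your own numbers. This is only a sketch: hosted_overage is a made-up name, and the 2,000 free minutes and $0.008/min rate are the figures quoted above, which you should re-check against your own plan.

```shell
# Sketch: monthly overage (in dollars) for hosted runner minutes.
# Args: runs per day, minutes per run, billable days per month,
#       free minutes per month, price per minute.
hosted_overage() {
  awk -v r="$1" -v m="$2" -v d="$3" -v free="$4" -v p="$5" 'BEGIN {
    used = r * m * d            # total minutes consumed in the month
    over = used - free          # minutes billed beyond the free tier
    if (over < 0) over = 0      # staying inside the free tier costs nothing
    printf "%.2f\n", over * p
  }'
}

hosted_overage 50 8 25 2000 0.008    # 10,000 minutes, prints 64.00
hosted_overage 50 10 30 2000 0.008   # 15,000 minutes, prints 104.00
```

Compare the result against the flat monthly price of whichever VM you’re considering. If the overage is consistently higher, the self-hosted box wins before you even count cache reuse.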
Reserved Instances change the B2ms number significantly. A 1-year reservation takes roughly a third or more off the pay-as-you-go price, and a 3-year reservation goes lower still; check the pricing page for the current figures in your region. I only recommend going reserved if you’ve already been running the VM for at least 4–6 weeks and confirmed it’s not getting killed by credit exhaustion during your builds. Committing to 12 months on a VM that turns out to need upgrading to D2s_v5 for consistent CPU is an annoying mistake to undo. Azure does let you trade reserved instances, but it’s friction you don’t want.
The most overlooked lever here is Dev/Test subscription pricing. If your team has Visual Studio subscriptions, which you likely already have if you’re in an enterprise Azure agreement, you qualify for Dev/Test subscription rates. The VM discount mostly comes from dropping the Windows license charge, so Windows VMs bill at the Linux rate; a Linux B2ms costs about the same either way, and the savings show up on Windows build agents and on other discounted services. I’ve watched multiple teams overpay for months because nobody connected the VS subscription to the Azure billing account. Check your subscription offer ID in the Azure portal: if it’s not a Dev/Test offer (MS-AZR-0148P for Enterprise Dev/Test) and you have VS licenses, you’re leaving money on the table. The savings compound across every non-production Windows VM in the account.
# Confirm which subscription you're on (this output does not include the offer ID)
az account show --query "{name:name, id:id, state:state}" -o table
# The offer ID is shown on the subscription's Overview page in the portal;
# listing billing accounts tells you which account to look under
az billing account list --query "[].{name:name,displayName:displayName}" -o table
One thing that doesn’t show up in the pricing calculator: B2ms vs D2s_v5 for CI workloads isn’t just a cost comparison, it’s a reliability comparison. If your pipeline does sustained multi-core Docker builds for 8+ minutes, the B2ms will throttle once CPU credits run dry. The D2s_v5 won’t. For a mixed workload where half your pipelines are short linting/testing jobs and the other half are full image builds, you can split the difference: use B2ms as your default runner and spin up a D2s_v5 spot instance for the heavy image builds via Azure DevOps capability matching or GitHub Actions runner labels. That hybrid approach usually wins on both cost and build time without the commitment of paying D2s_v5 rates across the board.
Three Things That Surprised Me About Running DevOps Workloads on B-Series
The CPU burst credits get all the attention in Azure docs, but I spent the first week debugging what I thought was a compute bottleneck before realizing the real problem: disk I/O. My Standard_B2ms had ended up with a Standard HDD as the OS disk. Not Standard SSD. A spinning-rust-equivalent managed disk. The default depends on how the VM was provisioned, so check what yours actually got. Every apt-get update, every Docker layer pull, every pip install was hammering a disk with a low IOPS ceiling. Switch to Premium SSD and the same operations run 3-4x faster without touching anything else. This is the single most impactful thing you can do on a B-series VM, and it’s not in the getting-started guide:
# Deallocate first, or the SKU change will fail
az vm deallocate --resource-group devops-lab-rg --name devops-agent-01
# The storage type is a property of the managed disk, so update the disk itself
OS_DISK_ID=$(az vm show --resource-group devops-lab-rg --name devops-agent-01 \
  --query "storageProfile.osDisk.managedDisk.id" --output tsv)
az disk update --ids "$OS_DISK_ID" --sku Premium_LRS
az vm start --resource-group devops-lab-rg --name devops-agent-01
A 30GB OS disk bills at the smallest matching Premium SSD tier (P4, 32 GiB), which costs a few dollars a month more than the same size on Standard HDD. The delta is negligible next to the price of the VM itself. For any B-series VM that does package installs or Docker pulls, there’s little reason not to do this. I now bake this into my Bicep templates so I never hit a Standard HDD again.
The burst credit model genuinely suits CI workloads well, probably better than any other VM category. A typical pipeline that wakes up, runs builds for 8-12 minutes, then sits idle for 45-50 minutes is exactly the pattern B-series hardware was tuned for. The Standard_B2ms accrues 36 CPU credits per hour. A 10-minute burst at 100% on both cores burns roughly 20 credits. By the time your next build triggers, you’ve earned them back. I ran 15 consecutive hourly pipeline runs without ever hitting credit exhaustion, which wouldn’t have been possible without that recovery window between runs. Where you get burned is sustained workloads: a 40-minute integration test suite that pegs both cores burns around 80 credits, so if it starts on a depleted bank it runs dry partway through and throttles to baseline (60% of one vCPU) for the rest. Know your build duration before committing to B-series.
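To put a number on “know your build duration”, you can estimate how long a sustained job runs before the bank is empty. A rough sketch under the same simplification as before (one credit is one vCPU-minute at 100%, baseline expressed as a percentage of one vCPU); minutes_until_throttled is a hypothetical name, and the example uses the Standard_B1s figures quoted earlier (144-credit bank, 10% baseline).

```shell
# Sketch: minutes of sustained load before the credit bank runs dry.
# Args: banked credits, vCPUs busy, utilisation % per vCPU,
#       baseline % (of one vCPU, for the whole VM).
minutes_until_throttled() {
  awk -v bank="$1" -v v="$2" -v u="$3" -v base="$4" 'BEGIN {
    drain = v * u / 100 - base / 100   # net credits lost per minute
    if (drain <= 0) { print "never"; exit }   # at or below baseline, the bank only grows
    printf "%.0f\n", bank / drain
  }'
}

minutes_until_throttled 144 1 100 10   # full B1s bank: prints 160
minutes_until_throttled 20 1 100 10    # bank already down to 20: prints 22
```

The second call is the one that matters in practice: a job that looks safe on a full bank can throttle within minutes when it starts right after another heavy run.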
The network cap surprised me more than the disk thing, honestly. Standard_B2ms is documented at 1500 Mbps, but what you actually feel is the effective bandwidth during a cold pull of a large image. Pulling something like mcr.microsoft.com/dotnet/sdk:8.0 (about 740MB compressed) from a public registry over the internet on a cold agent takes noticeably longer than it should even with a “good” connection, because you’re fighting public internet routing plus the VM’s network ceiling. The fix that actually worked for me: push your base images to Azure Container Registry in the same region as the VM. Same-region ACR traffic doesn’t leave Microsoft’s backbone, and pulls run much closer to the VM’s actual network limit. The setup is a one-time cost:
# Create ACR in the same region as your VM (the lab VM above lives in eastus)
az acr create \
  --resource-group devops-lab-rg \
  --name devopsagentregistry \
  --sku Basic \
  --location eastus
# Mirror your base image
az acr import \
  --name devopsagentregistry \
  --source mcr.microsoft.com/dotnet/sdk:8.0 \
  --image dotnet/sdk:8.0
ACR Basic tier runs $0.167/day ($5/month roughly) and gives you 10GB of storage included. For a team running shared agents, that math works out immediately: multiple pipelines pull the same base image from a registry next door instead of re-downloading it from the public registry on every cold agent start. Combine this with Docker layer caching on a persistent data disk (separate from the OS disk) and your pipeline startup time drops substantially on B-series hardware that would otherwise feel underpowered.
The ‘When to Pick What’ Decision Tree
Most people overthink this. After running CI/CD infrastructure on Azure across a few different teams, I’ve found the decision collapses into about six scenarios. Get the scenario right and the VM size picks itself.
Lightweight self-hosted agent — Terraform runs, Ansible playbooks, simple build steps with no Docker: Standard_B2ms is the right answer. 2 vCPUs, 8 GB RAM, Standard SSD (not Premium), and you must set up auto-shutdown. This box doesn’t need to run 24/7. Schedule the shutdown at 19:00 in the VM’s blade or via CLI:
# Set auto-shutdown at 19:00 UTC; shift the time to match your local evening
# you@example.com is a placeholder address
az vm auto-shutdown \
  --resource-group rg-devops \
  --name agent-vm \
  --time 1900 \
  --email you@example.com
With auto-shutdown + Standard SSD, you’re looking at roughly $35–45/month depending on region. Premium SSD here is just money left on the table — Terraform state ops and Ansible SSH chatter don’t care about sub-millisecond disk latency.
Self-hosted agent doing Docker builds: This is where B2ms starts to sweat. If you’re running docker build infrequently — say, once or twice a day — stay on B2ms but switch to Premium SSD P10 (128 GB). The build cache hits disk constantly and Standard SSD latency will noticeably drag layer extraction. If builds are running more than 10 times a day, jump to Standard_D2s_v5. It’s more expensive (around $70/month), but you get consistent vCPU — no CPU credits to burn through mid-build causing your 4-minute Docker build to silently become a 12-minute one. That silent slowdown is the thing that catches teams off guard with B-series under sustained Docker load.
Persistent Jenkins controller or GitLab Runner coordinator: Don’t cheap out here. Standard_D4s_v3 minimum — 4 vCPUs, 16 GB RAM. The coordinator process itself is lightweight, but the moment you add a Postgres backend, plugins, and 5 concurrent job dispatches, a D2 will start swapping. If this box runs continuously for your team (it will), commit to a 1-year Reserved Instance. Azure’s RI pricing on D4s_v3 in East US cuts the effective hourly rate by roughly 36% vs pay-as-you-go. That’s a meaningful line item over 12 months.
Ephemeral agent that spins per pipeline run: Forget fixed VMs entirely. Azure Spot VMs on D-series inside a VM Scale Set with scale-to-zero is the right architecture. Spot pricing on Standard_D2s_v5 can drop to 20–30% of on-demand cost during off-peak hours. The eviction risk is real but manageable for CI — your pipeline just retries. The config that unlocks this properly is VMSS with evictionPolicy: Deallocate and a custom runner image baked via Packer. GitLab’s autoscaler and the Azure DevOps VMSS agent pool feature both support this natively.
# VMSS creation on Spot priority, starting at zero instances.
# Note: nothing here falls back to regular priority if Spot capacity is unavailable.
az vmss create \
  --resource-group rg-runners \
  --name ephemeral-agents \
  --image Ubuntu2204 \
  --vm-sku Standard_D2s_v5 \
  --priority Spot \
  --eviction-policy Deallocate \
  --single-placement-group false \
  --instance-count 0
Dev/test scratch box for a single engineer: Standard_B1ms (1 vCPU, 2 GB) or Standard_B2s (2 vCPU, 4 GB). Auto-shutdown at 19:00 local time, every day, no exceptions. You will forget to shut it down on a Friday. The CPU limits don’t matter for ad-hoc testing — you’re not running sustained load, you’re poking at configs. B1ms runs under $15/month with reasonable uptime. B2s gets you to about $30. Neither is worth agonizing over.
You just want pipelines to work and don’t want to manage VMs: Use Microsoft-hosted agents in Azure DevOps. The ubuntu-22.04 image is well-maintained, Docker is pre-installed, and billing is per parallel job rather than per minute: the free tier includes one job with 1,800 minutes a month, and each additional Microsoft-hosted parallel job is a flat monthly fee (about $40 at the time of writing). For a team running 2–3 hours of CI per day, that’s under $50/month. The trade-off is real though: you get no caching between runs (unless you wire up pipeline cache tasks), cold starts every time, and zero ability to pre-install tooling. The moment your pipeline spends 3 minutes installing the same tools on every run, you’ve crossed into territory where a self-hosted B2ms pays for itself in a month.
Quick Config: cloud-init to Bootstrap Your Agent on First Boot
The thing that will burn you first is provisioning a cheap VM and then manually SSH-ing into it every time to set up the agent. I made that mistake across three environments before I started passing --custom-data to az vm create. One command, and the machine bootstraps itself completely.
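# --assign-identity gives the VM the system-assigned identity that the
# cloud-init script below uses to read the PAT from Key Vault
az vm create \
  --resource-group devops-rg \
  --name ado-agent-01 \
  --image Ubuntu2204 \
  --size Standard_B2s \
  --admin-username azureuser \
  --ssh-key-values ~/.ssh/id_rsa_azure.pub \
  --assign-identity \
  --custom-data @cloud-init.yaml \
  --output table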
The @ prefix tells the CLI to read from a file rather than treat the string literally. Here’s a cloud-init config that actually works — installs Docker, git, and the Azure CLI, creates the agent directory, and drops in a setup script that will register the agent on first boot:
#cloud-config
packages:
  - git
  - curl
  - jq
  - unzip
package_update: true
package_upgrade: false # skip full upgrade to keep first boot under 3 minutes
runcmd:
  # Install Azure CLI
  - curl -sL https://aka.ms/InstallAzureCLIDeb | bash
  # Install Docker without interactive prompts
  - curl -fsSL https://get.docker.com | sh
  - usermod -aG docker azureuser
  # Create agent and work directories
  - mkdir -p /opt/azure-pipelines-agent /opt/agent-work
  # Download the agent binary (pin the version; don't use "latest" in automation)
  - curl -sL https://vstsagentpackage.azureedge.net/agent/3.236.1/vsts-agent-linux-x64-3.236.1.tar.gz -o /tmp/agent.tar.gz
  - tar -xzf /tmp/agent.tar.gz -C /opt/azure-pipelines-agent
  # config.sh runs as azureuser, so it must own both directories
  - chown -R azureuser:azureuser /opt/azure-pipelines-agent /opt/agent-work
  # Fetch PAT from Key Vault; requires a system-assigned identity with access to the secret
  - |
    az login --identity --output none
    PAT=$(az keyvault secret show \
      --vault-name my-devops-kv \
      --name ado-agent-pat \
      --query value -o tsv)
    cd /opt/azure-pipelines-agent
    sudo -u azureuser ./config.sh \
      --unattended \
      --url https://dev.azure.com/my-org \
      --auth pat \
      --token "$PAT" \
      --pool Default \
      --agent "$(hostname)" \
      --work /opt/agent-work \
      --acceptTeeEula
  # Install and start as a systemd service (svc.sh expects to run from the agent directory)
  - cd /opt/azure-pipelines-agent && ./svc.sh install azureuser
  - cd /opt/azure-pipelines-agent && ./svc.sh start
write_files:
  - path: /opt/azure-pipelines-agent/.env
    owner: azureuser:azureuser
    permissions: '0600'
    defer: true # write after users are created, otherwise the owner lookup can fail
    content: |
      AGENT_WORK_FOLDER=/opt/agent-work
      # PAT is pulled at runtime from Key Vault, not stored here
The docs-vs-reality gap: when cloud-init fails, Azure surfaces nothing in the portal. No error, no warning — the VM shows as “Running” and you’re left wondering why no agent appeared in your pool. The output you actually want is on the machine itself at /var/log/cloud-init-output.log. SSH in and tail it. I’ve seen silent failures from a missing Key Vault role assignment (VM identity needs Key Vault Secrets User on the vault) and from package mirrors timing out mid-install. Both looked identical from the outside: agent just never showed up.
# After provisioning, give it 4-5 minutes then check:
ssh azureuser@<vm-ip> "sudo tail -100 /var/log/cloud-init-output.log"
# If the agent service didn't start, check systemd too:
ssh azureuser@<vm-ip> "sudo systemctl status vsts.agent.*"
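Rather than guessing at “4-5 minutes”, you can block until cloud-init reports that it has finished. A minimal sketch: wait_for_cloud_init is a hypothetical helper name, and it assumes the image ships a cloud-init recent enough to support status --wait (the Ubuntu 22.04 image used here should).

```shell
# Sketch: block until cloud-init on the VM has finished, then print its status.
# wait_for_cloud_init is a made-up helper. Args: host or IP, optional username.
wait_for_cloud_init() {
  local host="$1" user="${2:-azureuser}"
  # --wait blocks until cloud-init is done; --long adds detail about any error
  ssh "${user}@${host}" "cloud-init status --wait --long"
}

# Usage:
#   wait_for_cloud_init <vm-ip> || echo "cloud-init reported a problem, go read the log"
```

If it reports an error, the log tail above is still where the actual cause will be.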
On the write_files approach for config: only drop non-sensitive defaults there. For anything secret (PAT, service principal credentials, webhook tokens), use the runcmd block to pull from Key Vault at boot time, like the snippet above does. The write_files block is good for dropping agent capability files, custom proxy settings, or a .gitconfig for the agent user so it can authenticate to your repos without extra config later. Everything in the cloud-init file is passed to the VM as custom data and stays readable on the VM’s disk afterwards, so hardcoding a PAT there is the kind of thing that shows up in a security audit six months later.