Most cloud migration content is written by people selling cloud migrations. This one isn’t. We’ve moved enough production workloads from co-located racks and VMware clusters into AWS, Azure, and GCP to know exactly where the marketing-page math falls apart — and where the cloud genuinely earns its keep.
If you’re a CTO or engineering lead staring at a migration plan that promises “30% cost savings in 6 months,” read this first. The savings can be real. The timeline almost never is.
The 6 Rs framework — and when each is actually right
The 6 Rs framework (AWS’s expansion of Gartner’s original 5 Rs) is the only mental model you need for triage. The problem isn’t the framework; it’s that teams pick a strategy before they finish the inventory. Don’t do that.
Rehost (lift and shift). Move the VM as-is. EC2, Azure VM, GCE. You get zero cloud-native benefits and roughly the same operational burden, but you exit the data center fast. Use this when your lease is up in 90 days, when the app is legacy enough that nobody dares touch it, or when leadership needs a quick win to fund the harder work. Expect a 10-15% cost increase in year one before optimization.
Replatform (lift, tinker, shift). Same architecture, but you swap self-managed pieces for managed equivalents. Move MySQL from an EC2 instance to RDS. Move Redis to ElastiCache. This is the sweet spot for most mid-size apps: you cut operational burden without rewriting code. The bill goes up slightly, but you get back engineer-hours.
Refactor (re-architect). Break the monolith into services, adopt containers or serverless, redesign the data layer. High effort, high upside, high risk. Only do this when the app is strategic and the current architecture is provably the bottleneck. Read our take on the monolith vs microservices decision before you commit — most teams refactor too early.
Repurchase (drop and shop). Your homegrown CRM gets replaced by HubSpot. Your internal ticketing system becomes Linear. This is almost always the right call for non-differentiating software, but politically the hardest because someone built the original.
Retire. Audit your VMs. We routinely find 15-30% running zero traffic, billed monthly, owned by an employee who left in 2021. Kill them before you migrate them.
Retain. Some workloads stay on-prem. Mainframes with COBOL nobody understands. Systems under data residency law. Latency-sensitive industrial control. Hybrid is fine — it’s not failure.
The right answer is almost always a mix. For everything you don’t retain on-prem, a typical split is rehost 40%, replatform 35%, refactor 10%, retire 10%, repurchase 5%. Beware any consultant whose slide deck refactors everything.
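To make the triage concrete, here is a minimal sketch of a first-pass classifier. The `Workload` fields, the rule order, and the sample inventory are all hypothetical; the rules only restate the criteria above, and a real inventory needs human review on top.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Workload:
    # All fields are illustrative; fill them in from your own inventory.
    name: str
    monthly_requests: int = 0
    must_stay_on_prem: bool = False        # residency law, mainframe, hard latency limits
    differentiating: bool = True           # False for commodity software (CRM, ticketing)
    strategic: bool = False
    architecture_is_bottleneck: bool = False
    self_managed_services: tuple = ()      # e.g. ("mysql", "redis")


def classify(w: Workload) -> str:
    """Return one of the 6 Rs, checking the cheapest exits first."""
    if w.monthly_requests == 0:
        return "retire"
    if w.must_stay_on_prem:
        return "retain"
    if not w.differentiating:
        return "repurchase"
    if w.strategic and w.architecture_is_bottleneck:
        return "refactor"
    if w.self_managed_services:
        return "replatform"
    return "rehost"


inventory = [
    Workload("old-staging-vm"),
    Workload("plant-scada", monthly_requests=9_000, must_stay_on_prem=True),
    Workload("homegrown-crm", monthly_requests=40_000, differentiating=False),
    Workload("checkout-api", monthly_requests=5_000_000, strategic=True,
             architecture_is_bottleneck=True),
    Workload("orders-app", monthly_requests=800_000,
             self_managed_services=("mysql", "redis")),
    Workload("legacy-reports", monthly_requests=1_200),
]

plan = {w.name: classify(w) for w in inventory}
mix = Counter(plan.values())
```

Retire is checked first on purpose: a zero-traffic VM should be killed before anyone debates its architecture. A zero-traffic signal can also hide a quarterly batch job, so confirm with an owner before deleting anything.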
Cost reality: what the cloud actually costs
The provider pricing pages are technically accurate and practically useless. Here’s the math that actually shows up on your invoice.
Compute. An on-demand m6i.large in us-east-1 is roughly $70/month. With a 3-year reserved instance and all-upfront payment, you’ll pay around $30/month, but you’ve just signed a 3-year commitment that erases half the “elasticity” you migrated for. Spot instances typically run 60-70% below the on-demand price if your workload tolerates interruption.
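The commitment trade-off is easier to judge with the arithmetic written out. This is a rough sketch using the monthly figures quoted above; it assumes the spot discount is measured against on-demand and ignores price changes over the term.

```python
# Figures from the text above; treat them as rough, region-specific estimates.
ON_DEMAND_MONTHLY = 70.0
RESERVED_MONTHLY = 30.0          # 3-year all-upfront, amortized per month
TERM_MONTHS = 36
SPOT_DISCOUNTS = (0.60, 0.70)    # assumed to be relative to on-demand


def reserved_upfront(monthly=RESERVED_MONTHLY, months=TERM_MONTHS):
    """Cash paid on day one for the full reserved term."""
    return monthly * months


def break_even_months(upfront, on_demand_monthly=ON_DEMAND_MONTHLY):
    """Months of on-demand usage that cost as much as the upfront payment."""
    return upfront / on_demand_monthly


def spot_monthly_range(on_demand_monthly=ON_DEMAND_MONTHLY, discounts=SPOT_DISCOUNTS):
    """(cheapest, priciest) monthly spot cost for the given discount range."""
    low_discount, high_discount = discounts
    return (round(on_demand_monthly * (1 - high_discount), 2),
            round(on_demand_monthly * (1 - low_discount), 2))


upfront = reserved_upfront()
months = break_even_months(upfront)
cheapest, priciest = spot_monthly_range()
```

With these inputs the upfront payment is $1,080 and break-even lands a little past 15 months. If you’re not sure the instance will still exist in a year and a half, the reservation is a bet, not a saving.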
Storage. AWS S3 Standard is $0.023/GB/month. Sounds free until you hit a few hundred TB. S3 Glacier Deep Archive is $0.00099/GB/month — 23x cheaper, but retrievals take hours and cost extra per GB. Tier ruthlessly.
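A quick sketch of what tiering is worth, using the two per-GB prices above. It assumes 1 TB = 1,000 GB and deliberately leaves out retrieval, request, and minimum-duration charges, which matter for anything you read back often.

```python
S3_STANDARD_PER_GB = 0.023      # $/GB/month, from the text
DEEP_ARCHIVE_PER_GB = 0.00099   # $/GB/month, from the text
GB_PER_TB = 1000                # simplifying assumption


def monthly_storage_cost(total_tb, archived_fraction=0.0):
    """Monthly storage bill when a fraction of the data sits in Deep Archive."""
    if not 0.0 <= archived_fraction <= 1.0:
        raise ValueError("archived_fraction must be between 0 and 1")
    total_gb = total_tb * GB_PER_TB
    hot = total_gb * (1 - archived_fraction) * S3_STANDARD_PER_GB
    cold = total_gb * archived_fraction * DEEP_ARCHIVE_PER_GB
    return round(hot + cold, 2)


all_standard = monthly_storage_cost(300)          # everything in S3 Standard
mostly_cold = monthly_storage_cost(300, 0.8)      # 80% moved to Deep Archive
```

For a hypothetical 300 TB, that is roughly $6,900/month untiered versus about $1,618/month with 80% archived.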
Databases. This is where bills explode. RDS Multi-AZ doubles your instance cost for the standby replica. A db.r6g.xlarge Multi-AZ Postgres with 500GB gp3 and backups runs ~$650/month before any read replicas, IOPS provisioning, or Performance Insights. Aurora is often cheaper at scale but more expensive at small scale — model both. See our database optimization guide for what to fix before you migrate, not after.
The silent killer: network egress. Outbound data to the internet from AWS costs $0.09/GB after the first 100GB free. Move 50TB/month and you’re paying $4,500/month just to talk to your own users. Cross-AZ traffic is $0.01/GB each way — innocuous until your chatty microservices are flinging 10TB/day across availability zones. Inter-region replication adds another layer. Architect with egress in mind from day one, or bring a CDN (CloudFront, Cloudflare) into the design before launch.
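The egress numbers above are worth running against your own traffic estimates. A minimal sketch, assuming a flat $0.09/GB internet rate (real pricing steps down slightly at higher volumes), 1 TB = 1,000 GB, and a 30-day month:

```python
INTERNET_EGRESS_PER_GB = 0.09    # $/GB, from the text; flat-rate simplification
FREE_EGRESS_GB = 100             # monthly free allowance, from the text
CROSS_AZ_PER_GB_EACH_WAY = 0.01  # $/GB, charged in each direction
GB_PER_TB = 1000                 # simplifying assumption


def internet_egress_monthly(tb_per_month):
    """Monthly cost of data sent from the cloud to the internet."""
    billable_gb = max(tb_per_month * GB_PER_TB - FREE_EGRESS_GB, 0)
    return round(billable_gb * INTERNET_EGRESS_PER_GB, 2)


def cross_az_monthly(tb_per_day, days=30):
    """Monthly cost of traffic crossing availability zones (billed both ways)."""
    gb = tb_per_day * GB_PER_TB * days
    return round(gb * CROSS_AZ_PER_GB_EACH_WAY * 2, 2)


to_users = internet_egress_monthly(50)   # the 50TB/month example
chatty = cross_az_monthly(10)            # the 10TB/day microservices example
```

The 50TB case comes out at about $4,491 once the free 100GB is subtracted. The 10TB/day cross-AZ case is about $6,000/month, which is more than the internet egress bill and buys you nothing your users can see.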
The invisible costs. NAT Gateway: $0.045/hr + $0.045/GB processed. Load balancers: ~$20/month each, and you’ll have more than you expect. CloudWatch logs at scale will surprise you — we’ve seen $15K/month bills from one team that left DEBUG logging on in production.
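NAT Gateway pricing looks harmless until you multiply it out. A small sketch using the rates above, assuming 730 hours in a month and hypothetical traffic volumes:

```python
NAT_HOURLY = 0.045        # $/hour per gateway, from the text
NAT_PER_GB = 0.045        # $/GB processed, from the text
HOURS_PER_MONTH = 730     # assumption: average month


def nat_gateway_monthly(gb_processed, gateways=1):
    """Monthly NAT Gateway cost: fixed hourly charge plus per-GB processing."""
    fixed = gateways * NAT_HOURLY * HOURS_PER_MONTH
    variable = gb_processed * NAT_PER_GB
    return round(fixed + variable, 2)


idle = nat_gateway_monthly(0)                       # one gateway, no traffic
busy = nat_gateway_monthly(20_000, gateways=3)      # 20 TB through three gateways
```

An idle gateway is about $33/month. Three gateways (one per AZ) pushing 20 TB between them come to roughly $1,000/month, and most of that is the per-GB processing charge.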
A rule of thumb: take the cloud provider’s TCO calculator output and multiply by 1.4. That’s your real year-one bill.
Picking the right pilot
Don’t pilot the auth service. Don’t pilot the billing pipeline. Pick a workload with three properties:
- Real traffic — synthetic load proves nothing.
- Bounded blast radius — if it breaks for two hours, the company doesn’t.
- Honest dependencies — you’ll uncover the integration surprises here, where they’re cheap.
Internal admin tools, analytics dashboards, marketing site infra, and batch reporting jobs are usually ideal first migrations. A good pilot takes 6-10 weeks and produces a runbook, a cost model, and a list of every assumption that turned out to be wrong. That list is more valuable than the migration itself.
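If you have more candidates than opinions, the three properties above can be turned into a crude filter. This is only a sketch: the `Candidate` fields and the two-hour threshold are illustrative stand-ins for judgment calls you still have to make.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical fields; populate them from your own service inventory.
    name: str
    has_real_traffic: bool
    tolerable_outage_hours: float   # how long it can be down without real harm
    dependency_count: int           # known integrations with other systems


def is_good_pilot(c: Candidate, min_outage_hours: float = 2.0) -> bool:
    """Real traffic, bounded blast radius, and at least one real dependency."""
    return (c.has_real_traffic
            and c.tolerable_outage_hours >= min_outage_hours
            and c.dependency_count > 0)


candidates = [
    Candidate("auth-service", True, 0.0, 12),
    Candidate("analytics-dashboard", True, 8.0, 4),
    Candidate("demo-sandbox", False, 48.0, 0),
]

shortlist = [c.name for c in candidates if is_good_pilot(c)]
```

Here only the analytics dashboard survives: auth can’t tolerate the outage, and the sandbox has no traffic or dependencies to learn from.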
DevOps and CI/CD: the work nobody scopes
Your existing Jenkins-on-a-VM pipeline that does scp to a production box does not survive contact with the cloud. Migration forces a CI/CD modernization whether you planned for it or not.
You’ll need infrastructure-as-code (Terraform, Pulumi, or CDK — pick one and commit), a container registry, secrets management (don’t put secrets in environment variables in your IaC repo), centralized logging, and an alerting strategy. Plan a parallel workstream just for this. Our DevOps best practices covers what good looks like — the short version: if a deploy takes more than 15 minutes from merge to production, you have technical debt that will compound the moment you’re in the cloud.
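The 15-minute rule is easy to check if your pipeline records when a change merged and when it went live. A minimal sketch with made-up timestamps; where those two times come from (CI API, deploy log) depends on your tooling.

```python
from datetime import datetime, timedelta

DEPLOY_BUDGET = timedelta(minutes=15)   # threshold from the text


def slow_deploys(deploys, budget=DEPLOY_BUDGET):
    """Return (name, lead_time) for every deploy that exceeded the budget.

    `deploys` is an iterable of (name, merged_at, live_at) tuples.
    """
    return [(name, live_at - merged_at)
            for name, merged_at, live_at in deploys
            if live_at - merged_at > budget]


# Hypothetical sample data.
history = [
    ("api", datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 9)),
    ("worker", datetime(2024, 1, 8, 11, 0), datetime(2024, 1, 8, 11, 42)),
]

offenders = slow_deploys(history)
```

Run it over a month of deploys before the migration starts. The services on that list are the ones whose pipelines need work in the parallel workstream.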
Also: branching environments. The cloud makes ephemeral preview environments per PR genuinely cheap and easy. If your team isn’t using them by month 6 post-migration, you’re leaving the best part of cloud on the table.
The shared responsibility model — and why your SOC 2 audit will fail otherwise
AWS, Azure, and GCP all publish a shared responsibility model. Read it. The provider secures the cloud; you secure what runs in the cloud. Misreading this is the single most common source of cloud security incidents.
The provider handles physical security, hypervisor patching, and the network backbone. You handle: IAM policies, security group rules, OS patching on your VMs, application vulnerabilities, encryption key management, S3 bucket permissions (still the #1 cause of public data leaks), and audit logging configuration. A default-deny posture, least-privilege IAM, and CloudTrail/Azure Activity Log enabled from day one are non-negotiable. For application-layer concerns, our guide on secure web applications goes deeper.
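Least-privilege IAM is easier to enforce when something checks for the obvious violations. The sketch below scans an IAM policy document (the standard JSON shape, loaded as a dict) for Allow statements that pair a wildcard action with a wildcard resource. The sample policy and bucket name are made up, and this is a first-pass lint, not a substitute for a proper policy analyzer.

```python
def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def overly_broad_statements(policy: dict) -> list:
    """Return identifiers of Allow statements with wildcard action AND resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):      # a lone statement may not be wrapped in a list
        statements = [statements]

    findings = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt.get("Sid", f"statement[{index}]"))
    return findings


# Hypothetical policy for illustration.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AdminEverything", "Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Sid": "ReadReports", "Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-reports/*"},
    ],
}

findings = overly_broad_statements(sample_policy)
```

A check like this fits naturally in CI for the IaC repo, so a wildcard policy fails the build instead of surfacing in the audit.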
External reference worth bookmarking: the AWS Well-Architected Framework — provider-specific but the security and reliability pillars apply universally.
A realistic timeline
Vendors will quote you 4 months. Plan for 8-14.
For a 50-person SaaS with ~30 services, a primary Postgres database, a handful of background workers, and the usual collection of internal tools, here’s what we typically see end-to-end:
- Months 1-2: Inventory, dependency mapping, TCO modeling, 6 Rs classification per workload.
- Months 2-4: Pilot migration, IaC foundation, networking design (VPC, subnets, peering, transit gateway if multi-account), identity federation.
- Months 4-9: Wave-based migration of stateless services first, then stateful. Database cutover is its own month, minimum.
- Months 9-12: Decommissioning on-prem, right-sizing, reserved instance purchases, cost optimization.
- Months 12-14: The “we forgot about that” tail — that one service nobody mentioned, the SFTP integration with a partner, the cron job on someone’s laptop.
Skip phases at your peril. The teams that hit 6-month timelines either had a trivially simple stack to begin with, or they’re still paying down the shortcuts two years later.
What success actually looks like
A year after the migration, the metrics that matter aren’t “we’re in the cloud.” They’re: deploy frequency up 3-5x, mean time to recovery cut in half, new environment provisioning down from weeks to minutes, and infrastructure spend that’s predictable — not necessarily lower in raw dollars, but tied to revenue in a way it never was before. If you don’t get those, you rehosted and stopped.
Need migration support?
We’ve taken teams from racks to cloud, from cloud back to hybrid, and from one provider to another when the bill stopped making sense. If you’re scoping a migration, modeling TCO, or trying to figure out which workload to move first, get in touch. We’ll tell you what we’d actually do — including when the answer is “don’t migrate yet.”