# How we switched from Redis to Valkey, and nobody cared!

**Date:** 2026-06-22  
**Author:** Kees C. Bakker  
**Categories:** Projects  
**Tags:** How It's Made  
**Original:** https://keestalkstech.com/how-we-switched-from-redis-to-valkey-and-nobody-cared/

![How we switched from Redis to Valkey, and nobody cared!](https://keestalkstech.com/wp-content/uploads/2026/03/Redis-vs-Valkey.jpg)

---

On the platform team, we have a running gag: "we've been running serverless since 2013!". Ever since we started our move from the on-premises data center to the cloud, delivery teams haven't had to worry about their infrastructure. We've built a platform that allows delivery teams to *focus on solving business problems*. They don't have to worry about Kubernetes, network infrastructure, or writing Terraform code to provision data stores. They don't even need to know about AWS! Everything in our stack is easy to provision through our Slack bot.

But having such a platform comes with a peculiar downside: when people don't *have to care* about infrastructure, they usually... don't! *But what if we want to migrate our AWS ElastiCache Redis stores to Valkey?* In this article we'll explain how we've switched 150+ AWS ElastiCache instances from Redis to Valkey.

A big shout-out to [Klaas Talsma](https://www.linkedin.com/in/klaastalsma/) and [Chris Vahl](https://www.linkedin.com/in/chris-v-4b9a62b/) for working together on this story.

Historical note: we started this migration at the beginning of the year, so this article is long overdue.

## Why?

Back in 2024, Redis switched to another license, and that got developers around the world thinking about a fork. Valkey was born in no time! [AWS was heavily involved from the start](https://aws.amazon.com/blogs/opensource/why-aws-supports-valkey/). Honestly, the results were impressive. And they didn't stop there, as Valkey 8.1 was released with [even more power](https://aws.amazon.com/blogs/database/year-one-of-valkey-open-source-innovations-and-elasticache-version-8-1-for-valkey/#:~:text=Innovations%20that%20matter).

When we read it was [20% cheaper to run Valkey](https://aws.amazon.com/blogs/database/get-started-with-amazon-elasticache-for-valkey/#:~:text=On%20ElastiCache%20for%20Valkey%20self%2Ddesigned%20(node%2Dbased)%20clusters%2C%20you%20can%20benefit%20from%20up%20to%2020%25%20lower%20cost%20compared%20to%20other%20engines.) than Redis—with even better performance—the decision was a no-brainer. So, we threw together some slides, gave a Tech Talk, and started working on adding Valkey to the platform.

![](https://keestalkstech.com/wp-content/uploads/2026/03/TechTalks-Valkey-Provisioning-slide-3.jpg)
*And this was only the beginning, as these slides are from early 2025.*

![](https://keestalkstech.com/wp-content/uploads/2026/03/TechTalks-Valkey-Provisioning-slide-4-2.png)
*To sum up: it is faster && cheaper => let's migrate.*

## First, stop the bleeding

Whenever we introduce new technology, we try to work in two steps:

1. **Stop the bleeding**  
   Make sure *newly* provisioned infrastructure uses the new technology by *default*.
2. **Migration**  
   Present a migration path and let teams move at their own pace.

Our delivery teams don't have to use Terraform (they can if they want to); we provide a Slack bot they can use to provision data sources. We swapped out the provisioning of Redis resources for that of Valkey. In the beginning, we even kept the Redis icon, but over time that got swapped as well, as the name caused some confusion.

![](https://keestalkstech.com/wp-content/uploads/2026/03/provision-valkey-dialog.png)
*The Provisioner provides an easy dialog for Valkey provisioning. In an earlier version you could select Redis or Valkey engines.*

Our Provisioner provisions Valkey in AWS ElastiCache according to Tech Hub standards, such as TLS, authentication, and multi-AZ. It generates the Terraform, commits it to the IaC Git repository, creates a PR, and watches it roll out to production, all while keeping the user informed on Slack.
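
To make the "generates the Terraform" step a bit more concrete, here is a minimal sketch of what such a generation step could look like. It is not the Provisioner's actual code: the `ValkeyRequest` shape, the function name, and the example values are hypothetical, and a plain `aws_elasticache_replication_group` resource from the Terraform AWS provider stands in for whatever internal module is really used.

```python
from dataclasses import dataclass


@dataclass
class ValkeyRequest:
    """Hypothetical shape of what a user fills in through the Slack dialog."""
    name: str
    node_type: str
    engine_version: str
    node_count: int = 2  # two nodes, so multi-AZ failover is possible


def render_terraform(req: ValkeyRequest) -> str:
    """Render a Terraform resource block for a Valkey replication group."""
    if req.node_count < 2:
        raise ValueError("multi-AZ needs at least two nodes")
    lines = [
        f'resource "aws_elasticache_replication_group" "{req.name}" {{',
        f'  replication_group_id = "{req.name}"',
        f'  description = "Valkey store for {req.name}"',
        '  engine = "valkey"',
        f'  engine_version = "{req.engine_version}"',
        f'  node_type = "{req.node_type}"',
        f'  num_cache_clusters = {req.node_count}',
        '  automatic_failover_enabled = true',
        '  multi_az_enabled = true',
        '  transit_encryption_enabled = true',
        '  at_rest_encryption_enabled = true',
        '  # Authentication is omitted here; it depends on how secrets are managed.',
        '}',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only; the name, node type, and version are assumptions.
    print(render_terraform(ValkeyRequest("orders-cache", "cache.t4g.small", "8.1")))
```

The point of the sketch is that the standards (TLS, encryption, multi-AZ) are baked into the template, so a user only chooses a name and a size. Committing the result, opening the PR, and reporting back on Slack are separate steps that are not shown.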

## Cue the delivery teams?

Next stop: tell the delivery teams! Again, we made some slides:

![](https://keestalkstech.com/wp-content/uploads/2026/03/TechTalks-Valkey-Provisioning-slide-14.png)
*We made TLS mandatory for our new Valkey setups, which might need an application change.*

The usual suspects moved right away, but the bulk of the teams did not migrate 😱. Their new stores were Valkey, but their existing stores stayed on Redis. So, we talked to the teams to find the main blockers:

- Some older services did not use AUTH or a TLS connection, and the teams did not want to refactor those applications to include both features.
- Some teams still had their service on a *single fascia* environment, so they needed to move to a multi-fascia environment before they could benefit from the new setup.
- Some teams had very busy backlogs, so they couldn't prioritize it.

## Now what?

So, what could we, as a platform team, do? If we wanted to reap the performance and financial benefits, we needed to make an in-place upgrade possible. And let's face it: Valkey is meant as a drop-in replacement.

However, we hit a technical hurdle: our latest infrastructure standards mandate encryption and TLS. Applying those enhancements to existing environments would have made an in-place engine upgrade impossible because of AWS version and engine-name restrictions, so a blue-green migration would have been needed.

To bridge this gap, the team divided and conquered. Part of the team drove the automation, ensuring our Slack bot and Provisioner could handle the new Valkey logic seamlessly. At the same time, others engineered a dedicated upgrade path by reworking an earlier version of our module, one that didn't yet include the newer security enforcements.

By testing this extensively on clusters with at least two instances, we confirmed we could swap the engine under the hood with zero downtime. The result was a process so streamlined that teams could migrate to Valkey by updating just a few lines of code:

![](https://keestalkstech.com/wp-content/uploads/2026/03/image-3.png)
*Only 4 lines need to be changed.*
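
The actual change was made in Terraform, as the screenshot shows. For readers who think in API calls, the same in-place engine swap can be sketched as a request to the ElastiCache `ModifyReplicationGroup` operation. The sketch below is an assumption-laden illustration: the `ReplicationGroup` dataclass, the function name, and the example values are hypothetical, and the two-node guard simply encodes the testing condition mentioned above.

```python
from dataclasses import dataclass


@dataclass
class ReplicationGroup:
    """Hypothetical summary of one ElastiCache replication group."""
    group_id: str
    engine: str
    node_count: int


def plan_engine_swap(group: ReplicationGroup, target_version: str) -> dict:
    """Build the request for an in-place Redis-to-Valkey engine change."""
    if group.engine != "redis":
        raise ValueError(f"{group.group_id} runs {group.engine}, nothing to migrate")
    if group.node_count < 2:
        # Assumption taken from the text: zero downtime was only confirmed
        # on clusters with at least two instances.
        raise ValueError(f"{group.group_id} has a single node, expect downtime")
    return {
        "ReplicationGroupId": group.group_id,
        "Engine": "valkey",
        "EngineVersion": target_version,
        "ApplyImmediately": True,
    }


if __name__ == "__main__":
    # Illustrative values only; the group name and target version are assumptions.
    request = plan_engine_swap(ReplicationGroup("orders-cache", "redis", 2), "8.1")
    print(request)
    # With boto3 installed and AWS credentials configured, this could be sent as:
    #   boto3.client("elasticache").modify_replication_group(**request)
```

Keeping the planning step separate from the API call makes it easy to dry-run the whole fleet first and see which clusters would be refused before touching anything.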

## A New Year's Resolution

The next step was to talk to [Koen Roumen, Head of Technology](https://www.linkedin.com/in/kmroumen/) for the Store teams. We explained that there was no reason not to move, and we came up with a plan: *either* the teams move *before* a certain date, or *we would make the move for them.* Maybe it was the optimism that comes with starting a new year, or maybe the stars aligned, but we all agreed on a pretty tight timeline.

![](https://keestalkstech.com/wp-content/uploads/2026/03/Shop-Valkey-Migration-slide-5.png)
*We wanted to move within the month.*

Again, some teams woke up and moved, but most teams were like: *if you're sure there is no impact, go for it!* And so we did. On the morning of the 27th, we migrated the remaining 47 Redis instances for 6 delivery teams across 3 AWS development environments to Valkey. It went incredibly smoothly. Yes, it took AWS a while to provision the stores (no idea why), but by 9 o'clock, we were done. Our delivery teams did not report any problems with the result.

Production was scheduled for a week later. As dev went so smoothly, we dropped our requirement to send a backend engineer to the call, and we decided not to begin at 08:00, but at 09:00. Again: one smooth transition.

## Right sizing

Now that the migration to Valkey is behind us, we’ve shifted our focus from "making it work" to "making it right". Because our in-place migration kept the exact same instance types that were previously running Redis, much of our fleet is now likely over-provisioned. Valkey’s improved performance and smaller footprint mean that the "safety margins" teams originally selected for Redis may now be unnecessary overhead.

To address this, we’ve started a deep-dive analysis, using LLMs and the AWS CLI to parse two weeks of CloudWatch metrics collected since the migration to Valkey. This data-driven approach has already identified significant "ghost capacity" across our fleet. By identifying these oversized clusters now, we can rightsize our infrastructure before committing to new long-term reservations. It’s a pragmatic approach to cloud spending: optimize the baseline first, so we only pay for the performance we actually use.
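
The core of such an analysis can be sketched in a few lines. The example below assumes the metrics have already been exported per cluster, for example from ElastiCache CloudWatch metrics such as `DatabaseMemoryUsagePercentage` and `EngineCPUUtilization`. The `ClusterUsage` shape and the 40% and 30% thresholds are hypothetical and only there for illustration; they are not the values used in the real analysis.

```python
from dataclasses import dataclass


@dataclass
class ClusterUsage:
    """Hypothetical export of two weeks of datapoints for one cluster."""
    name: str
    node_type: str
    memory_pct: list[float]
    cpu_pct: list[float]


def find_oversized(clusters, max_memory_pct=40.0, max_cpu_pct=30.0):
    """Return names of clusters whose peak usage stays under both thresholds."""
    oversized = []
    for c in clusters:
        if not c.memory_pct or not c.cpu_pct:
            continue  # no data, no verdict
        if max(c.memory_pct) < max_memory_pct and max(c.cpu_pct) < max_cpu_pct:
            oversized.append(c.name)
    return oversized


if __name__ == "__main__":
    # Made-up numbers, purely to show the call.
    fleet = [
        ClusterUsage("orders-cache", "cache.r7g.large", [10.0, 22.5], [5.0, 12.0]),
        ClusterUsage("search-cache", "cache.r7g.large", [35.0, 81.0], [20.0, 25.0]),
    ]
    print(find_oversized(fleet))
```

Looking at the peak rather than the average is a deliberately conservative choice here: a cluster is only flagged when even its busiest moment left plenty of headroom.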

In the end, that's the beauty of this setup: 155 instances migrated, zero downtime, and zero complaints. We've switched the engine, saved the money, and—just as we intended—nobody even noticed.

## So what did we learn?

A lot!

1. Valkey is a great replacement for Redis.
2. Our delivery teams are very busy, and that's great!
3. That should not stop us as a platform team from moving forward.
4. With infrastructure as code (Terraform) and AWS, we, as the platform team, can still help teams move forward.
