Warner Music Migrates Its Cloud Foundry Deployment from OpenStack to AWS
WMG’s Adam Chesterton and Altoros’s Renat Khasanshyn discussed the move during a session at the Cloud Foundry Summit 2016 in Santa Clara, CA.
Issues faced with a hybrid cloud
Warner Music Group (WMG) started using Cloud Foundry in 2011, and they’re “still one of the largest organizations using the community version for production workloads,” according to Adam Chesterton. Apps were deployed with BOSH onto OpenStack, and WMG began its journey with multiple cloud providers, some public and some private.
“Then something happened,” Adam said of moving workloads to the OpenStack environment. “We managed to crack the way to rapidly build, test, and deploy code into Cloud Foundry, but we started to experience some issues.”
Renat outlined several of the issues that arose:
- Environments were not as reliable as hoped.
- The OpenStack API behaved inconsistently when called from BOSH. “We would request 10 virtual machines, for example, but sometimes get two or none at all.”
- Random, inconsistent timeouts were occurring: HTTP calls were made and acknowledged, but nothing was returned.
- The team couldn’t get access to hardware logs (the managed OpenStack provider could not provide them), which led to prolonged troubleshooting and discovery. It was unclear where the problems lay, and the team needed full end-to-end visibility.
- Renat also noted that Cloud Foundry allows users to set up identical environments at the application level. “But the environments were not the same,” he said. “As we moved code from development to production, we found that we weren’t testing the same thing.”
“The key problem that we had was that we could not get consistent results.”
—Renat Khasanshyn, CEO, Altoros
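The provisioning inconsistency described in the list above (asking for 10 virtual machines and receiving two or none) can be illustrated with a minimal Python sketch. It is not taken from the session and does not use the BOSH or OpenStack APIs: `request_vms`, `provision_with_check`, and `make_flaky_provider` are hypothetical names, and the flaky provider is a stand-in that simply grants fewer machines than requested. The point is only that the caller compares what it received against what it asked for instead of trusting the acknowledgement.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ProvisionResult:
    requested: int
    received: int
    attempts: int

    @property
    def complete(self) -> bool:
        return self.received >= self.requested


def provision_with_check(
    request_vms: Callable[[int], List[str]],  # hypothetical provider call
    count: int,
    max_attempts: int = 3,
) -> ProvisionResult:
    """Request `count` VMs and re-request the shortfall up to `max_attempts` times."""
    vm_ids: List[str] = []
    attempts = 0
    while len(vm_ids) < count and attempts < max_attempts:
        attempts += 1
        missing = count - len(vm_ids)
        # The provider may acknowledge the call yet return fewer VMs than asked.
        vm_ids.extend(request_vms(missing)[:missing])
    return ProvisionResult(requested=count, received=len(vm_ids), attempts=attempts)


def make_flaky_provider(grants: Iterable[int]) -> Callable[[int], List[str]]:
    """Hypothetical stand-in for an unreliable IaaS API.

    Each call grants at most the next number in `grants` (0 once exhausted).
    """
    remaining = iter(grants)
    issued = 0

    def request_vms(n: int) -> List[str]:
        nonlocal issued
        granted = min(n, next(remaining, 0))
        ids = [f"vm-{issued + i}" for i in range(granted)]
        issued += granted
        return ids

    return request_vms


if __name__ == "__main__":
    result = provision_with_check(make_flaky_provider([2, 0, 8]), count=10)
    print(result, result.complete)
```

A check like this only makes the shortfall visible and bounded. It does not explain why the provider under-delivered, which is the visibility gap the team describes.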
“So, we needed to go one way or another,” Adam added. “We did not have a compelling reason to stick with our model.” In contrast, referring to recent improvements in the AWS offering, Adam noted that security awareness was growing, pricing models had improved since WMG implemented OpenStack, and AWS reliability had also gotten better.
Hit the reset button
The team started with a comparison of public vs. private cloud options. The business issue for WMG became, “do you become over-cautious about deliverables, which goes against what Cloud Foundry can promise? Or do you take a risk (with the potential to create) a sticky situation with your business holders?”
Adam and the team decided to take the risk, confident that the Cloud Foundry platform could deliver for them on AWS. They took several next steps to make the migration work:
- Developed a single, consistent strategy by working with both internal and AWS architects.
- Focused on automation. “We built out a foundation, then used BOSH on top of it to deploy the apps and to do so across all of our environments.”
- Got recommendations from security partners. “We found there were many options in this space. This was a really big factor.”
- Migrated from an EC2 dev environment, building everything from scratch.
- Developed an upgrade path for eventual migration to Cloud Foundry Diego.
- Built resiliency for a multi-zone model.
By then committing fully to AWS and being judicious in its use of reserved and spot instances to control costs, WMG was soon able to keep its promises to its line-of-business users and strengthen its SLAs, Adam said. “We did a pause and a reset. It was becoming hard to predict the unknown, and the hybrid environment was impacting us setting our deadlines for business.”
With the migration, “being able to promise lines of business when they can get their apps into the environment, and what kind of service levels they can expect, is a giant advantage when operating with multiple consumers,” Renat said. “As a result, the price we were willing to pay to get that expectation of availability was much higher than we were originally willing to pay.”
In the end, “We accomplished what we wanted to accomplish,” he said. “We removed random unknowns, so we can be more accurate in setting business expectations.”
Lessons learned
Nevertheless, the team found during the migration that not all of its assumptions held. For instance, “there are actual physical capacity limits on compute resources that you can use without having to talk to people. With a private cloud, the elasticity you have paid for is guaranteed. However, time to provision and terms for new capacity can be significantly longer than in a public cloud.”
“Elasticity is not guaranteed in a public cloud.” —Renat Khasanshyn, CEO, Altoros
For their Cloud Foundry Summit session, Adam and Renat put together a data sheet covering the major issues that arose on the way to running Cloud Foundry on the AWS public cloud.
The sheet provides an overview of five successful processes (or patterns) and their five sibling anti-patterns. In addition, it contains a list of top 10 tips drawn from specific problems the team faced and solved.
Visit this page to get the document.
Related reading
- Warner Music Builds a Software Factory with Cloud Foundry
- Digital Transformation Using Cloud Foundry—What Works and What Doesn’t
- Patterns and Anti-Patterns when Operating Cloud Foundry in Public and Private Clouds