The Client is a Fortune 1000 networking technology company with cutting-edge products for high-performance networks, and an equipment supplier for Internet and Telecom Service Providers. The company has grown significantly in the past 10 years, has been championing SDN and NFV implementations, and is aiming to be a key player in this space.
In addition to building its own private cloud solution on top of OpenStack, the company is in the process of migrating enterprise applications and IT workloads to AWS. As part of its data center consolidation, the client is implementing a hybrid cloud solution with an N+1 redundancy architecture.
Current System overview:
The Data Centers have 1000+ VMs on a VMware server farm that runs .NET and Java applications in Windows and Linux environments. About 160 of the productivity applications must be migrated to AWS after due diligence on capacity and IOPS requirements. In the first phase, about 50 Java applications will need to be migrated to AWS within 9 months. By Dec 2019, all the remaining applications will be migrated and one of the DCs will be shut down.
The SAP and in-house engineering applications will migrate to a private cloud built on OpenStack. An orchestration layer has been built to provision VMs as part of the private cloud; this is separate from the AWS environment for the office productivity applications.
Key Business/Systems requirements:
“Consolidate the 3 DCs into two DCs for Private Cloud infrastructure and migrate all the productivity applications to AWS.”
The Client has enumerated the following requirements for mandatory compliance while designing an approach to migration:
- Ensure migration of 160 productivity applications and VMs to AWS in less than 15 months.
- Automate the process of creating infrastructure using Ansible scripts and preserve the security settings in their entirety.
- Develop a self-service portal to streamline the provisioning of resources for the Hybrid Cloud deployment, reducing lead time from 2 days to a few hours.
- Monitor the applications for uptime and performance in both the private and public cloud environments.
The primary objective was to migrate Java applications from physical data centers to optimize TCO and resource utilization, so that the infrastructure could scale as the business grew. The on-premise applications were robust and had to perform in exactly the same fashion in the public cloud. Some of the key technical considerations related to end-to-end automation of the different environments: Development/UAT and Production. Continuous 24x7 monitoring of applications with a centralized logging system had to be implemented.
After a detailed assessment of the efficacy of modernization and its impact on the migration timelines and investments, the client decided to adopt a “lift and shift” approach. Given the diversity of the tech stack, the time pressures and the complexity of COTS licensing, the decision to use an IaaS model looked prudent and manageable.
The client already had a deep understanding of AWS, but needed Newt Global, as a neutral consultant, to validate the manageability aspects and sizing assumptions. Key workloads were migrated after the appropriate infrastructure was provisioned through automation scripts.
Choice of Migration approach
The proposed approach was to use a DevOps-led migration strategy so that the end-to-end automation of creating different environments is streamlined as part of the migration process. Some of the salient points of the proposed architecture (ref. diagram) are:
- Client-specific global AWS AMI for SQL
- End-to-end pipeline using CloudFormation, Ansible and shell scripting
- AMI backup/retention, DB backup/retention, and app-related files to S3
- Software and app configuration backup in AWS snapshots
- AWS cleanup of unused volumes, AMIs, instances and snapshots to avoid cost
- Inventory Management
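The cleanup and retention items in the list above can be illustrated with a minimal sketch. It is not the project's actual script: the function names, the 14-day and 30-day thresholds, and the sample data are assumptions made for illustration. The dictionaries only mimic the shape of the records returned by the EC2 `describe_volumes` and `describe_snapshots` calls (`VolumeId`, `State`, `Attachments`, `CreateTime`, `SnapshotId`, `StartTime`), so the selection logic can be run without an AWS account.

```python
from datetime import datetime, timedelta, timezone


def find_unused_volumes(volumes, now, min_age_days=14):
    """Return IDs of volumes that are unattached and older than min_age_days.

    The 14-day grace period is an assumed value, not a project setting.
    """
    cutoff = now - timedelta(days=min_age_days)
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "available"        # "available" means not attached
        and not v.get("Attachments")
        and v["CreateTime"] < cutoff
    ]


def find_expired_snapshots(snapshots, now, retention_days=30):
    """Return IDs of snapshots older than the (assumed) retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical sample data shaped like EC2 API records.
NOW = datetime(2018, 3, 1, tzinfo=timezone.utc)
sample_volumes = [
    {"VolumeId": "vol-aaa", "State": "available", "Attachments": [],
     "CreateTime": datetime(2018, 1, 1, tzinfo=timezone.utc)},
    {"VolumeId": "vol-bbb", "State": "in-use",
     "Attachments": [{"InstanceId": "i-123"}],
     "CreateTime": datetime(2017, 11, 1, tzinfo=timezone.utc)},
    {"VolumeId": "vol-ccc", "State": "available", "Attachments": [],
     "CreateTime": datetime(2018, 2, 25, tzinfo=timezone.utc)},
]
sample_snapshots = [
    {"SnapshotId": "snap-old",
     "StartTime": datetime(2018, 1, 15, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-new",
     "StartTime": datetime(2018, 2, 20, tzinfo=timezone.utc)},
]

unused = find_unused_volumes(sample_volumes, NOW)
expired = find_expired_snapshots(sample_snapshots, NOW)
```

Keeping the selection logic separate from the API calls means the candidate list can be reviewed, or written to the inventory, before anything is actually deleted.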
The project started in the middle of 2017 and is expected to be completed by 2019, when the DC will be retired in favor of AWS infrastructure with production servers. All 50 applications are running on AWS infrastructure and the migration process has been smooth.
As of now, all 400+ VMs running the 50 applications have moved to the AWS environment, with most of the solution considerations having been met. A cost optimization exercise is under way, following performance tuning over the past 5 months.
The start date was October 1st, 2017 and the projected end date for phase 1 is July 31st, 2018. This includes the migration and setting up a NOC in India for monitoring and supporting the applications globally.
Apart from the business and IT stakeholders of the company, who played an active role in the planning phase, the migration tasks were completed by a team of 3 people led by a seasoned architect. All the members of the team were AWS certified at the Professional level and had, on average, 7 years of Data Center, Virtualization and Cloud experience.
Use of DevOps Tools
Jira, Confluence, Artifactory, and Jenkins were used extensively during the migration.
Validation – Project closure (Lessons learned)
The project team worked closely with the enterprise IT team at the client's premises and resolved some of the issues that impacted delivery. The client is a large enterprise and the applications are business critical.
The team faced certain challenges due to:
- Certification issues
- Roll-back due to version conflicts on COTS
- Software installation/configuration issues
- SSO issues
The current phase of implementation was completed in March 2018, but there is further work being done to optimize the deployment.
Operation and Optimization
Some of the key areas for focus in the operations phase are:
- Reserved Instance usage
- Horizontal scaling based on OS resources
- AWS health-checkup to clean-up instances, AMIs, volumes, snapshots
- Schedule stop/start instances based on triggers
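The scheduled stop/start item above could look roughly like the following sketch. It is a hedged illustration, not the team's implementation: the `office_hours` field, the weekday-only rule, and the instance records are all hypothetical. The function only decides which instances to start or stop; in practice the resulting IDs would be passed to the EC2 start and stop calls from a scheduled job.

```python
from datetime import datetime


def should_be_running(now, start_hour, stop_hour):
    """True on weekdays (Mon-Fri) when start_hour <= current hour < stop_hour."""
    return now.weekday() < 5 and start_hour <= now.hour < stop_hour


def plan_actions(instances, now):
    """Return (ids_to_start, ids_to_stop) for the given point in time.

    Each instance is a dict with hypothetical keys:
      "id"           - instance identifier
      "state"        - "running" or "stopped"
      "office_hours" - (start_hour, stop_hour) in local time
    """
    to_start, to_stop = [], []
    for inst in instances:
        start_hour, stop_hour = inst["office_hours"]
        wanted = should_be_running(now, start_hour, stop_hour)
        if wanted and inst["state"] == "stopped":
            to_start.append(inst["id"])
        elif not wanted and inst["state"] == "running":
            to_stop.append(inst["id"])
    return to_start, to_stop


# Hypothetical Development/UAT instances that only need to run 08:00-20:00.
dev_instances = [
    {"id": "i-dev-1", "state": "running", "office_hours": (8, 20)},
    {"id": "i-uat-1", "state": "stopped", "office_hours": (8, 20)},
]

# Monday evening: the running dev instance becomes a stop candidate.
evening_plan = plan_actions(dev_instances, datetime(2018, 3, 5, 22, 0))
```

Production instances would simply be left out of the list passed to `plan_actions`, so only non-production environments are affected by the schedule.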
The production deployment would consider a DR implementation on AWS, as the current setup is configured with AWS acting as DR for the DC.
The typical deployment architecture is given below:
AWS Tools used
The project team had agreed at the planning phase to use a set of third-party tools and the AWS services to complete the migration. They are:
- VPC, EC2, S3, CloudWatch, Autoscaling, ELB, RDS, CloudTrail and IAM.
- Nagios is being used for monitoring the infrastructure and New Relic was configured for monitoring applications.
Security considerations and implementation
Security was a focus throughout the migration. The enterprise has clear guidelines in this area, and the project team implemented them.
The policy guidelines on security were centered around:
- Application access from browser
- Data archive access and auto-scaling policies
- AWS Direct Connect: network access
- LDAP Directory
- Windows AMI
- Compliance access
- HR Management System
The deployment process included the creation of the VPC and SSH keys on top of IAM, and the creation of IAM Groups and Roles for EC2 and S3 access. In the next phase, identity federation will be implemented because multiple AWS accounts may be created. The company’s standard for data encryption was not implemented for data in motion, as the applications were not mission critical and the data privacy requirements were marginal.
Newt implemented the following in the production deployment:
- Hardened image provided by the Enterprise AWS AMI team
- SSO login enabled
- VPC design, security groups and network access control, implemented with the help of the IT team at the client site
- Application-specific IAM policies applied to EC2 and S3; TLS enabled from the application environment
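As an illustration of what an application-specific IAM policy for S3 might look like, the sketch below builds a policy document that limits an application to its own prefix in a shared bucket. The bucket name, application name, and the one-prefix-per-application layout are assumptions for the example; the source does not describe the actual policies. The document structure (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`) follows the standard IAM JSON policy grammar.

```python
import json


def app_s3_policy(bucket, app_name):
    """Build an IAM policy dict scoping S3 access to one application's prefix.

    Assumes (hypothetically) that each application stores its files under
    s3://<bucket>/<app_name>/.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted to the app prefix.
                "Sid": "ListOwnPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{app_name}/*"]}},
            },
            {
                # Object reads and writes are limited to keys under the prefix.
                "Sid": "ReadWriteOwnObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{app_name}/*"],
            },
        ],
    }


# Hypothetical names, for illustration only.
policy = app_s3_policy("example-app-backups", "billing-app")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON could be attached to the IAM Role assumed by the application's EC2 instances, so that no other application's files are readable from that instance.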
Business benefits of the migration:
The migration project is envisaged to give the following benefits to the client in a virtual data center environment as and when the physical DC is shut down:
- Deployments on AWS were streamlined through automation, and the timeline for migration was reduced by 30%.
- Synchronization between Development and Deployment was improved by over 50% in the case of the Private Cloud implementation.
- Continuous monitoring of the applications and infrastructure with open source tools and AWS services like CloudWatch dramatically improved remediation times in the event of outages, peaks in utilization, etc.