Torrent Pharma

On-premises Disaster Recovery to AWS

Torrent Pharma Disaster Recovery

Torrent Pharma, the flagship company of the Torrent Group, with a turnover of Rs. 8005 Cr, is one of the leading pharma companies in the country. We are the pioneers in initiating the concept of niche marketing in India and today are ranked amongst the leaders in the therapeutic segments of cardiovascular (CV), central nervous system (CNS), gastrointestinal (GI), and women’s healthcare (WHC). The company also has a significant presence in the diabetology, pain management, gynecology, oncology, and anti-infective segments.
Torrent Pharma has crossed many geographical boundaries, with a presence in more than 40 countries. The company is ranked first amongst Indian companies for having the largest market share in Brazil and Germany.

Challenges

  • The Sales Force Automation application and database servers are colocated at the Sify Datacenter, ATMA House, Ashram Road, Ahmedabad, and more than 7000 field staff access the application over the internet. Below is the schematic diagram of the existing server configuration.
  • It is one of the company’s most important applications and must have DR: it cannot tolerate downtime, which leads to heavy financial losses and an equally bad effect on the company’s reputation.
  • The company wants to set up Application and Database servers for Disaster Recovery (DR) on the cloud. The customer wants on-premises to AWS Cloud sync over the Internet (no dedicated line or site-to-site VPN). The customer also wants an approximate cost for a DR drill run once every three months for 8 hours. The customer would like a seamless experience when a disaster happens and therefore needs a dedicated public IP from the cloud, which can be mapped in DNS as a failover IP.
  • The target is a near real-time RPO and a 1-hour RTO for the App and DB servers.
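
The quarterly drill cost asked for above can be approximated with simple arithmetic. The sketch below is a rough estimate only: the hourly prices are hypothetical placeholders rather than AWS quotes, the function name is ours, and it covers just the compute hours of the launched target machines, leaving out staging-area, storage and data-transfer charges.

```python
def annual_drill_cost(hourly_rates, hours_per_drill=8, drills_per_year=4):
    """Return the yearly compute cost of the target machines launched for DR drills.

    hourly_rates: price per hour for each target machine started during a drill.
    The defaults follow the requirement: one 8-hour drill every three months.
    """
    machine_hours = hours_per_drill * drills_per_year  # hours each machine runs per year
    return sum(rate * machine_hours for rate in hourly_rates)


# Hypothetical hourly prices for the App and DB target instances (not real quotes).
example_rates = {"app": 0.25, "db": 0.50}
print(annual_drill_cost(example_rates.values()))
```

With one drill every three months, each target machine runs for only 32 hours a year, which is why drill compute is a small part of the overall DR bill.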

Solution to meet challenges

  • Set up the CloudEndure account and project. This is a one-time step in which the customer’s CloudEndure account is set up by registering for Disaster Recovery. Once signed in, a new Project needs to be created in the CloudEndure “User Console”.
  • Define the replication settings and the target cloud environment Blueprint. Before we start replicating the machines, we need to define the replication settings and blueprint under the newly set up project. Here, you define your Source (2 servers) and Target environments, and the default Replication Servers in the Staging Area of the Target infrastructure.
  • Configure the network on both sides with the source public IP and the cloud-allocated EIP (Elastic IP), so that even over the Internet, traffic is allowed only from whitelisted IPs. CloudEndure encrypts all traffic between on-premises and the AWS cloud by default.
  • Add machines by installing the CloudEndure Agent on each source machine. Agents can be installed on both Linux and Windows machines. Detailed instructions are available in the CloudEndure documentation. CloudEndure does not require a separate physical or virtual machine for setting up sync between the on-premises and cloud servers.
  • Once the agents are installed and set up under the CloudEndure “User Console”, the machines start the initial full data replication. The initial replication takes hours to days, depending on the internet bandwidth between on-premises and the AWS Cloud. Once the initial sync is complete, CloudEndure syncs only the incrementally changed blocks. CloudEndure also compresses data, which reduces outgoing internet bandwidth and cost.
Figure: Architecture of CloudEndure Disaster Recovery
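
How long the initial full replication takes can be estimated from the data size and the available uplink. The following is a back-of-the-envelope sketch, not a CloudEndure formula: the function name, the 70% link utilisation default and the example figures are assumptions, and it ignores compression, so real transfers may finish sooner.

```python
def initial_sync_hours(data_gb, uplink_mbps, utilisation=0.7):
    """Estimate the hours needed to push data_gb over an uplink of uplink_mbps.

    utilisation is the assumed share of the link that replication can actually use.
    """
    if data_gb < 0 or uplink_mbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("size must be >= 0, rate > 0, utilisation in (0, 1]")
    data_megabits = data_gb * 8 * 1000           # decimal gigabytes -> megabits
    effective_mbps = uplink_mbps * utilisation   # usable throughput in Mbit/s
    return data_megabits / effective_mbps / 3600  # seconds -> hours


# Hypothetical example: 500 GB of source disks over a 50 Mbit/s uplink.
print(round(initial_sync_hours(500, 50), 1))
```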

Failover

  • Go to the CloudEndure Console and, under “Launch Target”, initiate “Test Mode” for a DR drill or “Recovery Mode” for an actual disaster event. In “Test Mode”, replication between on-premises and the AWS Cloud is not interrupted.
  • CloudEndure takes regular snapshots on the cloud to support “Point in time” recovery. This is very useful when ransomware or data corruption at the source has also affected the cloud-synced data. During “Launch Target” you can choose the snapshot that each machine should use (the default is the latest snapshot with the most recently synced data).
  • After the Target instances are launched (within minutes), verify each launched instance through Remote Desktop (RDP). The newly launched DB machine can be accessed through the Application Server.
  • Assuming the cloud public IP (EIP) is already configured as the failover IP in the DNS configuration, verify that the application is up and running under the same domain URL.
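
Choosing a point-in-time snapshot after ransomware or data corruption comes down to picking the newest recovery point taken before the incident. The sketch below illustrates that rule on plain timestamps; it does not call the CloudEndure console or any API, and the function name and sample times are hypothetical.

```python
from datetime import datetime


def pick_recovery_point(snapshot_times, corrupted_at=None):
    """Return the snapshot timestamp to launch a target machine from.

    Without an incident time, the latest snapshot is used (the default behaviour
    described above). With one, the latest snapshot strictly before it is used.
    """
    if not snapshot_times:
        raise ValueError("no snapshots available")
    if corrupted_at is None:
        return max(snapshot_times)
    clean = [t for t in snapshot_times if t < corrupted_at]
    if not clean:
        raise LookupError("no snapshot predates the incident")
    return max(clean)


# Hypothetical snapshots taken ten minutes apart.
snapshots = [datetime(2024, 1, 1, 10, m) for m in (0, 10, 20)]
print(pick_recovery_point(snapshots, corrupted_at=datetime(2024, 1, 1, 10, 15)))
```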

Failback

In a disaster event scenario, once the on-premises infrastructure is recovered, you can perform the failback steps, which sync in the reverse direction to bring your original infrastructure back into a data-synced state.

  • In the CloudEndure Console, click the “Project Actions” menu and select “Prepare for Failback”. The project will go into the “Preparing for failback to original Source” status.
  • The original source machine must be booted from the CloudEndure Failback Client to start the reverse sync from AWS Cloud to the respective on-premises servers. The Failback Client (failback_client.iso) can be downloaded from the Replication Settings section in the CloudEndure User Console under Setup & Info.
  • Once all machines are recovered at the on-premises location, perform the complete verification steps and route application traffic back through DNS.
  • Go to the CloudEndure Console and, under “Project Actions”, click “Return to Normal Operation”. This restarts the normal replication process that was running before the disaster event.
  • Delete the Target machines used during Disaster Recovery: in the CloudEndure Console, under “Machine Actions”, click “Delete Target Machines”.
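
Both failover and failback end with application traffic being routed through DNS, so it is worth checking which IP the application's domain actually resolves to. A minimal sketch using Python's standard socket module follows; the hostname and IP addresses are placeholders, the helper name is ours, and the resolver is injectable so the check can be exercised without a network.

```python
import socket


def dns_points_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True when hostname currently resolves to expected_ip (IPv4)."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:  # socket.gaierror (lookup failure) is a subclass of OSError
        return False


# Offline illustration with a stand-in resolver; drop the third argument to
# query real DNS. "sfa.example.com" and the 203.0.113.x addresses are placeholders.
fake_dns = {"sfa.example.com": "203.0.113.10"}
print(dns_points_to("sfa.example.com", "203.0.113.10", resolve=fake_dns.__getitem__))
```

Cached DNS answers can lag behind a record change until their TTL expires, so a check like this may need to be repeated after a failover or failback.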

Outcomes

  • The Recovery Point Objective (RPO) of CloudEndure is typically in the sub-second range. Near real-time RPO was achieved and successfully tested by the customer.
  • The Recovery Time Objective (RTO) of CloudEndure is typically measured in minutes and is highly dependent on the OS boot time. The RTO required by the client was achieved, and the recovery was automated for speedy recovery at the time of DR.
  • CloudEndure secures data both in transit and at rest: in both cases it is encrypted. CloudEndure stores only configuration and log data in the CloudEndure Service Manager’s encrypted database. Replicated data is always stored in the customer’s own cloud VPC.
  • CloudEndure supports Point-in-Time Recovery snapshots, one of its most important features and one that the customer needed. Point-in-Time Recovery snapshots were configured on the following schedule:
    every 10 minutes in the past hour
    every 1 hour in the past 24 hours
    every 1 day in the past 30 days
  • With CloudEndure, the customer saved a lot of time in setting up, maintaining and monitoring DR.
  • Cost is reduced because no target machines run continuously; they run only at the time of a DR drill.
  • No cost or time is spent on training, as the solution is delivered, maintained, and troubleshot by a partner backed by AWS Premium support.
  • A very simple, transparent and effective solution with the best TCO.
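
The point-in-time schedule listed above keeps recovery points densely for recent history and sparsely for older history. The sketch below shows one way such a tiered schedule can thin a list of snapshot timestamps; it is our illustration of the schedule, not CloudEndure's internal implementation, and the names and bucket boundaries are assumptions.

```python
from datetime import datetime, timedelta

# (how far back the tier reaches, spacing between kept snapshots in that tier)
TIERS = [
    (timedelta(hours=1), timedelta(minutes=10)),
    (timedelta(hours=24), timedelta(hours=1)),
    (timedelta(days=30), timedelta(days=1)),
]


def thin_snapshots(snapshots, now):
    """Keep the newest snapshot per interval bucket in each tier; drop the rest."""
    kept, seen = [], set()
    for ts in sorted(snapshots, reverse=True):  # newest first
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps in the future
        for tier, (window, interval) in enumerate(TIERS):
            if age < window:
                bucket = (tier, age // interval)  # timedelta // timedelta -> int
                if bucket not in seen:
                    seen.add(bucket)
                    kept.append(ts)
                break  # a snapshot belongs to the first tier it fits in
    return sorted(kept)


# Hypothetical input: one snapshot per minute over the last hour.
now = datetime(2024, 1, 1, 12, 0)
recent = [now - timedelta(minutes=m) for m in range(60)]
print(len(thin_snapshots(recent, now)))
```

Snapshots older than 30 days fall outside every tier and are dropped.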

Conclusion

With the successful achievement of DR, the customer is not only looking at DR for other important applications but also planning to move other production workloads to the cloud in the near future.
