Upgrading the Data Protection, Disaster Recovery & Backup service for all companies on Intility

Written by 
Erfan Mohammadi
February 3, 2022
Approximately a 14-minute read
In November 2021, the Intility Data Protection, Disaster Recovery & Backup service was upgraded with Immutability, inter-regional replication to a Dark Site and higher performance. In this article we will take you through Intility’s thought process in choosing and implementing the new service.

Protecting data

The famous 3-2-1 backup rule states that there should be 3 copies of data, with 2 of them on different media and 1 copy kept off site. At Intility we have taken this a step further, standardizing on 4 copies of all data, each on different storage media, with both production storage and backup geo-replicated.
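
As a minimal sketch, the 3-2-1 rule can be expressed as a simple check over a list of copies. The `Copy` type, the media labels and the site names below are hypothetical and only illustrate the rule; they do not describe Intility's actual layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    media: str  # hypothetical label for the storage medium
    site: str   # hypothetical label for the physical location


def satisfies_3_2_1(copies, primary_site):
    """True if there are >= 3 copies, on >= 2 media, with >= 1 off site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.site != primary_site for c in copies)
    return enough_copies and enough_media and has_offsite


# Four illustrative copies, each on its own medium, two of them off site.
copies = [
    Copy("production", "media-1", "site-a"),
    Copy("production-replica", "media-2", "site-b"),
    Copy("backup", "media-3", "site-a"),
    Copy("backup-replica", "media-4", "site-c"),
]
print(satisfies_3_2_1(copies, primary_site="site-a"))  # True
```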

Intility’s Backup Platform is delivered as a service to all our customers and is a key element in our work with Data Protection, Disaster Recovery and Business Continuity. Although our primary DR mechanism is based on Storage geo-replication of production data, backup serves as an additional security layer for Ransomware protection, data retention, corrupted data, deleted files and more. Indeed, in these cases backup will serve as the primary DR mechanism, and we are therefore dependent on having a robust and secure backup platform.

Why Intility is implementing a new backup technology

Following Intility’s industrialized approach to technologies on our platform, we standardized on a single backup vendor the last time we stood at the crossroads of choosing backup software, in 2015. At that time Actifio was the preferred option, presenting a modern approach to backup software with capabilities such as Golden Copy, scalability and instant restore.

Intility constantly keeps up to date on developments in technologies and potential partners for our Platform eco-system. Even though we had a main partner for backup software, we started a more comprehensive assessment of the backup market in 2020. New areas of interest had arisen, including Immutability, more modern approaches to data storage, scalability, management of tens of thousands of VMs, and ease of automation. Shortly after we started this process, news broke that Actifio had been acquired by Google and that the Actifio backup platform was to be moved in its entirety to Google Cloud Platform. This further strengthened our need to look at alternatives, as we for various reasons want to keep our primary backup platform within our own Cloud platform.

The process of choosing a new backup partner

When Intility chooses new technology partners, we are in effect choosing a technology for more than 600 companies on the Intility Platform. Following our Platform model, we feel a tremendous responsibility to make the right choices to support our customers’ digitization endeavors and their IT security.

At the early stages of the process, a list of need-to-have and nice-to-have criteria was prepared using a weighted model. In addition to the expertise and knowledge of senior technicians in several Platform areas of Intility, professional literature from Gartner, among others, was essential in choosing our backup strategy. The key elements included:

  • Security (Immutability, access control, lock-down & hardening options, encryption, granular control, compliance & certifications)
  • Administration (GUI, API, single pane of glass, multi-tenancy)
  • Scalability (S3 object storage, web-scale)
  • Monitoring (statistics, logs, performance)
  • Virtualization-engine support and integrations (VMware, Hyper-V)
  • Physical servers, Agent and Edge support
  • Public Cloud support (Azure, AWS)
  • Synergies & integrations with other Intility Cloud Platform offerings
  • Restore capabilities (cloning, live-mounting, mass restore, single-file restore, indexing)
  • IAM (access granularity, AD, RBAC, nested groups etc.)
  • Modern workload support (containers, object storage, Kubernetes, edge)
  • Database support (SQL, NoSQL, MySQL, MariaDB, Oracle etc.)
  • Support organization
  • Cost & pricing model
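
A weighted model of this kind can be sketched in a few lines. The weights, ratings and pass mark below are hypothetical and chosen only for illustration; they are not the values used in Intility's assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float            # relative importance (hypothetical values)
    must_have: bool = False  # need-to-have vs nice-to-have


def weighted_score(criteria, ratings, pass_mark=3):
    """Return a weighted 0-5 score, or None if a need-to-have fails.

    `ratings` maps each criterion name to a rating on a 0-5 scale.
    """
    for c in criteria:
        if c.must_have and ratings[c.name] < pass_mark:
            return None  # a failed need-to-have disqualifies the vendor
    total_weight = sum(c.weight for c in criteria)
    return sum(c.weight * ratings[c.name] for c in criteria) / total_weight


# Hypothetical criteria and vendor ratings.
CRITERIA = [
    Criterion("security", 5, must_have=True),
    Criterion("scalability", 3),
    Criterion("cost", 2),
]
vendor_a = {"security": 5, "scalability": 4, "cost": 2}
vendor_b = {"security": 2, "scalability": 5, "cost": 5}

print(weighted_score(CRITERIA, vendor_a))  # (25 + 12 + 4) / 10 = 4.1
print(weighted_score(CRITERIA, vendor_b))  # None: fails the security need-to-have
```

Treating need-to-have criteria as a gate rather than as a very large weight keeps a vendor from compensating for a security gap with a low price.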

Several backup vendors were invited to the process, which spanned six months. After a series of meetings, presentations and technical deep dives with chosen vendors, we recognized Cohesity as the preferred partner in May 2021. Following this assessment, hardware was ordered and arrived in June, at which point we started an extended Proof of Concept process. We chose to order production hardware, which introduced cost risk but best reflected the actual performance of the product. The PoC was successful and Intility went into a long-term partnership with Cohesity in July 2021.

Main reasons for choosing Cohesity as a partner

Cohesity proved to have a comprehensive approach to Data Security. An immutable file system, multi-factor authentication, Data Lock (WORM), no root access, no explicit sudo access, ongoing work on making the underlying OS disappear, built-in AI ransomware detection and several security certifications were key in giving us confidence in Cohesity’s security standards.

Also, Cohesity’s product offering is somewhat unique in the world of Backup. In addition to offering extensive backup software, Cohesity acts as a Software-defined Data Platform, giving the ability to consolidate backups, file shares, object storage and other data using a web-scale data management platform. This creates synergies with Intility’s other Cloud Storage offerings, especially Object Storage and File Storage, in that they can run on the same software platform (although physically and logically separated from the backup environment). This will present opportunities to further strengthen our Storage offerings and integrations in the coming years.

The Cohesity Data Platform also introduces a more modern and scalable approach to storing the backup data using an underlying Object Storage system. On our previous backup platform, data was residing in a somewhat silo-based infrastructure. The infrastructure was separated into hundreds of virtual nodes with set storage limits. In moving to a web-scale storage solution, we can address the totality of backup storage within a single namespace, enabling us to scale much more efficiently to meet ever-increasing data growth.

Broad support for workloads beyond virtual machines, such as Kubernetes, Object Storage and Edge, will also help us protect modern workloads with the same level of security as more traditional workloads.

Immutable backup

Backups have an intrinsic form of ransomware protection in that the backup system typically holds several historic copies of the workload. At Intility we have standardized on 14 daily, 5 weekly and 3 monthly copies for most workloads (6 months for file services), also offering longer retention on demand. In practice, the age of the oldest restore point sets the time limit for detecting ransomware. In other words, with our standard policies, a ransomware attack on a workload must be detected within 3-6 months of its start. Once it is detected, one would typically find and restore the latest available healthy copy.
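
The relationship between a retention policy and the detection window can be illustrated with a small sketch. It assumes one backup per day, that the weekly copy is the Sunday backup and that the monthly copy is the backup from the first day of the month; these selection rules are assumptions for the example, and a real backup product may pick its weekly and monthly copies differently.

```python
from datetime import date, timedelta


def retained_points(backups, daily=14, weekly=5, monthly=3):
    """Return the restore points kept under a daily/weekly/monthly policy.

    Assumptions: weekly copy = Sunday backup, monthly copy = backup
    taken on the first day of the month.
    """
    newest_first = sorted(set(backups), reverse=True)
    dailies = newest_first[:daily]
    weeklies = [d for d in newest_first if d.isoweekday() == 7][:weekly]
    monthlies = [d for d in newest_first if d.day == 1][:monthly]
    return sorted(set(dailies) | set(weeklies) | set(monthlies))


today = date(2021, 11, 30)
backups = [today - timedelta(days=i) for i in range(200)]  # one per day
points = retained_points(backups)

# The oldest retained point bounds how late an attack can be detected.
detection_window_days = (today - points[0]).days
print(len(points), points[0], detection_window_days)  # 20 2021-09-01 90
```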

However, this intrinsic protection only applies to the workload itself (in most cases a virtual machine). In newer forms of attack, the attackers also have the will and ability to encrypt or delete the backup copies. In Norway, the municipality of Østre Toten received a lot of attention this year, following an attack which encrypted their data and deleted their backups.

In response to these rising threats, we identified the need to protect the backup using newer technology. By using Data Lock Immutability, we are adding a layer of protection for our customers, addressing attacks on the backup data itself. The function removes the possibility, even for administrator accounts, of deleting backup copies once they have been written to the backup system.
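
The behavior described above can be illustrated with a toy in-memory store. This is a sketch of the write-once principle only, under the assumption that a lock is expressed as a retention period per object; the class and method names are hypothetical and it is not how the backup platform implements Data Lock.

```python
from datetime import datetime, timedelta, timezone


class RetentionLockError(Exception):
    """Raised when a write or delete would violate the retention lock."""


class WormStore:
    """Toy write-once store: no overwrite, no delete before lock expiry."""

    def __init__(self):
        self._objects = {}  # key -> (data, locked_until)

    def put(self, key, data, retain_for, now):
        if key in self._objects:
            raise RetentionLockError(f"{key} is already written")
        self._objects[key] = (bytes(data), now + retain_for)

    def get(self, key):
        return self._objects[key][0]

    def delete(self, key, now, is_admin=False):
        _, locked_until = self._objects[key]
        if now < locked_until:
            # is_admin is deliberately ignored: nobody can bypass the lock.
            raise RetentionLockError(f"{key} is locked until {locked_until}")
        del self._objects[key]


store = WormStore()
t0 = datetime(2021, 11, 1, tzinfo=timezone.utc)
store.put("vm-42/2021-11-01", b"backup bytes", timedelta(days=90), now=t0)

try:
    store.delete("vm-42/2021-11-01", now=t0 + timedelta(days=1), is_admin=True)
except RetentionLockError as err:
    print("refused:", err)

# Once the retention period has passed, normal expiry can remove the copy.
store.delete("vm-42/2021-11-01", now=t0 + timedelta(days=91))
```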

The technology utilizes the WORM (Write Once, Read Many) feature which is native to S3 Storage. The method is recognized as a modern and secure approach to Data Protection. In addition to protecting against ransomware attacks, the technology can also be used for regulatory data retention. In our new backup platform, the Immutability is certified for:

  • Securities and Exchange Commission (SEC) 17 CFR § 240.17a-4(f), which regulates exchange members, brokers or dealers.
  • Financial Industry Regulatory Authority (FINRA) Rule 4511(c), which defers to the format and media requirements of SEC Rule 17a-4(f).
  • Commodity Futures Trading Commission (CFTC) regulation 17 CFR § 1.31(c)-(d), which regulates commodity futures trading.

Backup, Immutability and GDPR

Backup of data in relation to GDPR regulations is a complex matter, and it can be argued that some of the regulations are directly at odds with each other. This is especially true for the balance between Security of Processing (Article 32) and Right to Erasure (Article 17). Article 32 addresses confidentiality, integrity, availability and resilience, and specifically requires in 1(c): “The ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident”. In isolation, by Article 32 standards, Immutability is preferable.

On the other hand, Article 17 states that the data subject has the “right to be forgotten” without undue delay. Many organizations are setting the target time frame for deletion to 30 days. One might be able to delete personal data from production systems within this time frame, but it is a significant challenge when it comes to backup. The nature of backups makes it virtually impossible to pinpoint and delete data within a backup chain unless all restore points are full copies rather than incremental (which would make for a very inefficient backup system). Backup Immutability only adds to this challenge, as the data cannot be deleted before retention runs out. The challenge also grows with longer retention times, such as yearly backups.

A proposed way to address these challenges has been to keep track of requests for deletion and to have routines for deleting the data again immediately after a potential restore is performed. The main challenges in using this method are maintaining such a system and, not least, indexing where personal data resides in production systems as well as backup systems. Addressing the latter, Intility’s new backup platform has a built-in global indexing service, making the search for specific data more efficient across backup retention points (and by extension also production data). The software also has capabilities to search for personal data, credit card information etc. using artificial intelligence, but for security reasons Intility has forgone that feature, as it depends on analyzing the backup data in a SaaS platform outside our Data Centers.
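
The request-tracking approach can be sketched as a small ledger that is consulted after every restore. The names below (`ErasureLedger`, `scrub`, `expire`) and the record layout are hypothetical; the sketch assumes restored data can be keyed by data subject, which in practice is the hard part that indexing has to solve.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ErasureLedger:
    """Tracks erasure requests that older restore points may still contain."""

    erased_on: dict = field(default_factory=dict)  # subject id -> erasure date

    def record(self, subject_id, when):
        self.erased_on[subject_id] = when

    def scrub(self, restored):
        """Return restored records minus every subject with a logged request."""
        return {sid: rec for sid, rec in restored.items()
                if sid not in self.erased_on}

    def expire(self, oldest_restore_point):
        """Forget requests that no remaining restore point can contradict."""
        self.erased_on = {sid: when for sid, when in self.erased_on.items()
                          if when >= oldest_restore_point}


ledger = ErasureLedger()
ledger.record("subject-17", date(2021, 10, 5))

# A restore from an older point brings the erased subject back.
restored = {"subject-17": {"name": "redacted"}, "subject-18": {"name": "redacted"}}
clean = ledger.scrub(restored)
print(sorted(clean))  # ['subject-18']

# When every restore point is newer than the erasure, the entry can go.
ledger.expire(oldest_restore_point=date(2021, 12, 1))
```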

Pending further guidelines on how to address these GDPR backup challenges, we have concluded that Immutability will follow our standard retention times of 3/6 months. For yearly backups, or monthly backups with 6-12 months of retention, Immutability will not be applied by default. Disregarding these challenges, there are also other reasons why Immutability on yearly backups is not desirable, mainly cost and the inability to accommodate potential future changes in an organization’s internal data retention policies.

The process of choosing the Backup architecture

In parallel with choosing a backup technology, efforts were being made to improve the overall architecture of Intility’s Data Center Platform. In early 2021, we were confident that the current architecture of Intility’s central Data Center sites in Norway should undergo changes. Through a separate process of assessing the current Data Center market in Norway, as well as reimagining our current Data Center architecture in terms of functional needs, we started a signature project for central Data Centers. The project is ongoing, and Intility is upgrading the Data Center Platform with new sites which are being put into production in November 2021.

In addition to what we are doing for our Oslo Data Center architecture, we started assessing the possibility of inter-regional replication of all customer data to further strengthen Data Protection and Disaster Recovery mechanisms. Stavanger, in the Western region of Norway, was deemed a suitable candidate for cross-region replication.

The process of choosing an inter-region Dark Site

Following our definitive requirement of backup Immutability, we wanted to secure the second replicated backup data set as well as we possibly could. The second replicated copy would serve as the last line of defense for all data on the Intility platform. Two risk scenarios were especially addressed: a regional natural disaster in Eastern Norway and a platform-wide Ransomware attack.

The decision was made to choose a secure data center site which we would ultimately set up as a Dark Site. Green Mountain was identified as the preferred partner, following our decision process for Data Center Partners, which includes emphasis on key areas such as Security, Compliance & Certifications and Sustainability. Green Mountain Rennesøy is an EMP-secured, TIER 3-classified High Security facility deep inside mountain halls which were previously used as a NATO facility meant to withstand nuclear attacks.

Is the secondary site Air-gapped?

Many IT vendors use the term Air-gapping freely, without restraint and often mistakenly, when talking about Data Protection. By its strictest (and in my view correct) definition, Air-gapping entails total isolation, preventing the system from establishing any external connections. This has typically been implemented in strict high-security environments such as military and government institutions, or where industrial espionage is prevalent.

In the world of backup, air-gapping has typically been solved by using tapes which are regularly (for example weekly) physically removed and moved to a secure location (for example a safe). Keeping one or more offline copies of the data arguably represents the highest level of Data Protection achievable, but typically at the expense of higher RTO, RPO and cost.

Intility chose not to standardize on air-gapped backup for several reasons. The air-gapped approach is more complex, less functional, less scalable, and more costly. As is often the case in assessing security needs, functionality on the one end and security/lock-down on the other end act as opposite poles. This is indeed the case with air-gapped backup too, so we instead chose a different route with a high level of security and isolation, which we refer to as a Dark Site. The latter term is also used in different ways but doesn’t have as strict a definition.

Intility’s Dark Site approach

It’s important to point out that the primary protection mechanism in the Backup Platform is the Immutable File System. As previously noted, Immutability inhibits attempts to alter or delete backup data within the file system, positioning the technology as a key protection against Ransomware-attacks. That said, Immutability alone is not enough to ensure Data Protection as there will always be a risk that the underlying storage system can be compromised by gaining administrative privileges on the hardware, and data can be deleted by mechanisms such as disk partitioning on the physical disks.

So, we asked ourselves and our partners the question: what can we do to best protect the replicated backup copy in the absence of offline copies? As we wanted to replicate the backup data on a daily basis to ensure a low RPO, an active connection to the central data centers was deemed beneficial. But the network connection between Oslo and Stavanger was not our primary concern; in assessing the security risks, we first assumed that the connection was compromised and that an attacker had gained network access. As such, we started with the hardware elements:

First, we assessed the need for hardware on the site. Although it was somewhat tempting to leverage the investments in data center facilities in Stavanger to broaden our Cloud offerings such as Compute & Storage, we established that we should forgo that opportunity and instead focus on isolating the site. The decision was made to exclusively host the Backup Platform, consisting of storage hardware, UPSs, a single motion-detection camera (on an isolated network) and the strictly necessary network equipment to sustain the replication traffic from Oslo.

Secondly, our main concern with the hardware was the risk of an attacker gaining remote access to the underlying backup storage or network equipment. iLO (Integrated Lights-Out), which is normally used for remote administration, needed to be isolated. Normally, Intility’s technicians use iLO in day-to-day operations to efficiently and remotely control, monitor, patch firmware on and generally manage data center infrastructure. We started dissecting these disciplines and assessing whether the backup software could perform the necessary actions from within the system, essentially removing iLO access from Intility user accounts.

At this point, we are getting into the intricacies of how the Dark Site is secured on a hardware level, and in the interest of keeping the inner workings confidential, there is a limit to what we can share. In essence, we must be physically on site to perform various actions, as our remote management capabilities for storage and network are disabled. Additionally, surrounding security measures include a comprehensive regime for securing administrative accounts. This combines keeping Cluster admin login credentials in physical form in a safe at an undisclosed location, Intility’s Tier-based AD model, read-only accounts and strict routines for physical access to the site. The backup administrator role at Intility, which consists of a few named users, has its privileges kept to a minimum, with actions such as sudo access and user administration disabled.

Thirdly, only backup replication traffic and backup software communication are allowed on the network connecting Oslo to Stavanger. The link itself consists of a 100 Gbps dedicated dark fiber and all backup traffic is encrypted. There is no internet access at the site.
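
To give a feel for what a 100 Gbps link means for daily replication, here is a back-of-the-envelope sketch. The link speed is taken from the text above; the daily change volume (50 TB) and the usable share of the link (80%) are hypothetical figures chosen only for the arithmetic.

```python
def transfer_seconds(data_bytes, link_bits_per_s, efficiency=0.8):
    """Estimate the time needed to move data_bytes over a link.

    efficiency is the assumed usable share of the raw link speed after
    protocol and encryption overhead (a hypothetical figure).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return data_bytes * 8 / (link_bits_per_s * efficiency)


LINK_BITS_PER_S = 100e9       # 100 Gbps dedicated dark fiber
daily_change_bytes = 50e12    # hypothetical: 50 TB of changed data per day

seconds = transfer_seconds(daily_change_bytes, LINK_BITS_PER_S)
print(f"{seconds / 60:.0f} minutes")  # 5000 s, about 83 minutes
```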

With these mechanisms in place to minimize attack vectors, we believe we have found a sensible balance between security and function for the Backup Dark Site.

Increased performance and reduced workload impact

Backup has typically been hosted on low-cost hardware, with HDDs or tape drives and low CPU/Memory configurations. For many organizations this has been an important step in reducing overall costs for an ever-increasing amount of data.

Nonetheless, newer requirements such as faster restores to reduce RTO, faster backup runs to reduce the impact on sensitive workloads, and the benefit of using the backup data to draw insights are challenging the old hardware approach.

In the new backup platform, the majority of data will still reside on HDD drives to lower cost, but several other hardware elements make the platform considerably more performant:

  • A higher CPU & Memory to disk ratio makes indexing more efficient and enables better data reduction mechanisms such as compression and de-duplication
  • SSD caches in front of the HDDs increase ingest and restore rates, delivering higher overall IOPS
  • Scale-out Object Storage clusters make data operations more efficient
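
The data reduction mechanisms mentioned in the list above can be illustrated with a minimal sketch that combines fixed-size chunk de-duplication with compression. The `ChunkStore` class is hypothetical and uses only the Python standard library; production backup systems typically use more elaborate techniques such as variable-size chunking, and this is not a description of how the platform does it.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are stored once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # sha256 hex digest -> compressed chunk

    def write(self, data):
        """Store data and return the list of chunk digests (the 'recipe')."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # de-duplication
                self.chunks[digest] = zlib.compress(chunk)  # compression
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
day1 = b"A" * 4096 * 3 + b"B" * 4096
day2 = day1 + b"C" * 4096           # second backup, mostly unchanged data
recipe1, recipe2 = store.write(day1), store.write(day2)

print(len(store.chunks))             # 3 unique chunks for 9 logical chunks
print(store.read(recipe2) == day2)   # True
```

Both hashing and compression are CPU-bound, which is one reason a higher CPU & Memory to disk ratio helps data reduction.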

These components in sum will make us capable of achieving:

  • Lower RPO, specifically with second- or minute interval Continuous Data Protection which the platform is now rigged to handle
  • Lower RTO, specifically with Mass Restore capabilities and instant restores running on SSDs
  • Lower impact on production workloads, specifically with the Platform’s increased ability to ingest backup jobs at a faster pace

Even though the mentioned hardware components are increasing our cost, this effect is to a certain degree countered by higher levels of data reduction. Notwithstanding cost, the sum of all benefits makes it worthwhile to think in new ways, increasing the performance of backup systems.

When managing several thousand VMs with set daily or hourly timeframes for backups, performance is key in delivering a stable Cloud platform to our customers. Even though the Dark Site is not set up to resume fully operational production workloads (in the absence of local Compute & production Storage), we still wanted the capability of mass restores to strengthen RTO in the event of major disasters. One such scenario could be to quickly recover from a platform-wide ransomware attack where we could utilize the backups in Oslo, which are connected to full Compute, Storage & Network Disaster Recovery capacities.

Increased insights for our customers

A great effect of higher performance on the Backup system is that indexing of data can be done more efficiently. Indeed, backup data is very suitable for indexing as the data is cold and an otherwise performance-taxing operation on production data can be avoided. We will continue to work on leveraging the global indexing capabilities within the platform to present our customers with more insight into their data.

We are also working on giving Customers more insight into their backups, including consumption data, policies, SLA reporting and more. By leveraging the new platform’s strong API support, Intility has developed new internal dashboards, and much of this data will find its way to the customer portal going forward.

[Image: Intility’s new Backup Portal]

Looking forward

Backup technologies have been around for a long time and are a key element in protecting data, managing incidents, and adhering to regulatory & compliance standards. In some ways, the old principles of backup have withstood the test of time, but newer technology has modernized the methods. We believe that our new and modernized Backup platform serves as a great foundation to continue evolving and improving our Security & Cloud platforms, addressing an ever-changing threat landscape as well as developments in Cloud workloads.

We will continue to refine our Dark Site approach as new lock-down methods arise, and challenge our partners on integrations and capabilities for broader workload support, especially when it comes to newer database engines. The Cohesity platform represents a great portfolio of capabilities, but we have forgone capabilities such as AI Ransomware detection and AI sensitive data identification, as these are only available in a SaaS platform. We will continue our endeavors to make these features available to all our customers without compromising data security.

Finally, the new platform has an API-first approach which enables us to automate more backup tasks, increasing our productivity, reducing human error, enabling more control over the backup environment, and providing more insight to our customers.

If you have any questions, please contact erfan.mohammadi@intility.no
