Backup and recovery checklist

14 min read | Updated February 2026

A practical framework for backup strategies that actually work when you need them.

Most organisations believe they have backups. They can point to a backup product, a storage target, a schedule that runs overnight. What they often cannot do is answer a simple question: if everything went down right now, how quickly could you recover, and how much data would you lose?

The gap between having backups and having a tested, reliable recovery capability is where businesses get caught. Ransomware attacks have made this painfully clear. Modern threat actors specifically target backup infrastructure before deploying encryption, because they know that backups are the one thing that lets victims refuse to pay. If your backups are accessible from the same network with the same credentials, they will be destroyed alongside everything else.

This guide provides a comprehensive checklist for building, configuring, monitoring, and testing backup infrastructure that is genuinely resilient. Not theoretical resilience documented in a policy that nobody reads, but practical, verified resilience that has been tested under realistic conditions.

What should be backed up

The first question is scope. Most backup gaps are not configuration problems. They are coverage problems. Data that nobody realised existed, systems that were assumed to be covered, cloud services that were never included. Start by auditing everything your business depends on, then work backwards to ensure each item is captured.

Servers and on-premises infrastructure

This goes beyond file shares. You need full system-state backups of every server, including Active Directory, DNS, DHCP, and group policy configurations. If your domain controller fails and you only backed up the file shares, you are rebuilding your entire environment from scratch. That is weeks of work, not hours. Every virtual machine, every database, every configuration file that your business depends on should be captured in a documented, automated backup job.

Microsoft 365 and cloud SaaS data

This is the single biggest gap we see. Microsoft does not back up your data. Their shared responsibility model is clear: they protect the infrastructure, you protect the content. That content includes Exchange Online mailboxes, SharePoint sites, OneDrive files, Teams channels and chats, Planner boards, and OneNote notebooks. If a user deletes a mailbox or a SharePoint site, Microsoft retention policies may help for a limited window. After that, it is gone. Third-party M365 backup is not optional. It is essential.

Line-of-business applications

Your accounting software, CRM, ERP, project management tools, and any specialist industry applications that hold business-critical data. These often run on their own databases, whether SQL Server, MySQL, or proprietary formats. A file-level backup may not capture a consistent database snapshot. You need application-aware backups that understand how to quiesce the database, take a consistent copy, and restore it to a working state. Talk to your software vendors about supported backup methods.

Endpoint and device configurations

In a modern workplace, the laptop is the office. If you use Intune or another MDM platform, your device configurations, compliance policies, and application deployments are recoverable. But what about local data? Users who save files to their desktop instead of OneDrive? Browser bookmarks, application settings, VPN profiles? A well-managed environment minimises local data, but audit what actually exists on your endpoints before assuming everything lives in the cloud.

Network and security appliance configs

Firewalls, switches, wireless access points, VPN concentrators. Every network device has a configuration that took time to build and would take time to recreate. Most support automated config export. Schedule it. Store it securely. When your firewall fails at 6pm on a Friday, the difference between a 20-minute recovery and a 4-hour rebuild is whether you have that config file backed up somewhere accessible.

DNS, certificates, and identity infrastructure

Your DNS records, SSL/TLS certificates, code signing certificates, DKIM and SPF configurations, conditional access policies, and Entra ID (Azure AD) configurations. These are the invisible infrastructure that everything depends on. Losing your DNS records means your email, website, and every cloud service stop working. Document and back up everything that connects your identity and access layer to the outside world.

Configuration essentials

Knowing what to back up is only the beginning. How you configure your backup infrastructure determines whether recovery is possible, practical, and fast enough to keep your business running. These are the configuration decisions that separate functional backup strategies from ones that fail under pressure.

Backup frequency aligned to RPO

Recovery Point Objective is the maximum amount of data you can afford to lose, measured in time. If your RPO is four hours, you need backups at least every four hours. For critical databases, that might mean continuous replication or transaction log backups every 15 minutes. For file shares, nightly might be acceptable. The mistake is applying a single schedule to everything. Different data has different RPO requirements. Map them out and configure accordingly.
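As a rough illustration of mapping RPO per data class, the sketch below compares the age of the newest backup against a target for each class. The class names, the `RPO_TARGETS` table, and the `rpo_breaches` function are hypothetical; the intervals simply echo the examples above and should be replaced with your own.

```python
from datetime import datetime, timedelta

# Hypothetical RPO targets per data class (assumed values, not a recommendation).
RPO_TARGETS = {
    "critical-database": timedelta(minutes=15),
    "line-of-business": timedelta(hours=4),
    "file-shares": timedelta(hours=24),
}


def rpo_breaches(last_backup, now):
    """Return the data classes whose newest backup is older than their RPO.

    last_backup maps a data class name to the time its latest backup finished.
    The returned dict maps each breaching class to the backup age, or to None
    when no backup has been recorded at all.
    """
    breaches = {}
    for name, rpo in RPO_TARGETS.items():
        finished = last_backup.get(name)
        age = None if finished is None else now - finished
        # A class with no recorded backup always counts as a breach.
        if age is None or age > rpo:
            breaches[name] = age
    return breaches
```

Running a check like this on a schedule shows at a glance which data classes are currently outside their agreed recovery point.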

Retention that accounts for dormant threats

Ransomware can sit dormant in your environment for weeks or months before activating. If your oldest backup is 30 days old and the infection has been present for 45 days, every backup you have is compromised. Best practice for most SMEs is daily backups retained for 30 days, weekly backups for 3 months, and monthly backups for 12 months. This gives you a clean restore point even if the threat went undetected for an extended period.
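The daily, weekly, and monthly tiers described above can be sketched as a selection function. This is a minimal sketch under stated assumptions: 3 months is approximated as 90 days, 12 months as 365 days, the weekly and monthly points are the oldest backup in each ISO week or calendar month, and the function name `backups_to_keep` is hypothetical. Real backup platforms implement retention in their own way.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today):
    """Select restore points under a daily/weekly/monthly retention scheme."""
    keep = set()
    weeks_seen = set()
    months_seen = set()
    for day in sorted(backup_dates):
        age = (today - day).days
        if age < 0 or age > 365:
            continue  # outside the 12-month window (assumed to be 365 days)
        if age <= 30:
            keep.add(day)  # daily tier: everything from the last 30 days
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        if age <= 90 and week not in weeks_seen:
            weeks_seen.add(week)
            keep.add(day)  # weekly tier: oldest backup in each ISO week
        month = (day.year, day.month)
        if month not in months_seen:
            months_seen.add(month)
            keep.add(day)  # monthly tier: oldest backup in each calendar month
    return keep
```

With daily backups, this keeps restore points well beyond a 45-day dwell time while storing far fewer copies than keeping every day for a year.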

Encryption in transit and at rest

Backup data is a goldmine for attackers. It contains everything: credentials, financial data, personal information, intellectual property. Every backup, whether stored locally or in the cloud, should be encrypted with AES-256 at rest. Every transfer should use TLS 1.2 or higher. Encryption keys should be stored separately from the backup data itself. If an attacker gains access to your backup storage, encryption is the last line of defence.

Immutable or air-gapped copies

At least one copy of your backup data should be immutable, meaning it cannot be modified, encrypted, or deleted by anyone, including administrators, for a defined retention period. Cloud object lock (AWS S3 Object Lock, immutable storage for Azure Blob Storage, Wasabi Object Lock) is the most practical approach for most organisations. Check which lock mode you enable: with S3 Object Lock, compliance mode blocks deletion even by the root account, whereas governance mode can be bypassed by users who hold the relevant permission. For the highest assurance, air-gapped backups on offline media provide physical isolation that no software attack can breach.

Separate credentials and access controls

Your backup infrastructure should use completely separate credentials from your production environment. If an attacker compromises your domain admin account, they should not automatically gain access to your backup platform. Use dedicated service accounts, enforce MFA on all backup console access, implement role-based access control, and ensure that no single person can both access production systems and delete backup data.

Versioning and point-in-time recovery

Multiple versions of the same data allow you to roll back to a specific point in time. This is critical when you discover that files were corrupted or encrypted days before anyone noticed. Without versioning, you are limited to the most recent backup, which may already contain the damage. Configure your backup solution to maintain multiple recovery points across your retention window.

“A backup that has never been tested is not a backup. It is a hypothesis. And you do not want to test hypotheses during a ransomware attack at 3am on a Sunday.”

The 3-2-1-1 backup rule

The classic 3-2-1 rule has been the foundation of backup strategy for decades. In the age of ransomware, it needs an update. The additional “1” represents immutability: at least one copy that cannot be modified or deleted by anyone, including administrators with full access to your environment.

3 copies of your data

Production data plus two independent backup copies. Redundancy is not optional.

2 different storage media

Local disk and cloud, or cloud and tape. Different failure modes protect against different risks.

1 copy stored offsite

Geographically separate. Protects against fire, flood, theft, and site-level disasters.

1 copy that is immutable

Cannot be altered or deleted. This is your ransomware insurance policy.
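The four conditions of the rule can be written as a simple check against an inventory of copies. The `DataCopy` record and `check_3_2_1_1` function below are hypothetical names for illustration, and the inventory would have to be maintained by hand or exported from your backup platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str        # e.g. "disk", "cloud", "tape"
    offsite: bool
    immutable: bool


def check_3_2_1_1(copies):
    """Return a list of rule violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than 2 media types")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    if not any(c.immutable for c in copies):
        problems.append("no immutable copy")
    return problems
```

Production data counts as one of the three copies, so a production server, a local backup appliance, and an immutable cloud copy together satisfy the rule.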

Monitoring and alerting

Backups fail silently. Storage runs out, credentials expire, agents crash, network paths change. Without active monitoring, you will not know your backups stopped working until you try to restore from them. By then it is too late. Treat backup monitoring with the same urgency you give to security alerting.

Automated job monitoring and alerting

Every backup job should be monitored automatically. Success, failure, partial completion, warnings. If a backup fails at 2am, someone needs to know by 8am. Most modern backup platforms support email alerts, webhook integrations, and dashboard reporting. The key is ensuring that alerts actually reach someone who will act on them, not disappear into a shared mailbox that nobody checks.
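A minimal sketch of the monitoring logic, assuming your backup platform can export job results as records with a name, a status, and a finish time. The record layout, the `jobs_needing_attention` function, and the 26-hour window for nightly jobs are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def jobs_needing_attention(jobs, now, max_age=timedelta(hours=26)):
    """Flag jobs that did not succeed, or whose last success is too old.

    Each job is a dict with "name", "status", and "finished" (a datetime).
    Anything other than "success" is reported, including warnings and
    partial completions.
    """
    alerts = []
    for job in jobs:
        if job["status"] != "success":
            alerts.append((job["name"], f"last run ended with status {job['status']}"))
        elif now - job["finished"] > max_age:
            alerts.append((job["name"], "no successful run within the expected window"))
    return alerts
```

The second branch matters as much as the first: a job that silently stopped running reports no failures at all, so staleness has to be checked separately.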

Capacity and growth forecasting

Backup storage fills up. It always does. When it does, backups silently fail. Monitor storage utilisation across all backup targets, set threshold alerts at 70% and 85%, and forecast growth based on historical trends. Running out of backup storage on a Friday afternoon, when your cloud provider needs 48 hours to provision additional capacity, is entirely preventable.
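The threshold and forecasting ideas above can be sketched as follows. The function names are hypothetical, and the forecast is a deliberately simple straight line between the first and last daily samples; a real trend may need something more careful.

```python
def utilisation_level(used_gb, capacity_gb, warn=0.70, critical=0.85):
    """Classify storage utilisation against the 70% and 85% thresholds."""
    ratio = used_gb / capacity_gb
    if ratio >= critical:
        return "critical"
    if ratio >= warn:
        return "warning"
    return "ok"


def days_until_full(daily_used_gb, capacity_gb):
    """Estimate days until the target is full from daily usage samples.

    Uses average growth between the first and last sample. Returns None
    when there are too few samples or usage is not growing.
    """
    if len(daily_used_gb) < 2:
        return None
    growth_per_day = (daily_used_gb[-1] - daily_used_gb[0]) / (len(daily_used_gb) - 1)
    if growth_per_day <= 0:
        return None
    return (capacity_gb - daily_used_gb[-1]) / growth_per_day
```

For example, a 1,000 GB target that grew from 700 GB to 720 GB over two days is growing at 10 GB per day, which leaves roughly 28 days of headroom.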

Backup integrity verification

A backup that completes without errors is not necessarily a backup that will restore successfully. Enable checksum verification on all backup jobs. Many platforms support automatic integrity checks that validate backup data against the source after each job. This catches silent corruption, storage failures, and transmission errors before they become recovery failures.
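Where a platform does not offer built-in verification, the underlying idea is a checksum comparison. The sketch below uses SHA-256 from the Python standard library; `sha256_of` and `verify_copy` are hypothetical helper names, and it assumes both the source file and the backup copy are readable as ordinary files.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_path, backup_path):
    """Return True when the backup copy is byte-for-byte identical to the source."""
    return sha256_of(source_path) == sha256_of(backup_path)
```

Storing the checksum alongside the backup at creation time also lets you detect corruption later, when the original source has since changed.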

Reporting and compliance documentation

Maintain a weekly backup health report. This should cover job success rates, any failures and their resolution, storage utilisation, and the date of the last successful test restore for each protected system. This documentation serves multiple purposes: operational visibility, audit readiness for Cyber Essentials and ISO 27001, and evidence for cyber insurance claims if you ever need to make one.

Testing your restores

Testing is the part that most organisations skip, and it is the part that matters most. Your backup is only as good as your last successful restore test. Without regular, documented testing, you are relying on hope, and hope is not a recovery strategy.

Monthly file-level restore tests

Pick a random selection of files and folders from different backup sources. Restore them. Verify they open, are complete, and match the originals. This catches the most common backup failures: permissions issues, path-length problems, file corruption, and configuration drift. Document the results. If a monthly test fails, you want to know immediately, not when a real disaster strikes.
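A sketch of the sampling step, assuming the test restore has been written to a separate folder that mirrors the source layout. The function name `sample_restore_check` and the default sample size are assumptions. Files that changed after the backup ran will legitimately differ, so treat a mismatch as a prompt to investigate, not as proof of failure.

```python
import filecmp
import random
from pathlib import Path


def sample_restore_check(source_root, restored_root, sample_size=20, seed=None):
    """Compare a random sample of source files against their restored copies."""
    source_root, restored_root = Path(source_root), Path(restored_root)
    files = sorted(p for p in source_root.rglob("*") if p.is_file())
    rng = random.Random(seed)
    sample = rng.sample(files, min(sample_size, len(files)))
    failures = []
    for original in sample:
        relative = original.relative_to(source_root)
        restored = restored_root / relative
        if not restored.is_file():
            failures.append((str(relative), "missing"))
        elif not filecmp.cmp(original, restored, shallow=False):
            # shallow=False forces a byte-by-byte comparison of contents.
            failures.append((str(relative), "content differs"))
    return failures
```

An empty result is the evidence to record in the monthly test log; anything else is the list of files to investigate.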

Quarterly application and system restores

Restore an entire server or application to an isolated test environment. Boot it. Log in. Verify the application works, the data is current, and the system functions as expected. This tests not just data integrity but the entire recovery process: the procedures, the documentation, and the team's ability to execute under time pressure. Time the exercise. Your actual RTO is whatever this test tells you it is.
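Timing the exercise can be as simple as wrapping the restore procedure in a stopwatch. In this sketch `restore_step` is a hypothetical callable standing in for whatever actually performs the restore, and `timed_restore` is an illustrative name.

```python
import time
from datetime import timedelta


def timed_restore(restore_step, rto):
    """Run a restore procedure and report how long it took against the RTO.

    Returns the elapsed time and whether it was within the stated RTO.
    """
    started = time.monotonic()
    restore_step()
    elapsed = timedelta(seconds=time.monotonic() - started)
    return elapsed, elapsed <= rto
```

Record the elapsed time for each quarterly test; if it regularly exceeds the stated RTO, either the process or the objective needs to change.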

Annual disaster recovery simulation

Simulate a complete site failure. Assume production is gone. Can your team recover critical systems within your stated RTO using only documented procedures and backup data? This exercise exposes gaps that smaller tests miss: dependencies between systems, the order in which services must come back online, licensing issues with restored software, and the human factors around communication and decision-making under pressure.

Post-change validation

Any significant infrastructure change should trigger backup verification. New server deployments, application upgrades, storage migrations, network changes. The backup configuration that worked before the change may not work after it. A five-minute validation check after every change prevents the scenario where you discover your backups stopped working three weeks ago and nobody noticed.

60% of SMEs that suffer data loss close within six months of the incident

93% of companies that lose data for 10+ days file for bankruptcy within a year

14 days: average time to recover from a ransomware attack without tested backups

Common backup myths

Misunderstandings about backup are widespread, even among technically competent teams. These myths persist because they sound reasonable on the surface, but each one represents a genuine risk to your recovery capability. If any of these sound familiar, it is worth re-examining your assumptions.

Microsoft backs up our 365 data

Microsoft provides infrastructure resilience, not data backup. Their SLA covers service availability, not your content. Deleted items are recoverable for a limited window (14-93 days depending on the service), after which they are permanently gone. Retention policies are compliance tools, not backup tools. They cannot restore a mailbox to a point in time three months ago. If you depend on Microsoft 365, you need a dedicated third-party backup solution.

We have RAID, so we are protected

RAID protects against disk failure. It does not protect against ransomware, accidental deletion, corruption, fire, flood, or theft. RAID is an availability control, not a backup control. We have seen organisations lose everything because their 'backup' was a RAID array in the same server room. When ransomware encrypted the production data, it encrypted the RAID array too. RAID and backup solve different problems.

Cloud data does not need backup

Cloud providers protect the infrastructure. You protect the data. If someone deletes a SharePoint site, overwrites a critical file, or an attacker compromises an admin account and wipes a tenant, the cloud provider is not responsible for restoring your content. Cloud-to-cloud backup is the standard recommendation for every SaaS platform that holds business-critical data. The cloud is someone else's computer, and they are not backing up your files.

We have never needed our backups

The fact that you have never had a fire does not mean you cancel your insurance. Backup is your last line of defence against data loss, whether from ransomware, hardware failure, human error, or malicious insiders. The cost of implementing and maintaining a proper backup strategy is a fraction of the cost of losing your data. Every organisation will eventually need to restore something. The only question is whether the backup will be there when they do.

Backup is too expensive for our size

Modern cloud backup for a 20-person organisation typically costs between 150 and 400 pounds per month, depending on data volumes and retention requirements. Compare that to the average cost of a data loss incident for an SME, which runs into tens of thousands of pounds in lost productivity, customer impact, regulatory penalties, and recovery costs. Backup is not an IT expense. It is business continuity insurance, and it is among the most cost-effective investments any organisation can make.

Syncing is the same as backup

OneDrive sync, Dropbox sync, and Google Drive sync are not backups. They are synchronisation tools. If ransomware encrypts files on a synced device, the encrypted versions sync to the cloud and overwrite the originals. If someone deletes a folder locally, it is deleted in the cloud too. Sync provides convenience and availability. Backup provides point-in-time recovery and protection against corruption, deletion, and encryption. They are complementary, not interchangeable.

“Ransomware does not just encrypt your files. It encrypts your options. Immutable, tested backups are the only thing that gives you a choice when everything else is gone.”

Need help with your backup strategy?

We help UK businesses design, implement, and manage backup infrastructure that is genuinely resilient. That includes M365 backup, immutable cloud storage, ransomware-resistant configurations, and regular tested restores with documented evidence.

If you are not sure whether your current backups would survive a real incident, a backup health check takes around an hour and will give you a clear picture of where you stand and what needs addressing.
