To adapt a famous literary phrase, the end-of-year sales period can be the best of times or the worst of times for businesses. Those that have prepared for the influx of customers can offer a better sales experience than those that haven’t, and this still holds true post-pandemic.

Only this time, customers will visit digital storefronts more than physical ones, so the question is: how well-fortified is your online infrastructure for the incoming wave of customers looking for a smooth and stable EOY shopping experience?

Fortunately, there’s still time for retail businesses to prepare. Establishing robust back-end infrastructure to support thousands of web visitors and just as many transactions per minute can take time. However, it can also be achieved in small, incremental steps. Here are four things retail IT teams can do to prepare for the year-end onslaught.

1. Get network data or information you need, no matter what

Whenever technical issues occur, IT teams must use whatever information is available to make split-second decisions that will hopefully prevent a crisis. But useful information, let alone accurate information, was difficult to obtain even when retail businesses were still running their own on-premises infrastructure. Now, with most migrating to cloud services or SaaS solutions, retrieving critical network information is nigh impossible.

This is because third-party vendors are frustratingly unwilling to provide visibility into their back-end systems with clear information on why parts of your network are unresponsive. To counter this ambiguity and get deep clarity, businesses must expand their network monitoring and management capabilities. Prioritise solutions providing network mapping, real-time alerts, and logs designed to help IT teams keep up with the changing demands of EOY sales—and beyond.

2. Stress test constantly, like the business depends on it

The deep network visibility provided by a network monitoring solution will be essential for stress testing. And I don’t mean structured quality assurance that checks everything works per a script. Bombard your network with repeated service requests and simulated traffic, then monitor the data logs to spot parts of the network crumbling and in need of propping up.

More importantly, scale up your stress tests to account for the even greater volumes of anticipated traffic this year. Conducting stress tests at scale provides businesses with a better idea of how far existing infrastructure and systems can stretch, allowing them to proactively provision for extra solutions or capacity.

It also highlights areas of the network prone to hiccups during peak traffic periods, which in turn informs backup and recovery efforts that are essential in keeping online storefronts operational and customers happy.
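As a rough illustration of bombarding a service with simulated traffic and reading the results, here is a minimal Python sketch. Everything in it is an assumption made for the example: `run_load_test` and `fake_storefront` are hypothetical names, and the stand-in storefront simply sleeps and fails every tenth request rather than calling a real endpoint. In practice you would swap in a function that sends a real request to a test environment, and you would likely use a dedicated load-testing tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, total_requests, concurrency):
    """Fire total_requests calls at send_request from a pool of worker threads.

    send_request(i) is a hypothetical callable you supply; it should raise
    an exception when the request fails.
    """
    def timed_call(i):
        start = time.perf_counter()
        try:
            send_request(i)
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(elapsed for _, elapsed in results)
    failures = sum(1 for ok, _ in results if not ok)
    # Approximate 95th percentile: the latency that 95% of requests beat.
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "requests": total_requests,
        "failures": failures,
        "error_rate": failures / total_requests,
        "p95_seconds": latencies[p95_index],
    }


def fake_storefront(i):
    """Stand-in for a real HTTP call: every 10th request fails."""
    time.sleep(0.001)
    if i % 10 == 9:
        raise RuntimeError("simulated 503")


if __name__ == "__main__":
    # Scale total_requests and concurrency up to match anticipated peak traffic.
    print(run_load_test(fake_storefront, total_requests=200, concurrency=20))
```

Re-running the same test with progressively higher `total_requests` and `concurrency` values, and watching where the error rate or 95th-percentile latency starts to climb, gives a first indication of how far the existing setup can stretch.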

3. Build replicability into “high availability” systems 

Many retail businesses initially moved to cloud-based infrastructure because they thought it might be more cost-efficient than maintaining on-prem infrastructure. For many organisations, this proved to be less true than they’d hoped. But never forget the cloud’s other advantage – its scalability. On this, cloud has delivered in spades. Network resources and capacity on the cloud can easily be scaled up or down with just a few clicks, making networks more resilient. Take advantage of that!

Build or migrate mission-critical systems requiring high levels of availability in these environments and ensure elements of these systems can easily be replicated should the need arise. For instance, ensure websites or customer databases are deployed on cloud environments allowing for easy scaling of capacity or performance. Enable the same level of replicability for on-prem, containerised, or virtualised services, and you’ll be prepared to an extent whenever traffic surges occur.

And while it should go without saying, I’ll say it anyway—moving anything from on-prem to cloud requires you take the proper steps to ensure the migrated system is secure. Because nobody is going to care how scalable or robust your application is if it’s not secure.

4. Deprioritise backups and prioritise restores

It’s surprising how many businesses have a comprehensive backup plan involving regularly scheduled backups of mission-critical systems, but don’t pay as much attention to the restoration of those backups. Knowing which systems to restore first, which restore points are stable, and how fast restoration must occur during an outage is arguably as important as having an extensive library of backups.

For restoration, the two most important metrics to focus on are Recovery Point Objective (RPO) and Recovery Time Objective (RTO). The first defines how much data the business can afford to lose, measured as the maximum acceptable time between the last usable backup and an outage, while the second defines how much downtime is acceptable before the customer experience is impacted. Use your simulated stress tests to help set baselines for both metrics, and ensure your IT team has a restoration plan in place for when the inevitable happens.
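To make the two metrics concrete, here is a small Python sketch that scores a restore drill against them. It takes RPO in its commonly used sense (the maximum tolerable window of lost data) and RTO as the maximum tolerable downtime. The function name `check_recovery_objectives`, the timestamps, and the 30-minute and 45-minute targets are all hypothetical values chosen for illustration, not recommendations.

```python
from datetime import datetime, timedelta


def check_recovery_objectives(last_backup, outage_start, service_restored, rpo, rto):
    """Compare one outage (or restore drill) against RPO and RTO targets."""
    # Data written between the last usable backup and the outage is lost.
    data_loss_window = outage_start - last_backup
    # Customers are affected from the outage until service is restored.
    downtime = service_restored - outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


if __name__ == "__main__":
    # Hypothetical drill: backup at 11:45, outage at 12:00, restored at 12:50.
    result = check_recovery_objectives(
        last_backup=datetime(2021, 11, 26, 11, 45),
        outage_start=datetime(2021, 11, 26, 12, 0),
        service_restored=datetime(2021, 11, 26, 12, 50),
        rpo=timedelta(minutes=30),  # assumed target
        rto=timedelta(minutes=45),  # assumed target
    )
    print(result)
```

In this made-up drill the 15 minutes of lost data falls within the RPO, but the 50 minutes of downtime exceeds the RTO, which would point to speeding up the restore process rather than backing up more often.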

Altogether, these measures should keep the EOY shopping experience smooth for your customers and stress-free for your IT team. But even when every precaution is taken, retail businesses would do well to expect the unexpected and prepare for the worst. Nothing enables issues to crop up, cascade, and impact the customer experience more than a lack of vigilance. Only businesses that pay attention to the needs of both their customers and network will come out on top this holiday season.

Leon Adato is Head Geek at SolarWinds.