Resiliency Engineering

How to strengthen critical platforms to avoid outages and recover quickly from failures.

Financial services institutions are rightly using the latest technologies to modernize trading, real-time analytical decision-making and business operations. However, it’s just as important to ensure the various components form a resilient platform that can withstand the tribulations of black swan events.

Recent outages have shone a light on the vulnerability of critical platforms and on the need for new tools and processes to keep them strong. COVID-19 isn’t the only culprit. Brexit, interest-rate announcements, large market movements and negative oil prices have all strained and even broken critical platforms – exposing weaknesses that were always there.

Site Reliability Engineering (SRE) is the formal engineering method for creating reliable software systems. Within that discipline, Publicis Sapient offers a solution that integrates critical platforms into chaos testing scenarios and tracks their fortitude with advanced monitoring tools.

We call it Resiliency Engineering.

The resiliency engineers at Publicis Sapient can identify specific problems in about six weeks. We review platforms and patterns against our proprietary Resiliency Maturity Model and tell companies how to remedy their weaknesses and blind spots.

From the entire platform to individual application architecture components and critical downstream interactions, we evaluate availability both for individual units and for the platform as a whole, and then roll out resiliency capabilities across the organization.
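To illustrate the difference between unit-level and platform-level availability, here is a minimal sketch in Python. It is not Publicis Sapient's Resiliency Maturity Model. The component names and availability figures are hypothetical, and the calculation assumes components fail independently, which real platforms often do not.

```python
from math import prod


def serial_availability(components: dict[str, float]) -> float:
    """Availability of a request path in which every component must be up.

    Assumes independent failures (a simplification).
    """
    return prod(components.values())


def redundant_availability(replica: float, copies: int) -> float:
    """Availability of `copies` identical replicas where one healthy replica suffices.

    Assumes independent failures (a simplification).
    """
    return 1 - (1 - replica) ** copies


# Hypothetical components on one customer-facing request path.
components = {
    "gateway": 0.999,
    "order_service": 0.995,
    "market_data_feed": 0.99,
}

platform = serial_availability(components)
weakest = min(components, key=components.get)

print(f"platform availability: {platform:.4%}")
print(f"weakest unit: {weakest}")
print(f"weakest unit with 2 replicas: {redundant_availability(components[weakest], 2):.4%}")
```

Under these assumptions the platform as a whole is less available than any single unit on the path, which is why both views are worth measuring.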

Companies need to test how the components of their technological infrastructure react under highly volatile and possibly dire circumstances.
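As a rough illustration of this kind of test, the sketch below injects faults into a stand-in downstream dependency and measures how often a retrying caller still succeeds. Everything here is hypothetical (the `FlakyDependency` class, the `get_quote` call, the failure rates), and it is far simpler than a production chaos experiment, which would target real infrastructure.

```python
import random


class FlakyDependency:
    """Hypothetical downstream service that fails at a configurable rate."""

    def __init__(self, failure_rate: float, seed: int = 0):
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so experiments are repeatable

    def get_quote(self, symbol: str) -> dict:
        # Inject a fault with probability `failure_rate`.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault")
        return {"symbol": symbol, "price": 100.0}


def get_quote_with_retry(dep: FlakyDependency, symbol: str, attempts: int = 3):
    """Call the dependency, retrying on failure; return None if all attempts fail."""
    for _ in range(attempts):
        try:
            return dep.get_quote(symbol)
        except ConnectionError:
            continue
    return None


def run_experiment(failure_rate: float, requests: int = 1000, attempts: int = 3) -> float:
    """Return the fraction of requests that succeed under the injected failure rate."""
    dep = FlakyDependency(failure_rate, seed=42)
    succeeded = sum(
        get_quote_with_retry(dep, "ABC", attempts) is not None
        for _ in range(requests)
    )
    return succeeded / requests


if __name__ == "__main__":
    for rate in (0.0, 0.3, 0.6, 1.0):
        print(f"failure rate {rate:.0%}: success ratio {run_experiment(rate):.3f}")
```

Running the same experiment with `attempts=1` shows how much of the observed resilience comes from the retry logic rather than from the dependency itself.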

Publicis Sapient shows financial institutions how to establish an end-to-end SRE culture and provides the right tools, techniques and accelerators for the job. For comprehensive assignments, our teams can perform complete resiliency engineering for complex applications in up to five months.

Many financial institutions have suffered significant outages in recent years – Chime, Robinhood, etc. – because their platforms couldn’t cope with the number of interacting components or with volatility.

In the past, this wasn’t as much of a problem because they used siloed legacy platforms that did not communicate with customers. However, when a stock trading platform fails during a run on the market, the company may be liable for compensating users for their losses.

In other words, customer-facing apps need to be available 24/7.

Companies have to transform digitally so they can scale up digital products and services. They need to migrate legacy technologies to new platforms. With so many components talking to each other, companies should plan for outages, performance testing and resiliency, but many do not.

Based on practical experience, we help accelerate our clients’ journey towards measuring and improving platform resiliency through three pillars:

Three pillars

Engineering transformation has always been a key concern for our teams when helping financial services clients. They have focused intently on measuring, monitoring and proactively addressing the resiliency concerns of critical platforms.

This focus has only intensified in light of new regulatory requirements in the United Kingdom and United States, decentralized compute and services (cloud-based solutions, FinTech platforms, microservices, etc.) and continued volatility.

 

Gaurav Verma
Senior Director of Program Management
