Architecture Strategy for Meeting Performance NFRs within a Digital Transformation

Identify your most important non-functional requirements (NFRs) as early as possible in your digital transformation journey. Good architects make sure there are no disconnects here.

“Performance” is one such NFR that requires a broad and clear understanding across your entire program. A good architect will make sure that they have the right information and projections from their reliable and accountable business sources in order to make determinations about technology and patterns to be used to support those projections. They will then translate this into technical data to support design decisions relating to performance, load, and scale.

A good architect will also validate this information because, as I always say, technical leaders are accountable not only for their decisions but also for the quality of the information those decisions are predicated on.

Architecture and design work takes these NFRs into account, but NFRs aren’t met until the systems are built and validated.

Everything from architecture and build quality to executive sponsorship and project management can reduce the likelihood of meeting performance or other NFR objectives.

The following points are crucial to ensuring your performance objectives are met:

  1. Consistent and clear communication about the need to meet performance objectives and what it means if you don’t meet them.
  2. Clearly articulate the risks. Is your biggest risk that the system will just run a little slow, and is that acceptable? Or is the risk much greater: a system that cannot meet demand and crashes repeatedly under load, leading to failures, service restarts, and information loss?
  3. Alignment on risks with program team, business, and sponsors along with what the business impact is.
  4. Documentation that remains up to date and widely distributed, with work tracked as part of your ALM/PM/Agile system (Jira, DevOps, etc.).
  5. Development leadership to ensure everything going into the system has been implemented in a way that will help meet performance objectives (for example, making sure the development team understands the importance and usage of parallelism, multi-threaded processing, async operations, and batching).
  6. Testing in place, planned, and aligned with the program to validate performance of components as they are built. Don’t save this until the end of the program.
  7. Systems and teams in place in order to orchestrate the required performance and load testing of the entire system — for example: Do you need new enterprise performance testing software and expertise? Can you use existing in-house resources and systems?
  8. Minimum metrics that are defined and agreed upon and that, if not met, could delay promotion into production.
  9. Ensure you (or the lead architect for the program) have a seat at the table when meeting with program sponsorship and steering committees in order to provide a balanced perspective on the intersection of the business value to technology and any risks entailed. This also ensures that clear and accurate messaging is propagated and not filtered through the lens of the business or project management.
  10. Architecture and design work that specifically calls out where patterns and specific approaches are to be used in order to meet performance NFR objectives, and that specifically points out the areas of the system that are most critical for performance work.
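
To make points 6 and 8 a little more concrete, here is a minimal sketch of what an automated check against agreed minimum metrics could look like. Everything in it is an assumption for illustration: the `PerformanceGate` class, the `evaluate_gate` function, and the choice of p95 latency and error rate as the metrics are hypothetical, and your program would agree on its own metrics and thresholds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceGate:
    """Hypothetical minimum metrics agreed with the program team."""
    max_p95_latency_ms: float
    max_error_rate: float


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_gate(gate, latencies_ms, errors, requests):
    """Return a list of failure messages; an empty list means the gate passes."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    failures = []
    p95 = percentile(latencies_ms, 95)
    if p95 > gate.max_p95_latency_ms:
        failures.append(
            f"p95 latency {p95:.0f} ms exceeds {gate.max_p95_latency_ms:.0f} ms"
        )
    error_rate = errors / requests
    if error_rate > gate.max_error_rate:
        failures.append(
            f"error rate {error_rate:.2%} exceeds {gate.max_error_rate:.2%}"
        )
    return failures
```

A check like this could run against the results of each component-level load test as it is built, so that a missed metric surfaces as a tracked item long before the production decision.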

As you can see, there are a lot of moving pieces involved in ensuring you end up with a system that adequately handles the load, including peak load, and meets its performance needs.
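
As a small illustration of the async operations and batching mentioned in point 5 of the list, the sketch below shows I/O-bound calls run concurrently with a bound on how many are in flight, plus a simple batching helper. It is only a sketch: `fetch_record` is a hypothetical stand-in for a real database or HTTP call, and the concurrency limit and batch size are arbitrary values chosen for the example.

```python
import asyncio


async def fetch_record(record_id):
    """Hypothetical stand-in for an I/O-bound call (database, HTTP, queue)."""
    await asyncio.sleep(0.01)  # simulates waiting on the network
    return {"id": record_id}


async def fetch_all(record_ids, max_concurrency=10):
    """Fetch records concurrently, never exceeding max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(record_id):
        async with semaphore:
            return await fetch_record(record_id)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(r) for r in record_ids))


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    records = asyncio.run(fetch_all(list(range(25)), max_concurrency=5))
    for batch in batches(records, 10):
        print(len(batch), "records in this batch")
```

The bound matters as much as the concurrency: unbounded fan-out can move the bottleneck to a downstream system, which is exactly the kind of behaviour that component-level performance testing should catch early.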

Many projects fail because there wasn’t enough consideration of, and alignment on, the need for performance and scale from either architecture or management. I’ve seen projects (not my projects) go into production and fail on the first day because they just couldn’t handle the load, requiring the project to be pulled (very expensive) and reworked for six more months. The rush to production didn’t pay off, and it’s (un)surprising how often this happens.

It’s too easy for architects to be oblivious to this or to let it slide: to not be proactive, and to not call out bad behaviours, anti-patterns, or lack of progress in meeting performance NFRs.

Naturally, if a program fails and performance is to blame, we look at the architect and the architecture. There is good reason for this: one of the responsibilities of the architect is to make sure we have alignment across all of the moving pieces required so that NFRs like performance will be met, and are being met, within the architect’s domain. The architect is a primary strategic and technical stakeholder who sets the technical direction along with the technical leads and SMEs, who are in turn responsible for their own direction and vision as aligned to the architecture.

Architects need to be proactive, and they need to provide leadership. If something becomes critical (for example, the project team removes performance testing from scope or continually defers it for short-term gain), then it needs to be called out, and serious discussions need to be started about how we can reasonably meet performance requirements and maintain a steady pace to production.

Some architects will say, “It’s not my job. My job is to provide a design, and if it’s not followed and they want to do something else, then that’s up to them,” or “I don’t want to speak up because I know they don’t want to hear it,” or “Nobody else is bringing it up or concerned, so…” These are all signals of avoidance, and this is a bad value proposition. It’s the kind of (un)value proposition that leads directly to program failure due to the lack of leadership coming from architecture.

Part of an architect’s job is to lead and to ensure and maintain alignment throughout a program up until production launch and even beyond.

As an architect, you want to win. You want things not to fail.

Of course, within your realm there are other people that make decisions and provide leadership also, including executives. It’s completely reasonable within their role to be able to make decisions even if other people, including architects, disagree with them.

These decisions could negatively impact the vision or reduce the likelihood that NFRs are met (to keep the example in context), but sometimes they need to be made. In these cases, it’s important that the architect works with the program team to ensure that the decision is predicated on accurate information, and to validate that the appropriate risk acceptance is in place and documented by the program team. Many people think accepting performance risk means accepting that the system will run a little slow. That’s a huge fallacy. Architects must educate and clearly articulate what it actually means, or could mean. “There is a 25% chance, based on recent testing, that we’ll intermittently lose customers’ production data in a three-month period if we go into production as-is” is a completely different message than “it will just be a little slow”.

Ensuring risk acceptance where it’s needed, when a solution will not be compliant with the architecture or agreed-upon NFRs, is a responsibility of architecture and not something that can be relegated to program teams at their own discretion. Architects own the vision of how technology is going to achieve business success and they own the core facets of the architecture required to achieve that success, and therefore they are accountable for identifying and raising the issue when core facets of the design are not implemented.

Going into production with missed requirements means the appropriate business sponsor must clearly understand and be willing to accept the business risk, so the risk must be adequately translated into what it means for the business. There may be some amount of risk for technical teams to accept (such as increased support required in the event of failure), but remember that it’s the business that ultimately accepts and takes ownership of the risk for the business impact and not the technical leads or teams.

Enjoy 😉

-Dan
