How to use New Relic for Performance Engineering and Load Testing


Performance engineering and load testing are critical parts of any modern software organization’s toolset. In fact, it’s increasingly common to see companies field dedicated load testing teams and environments—and many companies that don’t have such processes in place are quickly evolving in that direction.

Driven by key performance indicators (KPIs), performance engineering and load testing for software applications have three main goals:

  1. To prove the current capacity of the application
  2. To identify limiting bottlenecks in the application’s code, software configurations, or hardware resources
  3. To increase the application’s scalability to a target workload

More specifically, a typical load test might look like this:


While there are plenty of tools available for generating the user load for a performance test, the New Relic platform (particularly New Relic APM, New Relic Infrastructure, and New Relic Browser) provides in-depth monitoring and features that can give crucial insights into the analysis of such tests—from browser response times to user sessions to application speed to utilization of backend resources. Teams that instrument their load testing environments with New Relic get complete end-to-end visibility into the performance of their applications.

This blog post presents a prescriptive 12-step overview (divided into 3 parts) of how to use New Relic for methodical load testing and root-cause analysis in your performance engineering process.

Part 1: Set a baseline and identify current capacity

The first step is to set up a load test and slowly increase the load until your application reaches a bottleneck.

1. Starting with a minimum user load (for example, 5 concurrent users), execute a load test that lasts at least 1 hour. The result of this low load test will serve as your baseline.

Pro tip: If the results of your baseline load test show that transaction response times already exceed your service level agreements (SLAs), there’s no reason to further test for scalability. You can proceed to the next step.

2. Using the baseline load test results, set an acceptable Apdex threshold (Apdex T) for your application. The Apdex score will be your gauge of how your application’s response times compare with that threshold. Create key transactions for those specific transactions that take longer than your overall SLA. For example, for a typical web application, the Browser Apdex T value could be 3 seconds. The APM Apdex T value for a Java application could be 0.5 seconds. If your application is a collection of microservices that process transactions via APIs, the Apdex T could be 0.2 seconds for each service. The idea is to set an appropriate Apdex threshold for every service that executes transactions.

Monitoring an application’s Apdex score.
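To make the threshold values in step 2 concrete, here is a minimal sketch of the standard Apdex calculation: responses at or below the threshold T count as satisfied, responses between T and 4T count as tolerating (at half weight), and anything slower counts as frustrated. The function name and the sample response times are made up for illustration; New Relic computes the score for you, so this is only meant to show how the choice of T shapes the result.

```python
def apdex(response_times, t):
    """Return the Apdex score for response times (in seconds) given threshold t."""
    if not response_times:
        raise ValueError("need at least one response time")
    # Satisfied: at or below the threshold.
    satisfied = sum(1 for r in response_times if r <= t)
    # Tolerating: above the threshold but no more than four times the threshold.
    tolerating = sum(1 for r in response_times if t < r <= 4 * t)
    # Frustrated samples contribute nothing to the numerator.
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical baseline samples, scored against a 0.5-second threshold.
samples = [0.2, 0.4, 0.5, 0.9, 1.8, 2.5]
score = apdex(samples, 0.5)
print(round(score, 2))
```

With these samples, three responses are satisfied, two are tolerating, and one is frustrated, so the score is (3 + 1) / 6, or about 0.67.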

3. Design and execute a load test that methodically increases the number of users. Throughput and user load goals are unique to each application. For example, you could start the load test with 5 concurrent users and add 5 more users every 15 seconds. As you increase the user count, your load test will slowly approach the point of performance degradation, which will give you an understanding of how much load your application can handle.
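As a rough illustration of the ramp described in step 3, the sketch below models a stepped schedule that starts at 5 users and adds 5 more every 15 seconds. The function names and defaults are hypothetical and not tied to any particular load generator; most tools let you express the same schedule in their own configuration.

```python
def users_at(elapsed_s, start_users=5, step_users=5, step_interval_s=15):
    """Return the number of concurrent users at a given elapsed time in seconds."""
    completed_steps = int(elapsed_s // step_interval_s)
    return start_users + completed_steps * step_users


def seconds_to_reach(target_users, start_users=5, step_users=5, step_interval_s=15):
    """Return how long the ramp takes to first reach target_users."""
    if target_users <= start_users:
        return 0
    # Ceiling division: a partial step still needs a full interval.
    steps = -(-(target_users - start_users) // step_users)
    return steps * step_interval_s


print(users_at(60))           # users one minute into the test
print(seconds_to_reach(100))  # seconds until the ramp hits 100 users
```

Under these assumed settings the test is running 25 users after one minute and reaches 100 users after 285 seconds, which helps you estimate how long the test must run to pass the expected degradation point.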

Pro tip: Be methodical in designing your load tests. Don’t throw the full target workload at an application all at once, or you’ll be left with chaotic results that are difficult to interpret. For example, if your goal is to reach 5,000 concurrent users, first design a load test that reaches half that target. If the application scales successfully to the halved target load, then go ahead and design the next test to double the load.

Additionally, if you’re load testing throughput rather than users or active sessions, you can use the same approach to gradually reach the target number of transactions per second. For example, if the throughput goal of your API is 200 transactions per second, start with a load test that will scale to reach 100 transactions per second.
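The halve-then-double approach works the same way for users and for throughput, and can be sketched as a short list of per-test targets. The helper below is hypothetical; the `stages` parameter is an assumption that generalizes the two-test example above to cases where you want to start even lower.

```python
def staged_targets(goal, stages=2):
    """Return per-test load targets that double each time and end at the goal."""
    if stages < 1:
        raise ValueError("stages must be at least 1")
    # With the default of 2 stages this yields [goal / 2, goal].
    return [goal / 2 ** (stages - 1 - i) for i in range(stages)]


print(staged_targets(5000))            # concurrent users
print(staged_targets(200))             # transactions per second
print(staged_targets(5000, stages=3))  # a more cautious three-test plan
```

Only move on to the next target in the list once the application has scaled cleanly to the current one.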

4. In the APM Overview page for your application, change the view to see Web transactions percentiles and concentrate on the 95th percentile line, which is more sensitive to slow requests than the median or average.

Tracking web transactions’ 95th percentile.

Highlight and zoom into the timeframe at a point just before the load test began to degrade. From this time span, you can perform deeper analysis (for example, dive into transaction traces, distributed traces, and errors), or switch from APM to Browser for frontend-to-backend analysis. New Relic automatically keeps this isolated timeframe in focus.
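To see why step 4 favors the 95th percentile over the median or average, consider a small, invented set of response times in which most requests are fast and a few are slow. The sketch below uses a simple nearest-rank percentile; New Relic's own percentile calculation may differ in detail, so treat this purely as an illustration.

```python
import math
import statistics


def percentile(values, p):
    """Return the p-th percentile (p as an integer 1-100) using the nearest-rank method."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical sample: 18 fast requests and 2 slow ones, in seconds.
response_times = [0.2] * 18 + [3.0] * 2

median = statistics.median(response_times)
mean = statistics.mean(response_times)
p95 = percentile(response_times, 95)
print(median, round(mean, 2), p95)
```

Here the median stays at 0.2 seconds and the mean rises only to about 0.48 seconds, while the 95th percentile jumps to 3.0 seconds, so the slow requests that signal early degradation show up clearly.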

Pro tip: The key part of this test is identifying the first bottleneck. You don’t need to worry about what’s happening in the chart after the first buckle point—anything beyond that point is just a symptom that you should differentiate from root causes.
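If you export per-interval results from a test, finding the first buckle point can be sketched as a scan for the first interval whose 95th percentile response time exceeds the baseline by some margin. The function name, the data, and the factor of 2 are all assumptions chosen for illustration; pick a margin that reflects your own SLAs.

```python
def first_buckle_point(series, baseline_p95, factor=2.0):
    """Return the user count at the first interval where p95 exceeds baseline * factor.

    series is an iterable of (concurrent_users, p95_seconds) pairs in time order.
    Returns None if the threshold is never crossed.
    """
    threshold = baseline_p95 * factor
    for users, p95 in series:
        if p95 > threshold:
            return users  # Stop at the first crossing; later points are symptoms.
    return None


# Hypothetical per-interval results from a ramped load test.
results = [(5, 0.21), (10, 0.22), (15, 0.25), (20, 0.60), (25, 1.90), (30, 0.80)]
buckle = first_buckle_point(results, baseline_p95=0.2)
print(buckle)
```

In this made-up data the first crossing happens at 20 concurrent users, and everything after that interval is ignored, in line with the advice to focus on the first bottleneck only.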

Part 2: Isolate the first bottleneck

As you troubleshoot the performance degradation, perform steps 5-9 below in whatever order makes the most sense in your situation. For example,…