Optimizing Lambda Reserved Concurrency
This lab references the code in the aws-connectedcar-dotnet-serverless repository. If you're new to this course, see the introduction for information about setting up your workstation and getting the sample code.
In this lab we’re going to look at the impact of different reserved concurrency settings. This will require load testing, which means we need to calculate the traffic load that we’re targeting. The table below shows how we arrive at a metric that our load testing tool can replicate.
In a nutshell, we start with a (made up) number of registered vehicles, then plug in a percentage for how many are active at any given moment, which yields the total number of active vehicles. We then plug in another (made up) number for how frequently each vehicle’s head unit queries the “Get Events” endpoint, to arrive at a total number of requests per hour. Dividing that by 3,600 gives us a requests per second value, which is easier to track in the Postman Canary load testing tool.
Item | Value |
---|---|
Registered vehicles | 300,000 |
Percent active at any point | 4% (0.04) |
Active vehicles | 12,000 |
Requests per vehicle per hour | 4 |
Total requests per hour | 48,000 |
Total requests per second | 13.3 |
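The arithmetic behind these rows works out as follows:

```
300,000 registered vehicles × 0.04       = 12,000 active vehicles
12,000 active vehicles × 4 requests/hour = 48,000 requests/hour
48,000 requests/hour ÷ 3,600 seconds     ≈ 13.3 requests/second
```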
Testing Default Reserved Concurrency
Step 1: Check the reserved concurrency in the console
Our first test will use the default behavior that applies when we don’t specify a reserved concurrency value in the templates. As you can see from the console below, in this case the function draws on the unreserved account-level concurrency pool, which has a default limit of 1000. Note that for all the tests in this lab, we’re continuing with the performance-optimized memory size of 2048 MB for the GetEvents Lambda.
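For reference, the function definition at this point carries no concurrency property at all. The sketch below shows roughly what the relevant resource looks like in a SAM template; the resource name here is a placeholder rather than the exact name used in the repository:

```yaml
GetEventsFunction:                    # placeholder name; see vehicle.yaml for the actual resource
  Type: AWS::Serverless::Function
  Properties:
    MemorySize: 2048                  # performance-optimized size carried over from the previous lab
    # No ReservedConcurrentExecutions property, so the function draws on the
    # unreserved account-level concurrency pool (default limit: 1000)
```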
Step 2: Run the load test in Postman Canary
Now, run the load test in Postman Canary using the same settings as in the previous lab:
After ten minutes have elapsed and the load test has finished, you should see results that look something like the screen below. Note that the requests per second metric, shown at the top, is in the ballpark of our target:
Step 3: Graph the concurrent execution and API error metrics in the console
Now, go to the AWS console and run a couple of CloudWatch metrics queries. For the first query, use the AWS/Lambda namespace and the MAX(ConcurrentExecutions) metric, without any Function-level filter. Remember to set the time span at the top of this tab. Then, on the “Graphed metrics” tab, set the time sampling period to one minute. Having done so, you should see results similar to those shown below, where the number of concurrent executions drifts upward during the test:
For the second query, you need to check for API-level errors, using the AWS/ApiGateway namespace and the SUM("5XXError") metric. You should see a result indicating that there were no errors, as shown below:
Testing Limited Reserved Concurrency
The previous test let the number of concurrent Lambda executions grow with essentially unlimited headroom. Now we’ll see the impact of capping the number of concurrent executions at a small value.
Step 4: Set the reserved concurrency property in VS Code
Open the vehicle.yaml template in VS Code and add the ReservedConcurrentExecutions property to the GetEvents Lambda resource, as shown below. For this test, set the value of this property to two:
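The change amounts to a single added property. A minimal sketch, again using a placeholder resource name:

```yaml
GetEventsFunction:                    # placeholder name; match the actual resource in vehicle.yaml
  Type: AWS::Serverless::Function
  Properties:
    MemorySize: 2048
    ReservedConcurrentExecutions: 2   # cap this function at two concurrent execution environments
```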
As always, once the deploy.zsh script has executed, double-check the value of this configuration setting in the console. You should see the updated value, as shown below:
Step 5: Run the load test in Postman Canary again
Now run the load test in Postman Canary once more. After ten minutes, you should see results similar to those shown below. Note the error rate metric shown at the top right, which has gone up to 4.05%:
Step 6: Graph the concurrent execution and API error metrics again
Switch back to the console and re-run the concurrent executions metric query in CloudWatch. You should see the maximum value flatline at two, as shown below:
When you run the AWS/ApiGateway SUM("5XXError") metric query again, you should see a non-zero number of errors, matching what was reported on the client side in Postman:
Finally, it’s instructive to look at the AWS/Lambda SUM(Throttles) metric, as shown below. This shows the number of times that Lambda invocations were throttled because no instances were available under the reserved concurrency setting. Some of these requests were blocked long enough to time out in API Gateway, which resulted in the error rate seen above.
Testing Optimized Reserved Concurrency
The previous test showed what happens when you apply a reserved concurrency setting that doesn’t provide enough Lambda instances to handle the incoming traffic. We’ll now try running a test that limits the reserved concurrency, but still provides enough Lambda instances to handle the load without errors.
Step 7: Set the reserved concurrency property in VS Code again
Once more, in VS Code, update the ReservedConcurrentExecutions property, this time to a value of eight:
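The edit is the same single property as before, with the new value:

```yaml
GetEventsFunction:                    # placeholder name, as before
  Type: AWS::Serverless::Function
  Properties:
    MemorySize: 2048
    ReservedConcurrentExecutions: 8   # raised from 2, giving headroom for the ~13.3 requests/second target
```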
Once again, after the deploy.zsh script has executed, double-check the configuration in the console:
Step 8: Run the load test in Postman Canary again
Run the load test one more time. After ten minutes, the results should look like the screen below, notably without any errors reported:
Step 9: Graph the concurrent execution and API error metrics again
Run the AWS/Lambda MAX(ConcurrentExecutions) metric query again, which should show a lower number of peak concurrent executions than we witnessed with the first load test:
The AWS/ApiGateway SUM("5XXError") metric should also show no errors, to match the result we saw in Postman:
And lastly, the AWS/Lambda SUM(Throttles) metric should show a value that’s either low or zero, as shown below:
Optimizing Scalability vs Account Limits
Unlike the previous lab, there isn’t a trade-off here between performance and cost. Instead, for reserved concurrency you need to find the threshold at which you begin to observe throttling and errors for the anticipated traffic load, and apply a setting that is safely above it. You then need to be aware of your overall account concurrency limit, because you may need to increase it to provide enough aggregate reserved concurrency for all your deployed Lambdas.