Optimizing Lambda Memory Size
This lab references the code in the aws-connectedcar-dotnet-serverless repository. If you're new to this course, see the introduction for information about setting up your workstation and getting the sample code.
This lab and the next will cover a pair of configuration settings for Lambdas that affect their performance, scalability, and cost. We’ll start in this lab with the memory size setting. One of the quirks of the Lambda service is that you can’t specify CPU allocation directly. Instead, CPU is allocated in steps that track the memory size you configure. We’ll explore some further wrinkles of this arrangement as we test different memory size configurations.
Setting Up Postman for Basic Load Testing
The Canary release of Postman, which you’ll have installed during your workstation setup, includes a makeshift load testing feature that we’ll make use of here. If you’re used to working with a full-featured load testing tool like JMeter, then this is not a replacement. But, this feature serves our purpose by making it easy to scale up requests for multiple simulated users. We’ll use it in this lab to get some memory and CPU metrics that can be averaged over a number of requests. We’ll also use this feature in the next lab as a way to simulate a production load for a Lambda.
Step 1: Generate event test data with Postman
Our first step is to configure the global variables in Postman for the “Admin_API” and “Vehicle_API” collections, and then run both in sequence to seed some test data.
So, using the query-outputs.zsh and query-attributes.zsh scripts as you did previously, get values for the AdminApi, apiKey, and vehicleApi variables and set them in Postman (the CLI sketch after the next paragraph gives a rough idea of what those scripts do for you). Then run the “Admin_API” collection. When it completes, run the “Vehicle_API” collection. The results for the latter should look something like the screen shown below:
Assuming the “Vehicle_API” collection runs successfully, run it again three more times to populate a series of event items.
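As an aside, here’s roughly the kind of thing those query scripts are doing for you: reading CloudFormation stack outputs and the API key value with the AWS CLI. The stack name, output key, and API key name below are hypothetical placeholders rather than the actual names used by the sample code, so treat this purely as a sketch:

```zsh
# Read an API endpoint from the CloudFormation stack outputs
# ("connectedcar-vehicle-dev" and "VehicleApi" are hypothetical names).
aws cloudformation describe-stacks --stack-name connectedcar-vehicle-dev \
  --query "Stacks[0].Outputs[?OutputKey=='VehicleApi'].OutputValue" --output text

# Read the API key value ("ConnectedCar" is a hypothetical key name);
# --include-values is required for the key value to be returned.
aws apigateway get-api-keys --name-query ConnectedCar --include-values \
  --query 'items[0].value' --output text
```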
Step 2: Set up a performance workspace in Postman Canary
Next, open the Postman Canary application and create a new “ConnectedCar_Performance” workspace. Then, in this workspace, import the “vehicle.postman_collection.json” and “postman_globals.json” files from the “/postman/performance” folder of the aws-connectedcar-common repository. The first import should create a subset of the “Vehicle_API” collection that contains only the one test, as shown below:
You also need to open the editor page for the global variables and copy in the corresponding values from the original Postman application, as shown below. For the “vin1” variable, make sure you copy the “Current value” from Postman into both the “Initial value” and “Current value” columns in Postman Canary:
Running a Baseline Performance Test
We’ll now set the memory size for the GetEvents Lambda to 1024MB and run an initial load test to establish a baseline.
Step 3: Reset the memory size to the 1024MB baseline in VS Code
Open the vehicle.yaml template in VS Code, and on the same line that we added in the previous lab, set the MemorySize property to a value of 1024, as shown below:
Run the deploy.zsh script in the terminal, and once the outputs appear, go to the console to confirm the configuration update. The Memory property should look like this, under the “General configuration” option for the “ConnectedCar_Vehicle_GetEvents_Dev” Lambda:
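If you’d also like to verify the setting from the terminal, the deployed value can be read back with the AWS CLI, using the function name shown in the console:

```zsh
# Should print 1024 once the deploy has finished
aws lambda get-function-configuration \
  --function-name ConnectedCar_Vehicle_GetEvents_Dev \
  --query 'MemorySize'
```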
Step 4: Run the load test in Postman Canary
You’re now ready to use Postman Canary to run a small load test against the GetEvents Lambda. In the new performance workspace, select the “Vehicle_API” collection at the left, then click the “Run” button at the top-right of the screen. Then, on the “Runner” tab, click the “Performance” sub-tab on the panel at the right of the screen. You should now see the load testing options. Keep the default number of virtual users and the default test duration, and just add a two-minute ramp-up, as shown below:
Click the orange “Run” button and let the test execute for the next ten minutes. At the end, with all four graph options selected at the bottom right, you should see results like those shown below:
Step 5: Capture the memory usage and CPU duration metrics
You’ve probably noticed that the load test screen in Postman Canary has some metrics displayed at the top. But these are client-side metrics that include over-the-wire latency, which is not what we’re looking for. Instead, let’s get all our metrics from the service instrumentation in CloudWatch.
First, define a query on the “All metrics” page to capture the average of the maximum memory used per request. To do this, select the “LambdaInsights” namespace, and the AVG(used_memory_max) metric name, as shown below. Assuming this is the only traffic in your account, you won’t need to filter for a specific Lambda:
There are still two more things to set. Select the “Number” option for the type of graph, at the top right. Then, to the left of that drop down, filter the time to only cover when the load test was running. Now click the orange “Run” button to execute the query. Once the results are displayed, switch to the “Graphed metrics” tab and set the “Period” to one minute. You should now see a result that looks something like this, indicating a maximum memory used of 149MB averaged across all the Lambda invocations:
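The same figure can also be retrieved from the terminal. This is a sketch: the function_name dimension for the Lambda Insights namespace is my assumption, and the start and end times need to match your actual test window:

```zsh
# Average of used_memory_max (in MB) over the load test window.
# --period 600 returns a single datapoint covering the whole ten minutes.
aws cloudwatch get-metric-statistics \
  --namespace LambdaInsights \
  --metric-name used_memory_max \
  --dimensions Name=function_name,Value=ConnectedCar_Vehicle_GetEvents_Dev \
  --statistics Average \
  --start-time 2025-01-01T10:00:00Z \
  --end-time 2025-01-01T10:10:00Z \
  --period 600
```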
Lastly, run a similar query, but for the AWS/Lambda namespace and the Duration metric. Your result for this query should look something like this, indicating an average processing time of 13.9 milliseconds:
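For comparison, the corresponding CLI query targets the AWS/Lambda namespace, where the dimension is FunctionName and Duration is reported in milliseconds:

```zsh
# Average request duration (ms) over the same load test window.
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=ConnectedCar_Vehicle_GetEvents_Dev \
  --statistics Average \
  --start-time 2025-01-01T10:00:00Z \
  --end-time 2025-01-01T10:10:00Z \
  --period 600
```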
Running a Minimum Memory Size Performance Test
Now we’re going to repeat the process above, but with the memory size property set to a minimum value based on the usage we just observed. The average of the maximum memory used in this test was 149MB. So, to include a safety margin, we’ll run a test using a configured memory size of 256MB.
Step 6: Configure the minimum memory size
Update the MemorySize property for the GetEvents Lambda in the vehicle.yaml template, setting it this time to 256MB. Then run the deploy.zsh script again. Once complete, confirm the value in the console, which should look like this:
Step 7: Run the load test in Postman Canary again
Now run the load test in Postman Canary again. After ten minutes you should have another set of graphed results that look something like this:
Step 8: Capture the memory usage and CPU duration metrics again
Using the same query arguments as before, capture the metric for the memory usage. Not unexpectedly, the result is the same as above, since the Lambda is processing the same requests for the same data:
Here’s an example result for the CPU duration query. This example shows that the CPU duration is slightly longer, but not by much:
Running a Larger Memory Size Performance Test
Now we’re going to repeat the process again, but with the memory size property set to a larger value. This time we’ll run a test using a memory size of 2048MB.
Step 9: Increase the configured memory size
Once more, update the MemorySize property for the GetEvents Lambda in the vehicle.yaml template, this time setting it to 2048MB. Run the deploy.zsh script, and confirm the value in the console, like this:
Step 10: Run the load test in Postman Canary again
And again, run the load test in Postman Canary. After ten minutes you should have another set of graphed results, like this:
Step 11: Capture the memory usage and CPU duration metrics again
One last time, using the same query arguments, capture the metrics for the memory usage and CPU duration. Here’s a sample capture of the memory usage metric:
And here’s an example capture for the CPU duration metric, which, interestingly, is roughly half of the durations captured previously:
Optimizing for Performance vs Cost
As we noted at the start of this lab, there are a couple of interesting wrinkles to the way Lambdas are configured and priced for memory and CPU usage. First, the CPU that’s allocated increases with the memory size, according to an undocumented series of thresholds. The important thing to know about CPU allocation is this: if your code is not multi-threaded and doesn’t need much memory, then allocating more memory than the amount that yields a single full vCPU core won’t buy you any extra performance. You’re basically allocating (and paying for) extra CPU cores that your code can’t take advantage of.
The other odd wrinkle with Lambda pricing is that you pay for the duration of your Lambda request processing, not the actual CPU time used. So, if your memory size setting results in only a fraction of a vCPU core, your code spends part of each invocation waiting for its share of the CPU, and you’re paying for that wall-clock time too. As a result, there’s also a minimum memory size below which it doesn’t make sense to go. Knowing where these thresholds lie is the challenge. There are tables of vCPU versus memory size that people have compiled from testing, but the best way to optimize your Lambda configuration, ultimately, is to run your own tests.
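If you want to script that kind of testing rather than drive it entirely through the console and Postman, a sweep along these lines is one way to start. It’s only a sketch under some assumptions: the function name is the one from this lab, event.json is a hypothetical file holding a representative API Gateway request, and for stable averages you’d still want sustained traffic rather than a handful of invocations:

```zsh
#!/bin/zsh
# Rough memory-size sweep using the AWS CLI. Changing the memory size this way
# drifts from the vehicle.yaml template; the next deploy.zsh run will reset it.
FUNCTION="ConnectedCar_Vehicle_GetEvents_Dev"

for MEM in 256 512 1024 2048; do
  aws lambda update-function-configuration \
    --function-name "$FUNCTION" --memory-size "$MEM" > /dev/null
  aws lambda wait function-updated --function-name "$FUNCTION"

  echo "--- MemorySize: ${MEM}MB ---"
  for i in {1..5}; do
    # The tail of the invocation log includes the REPORT line with Duration,
    # Billed Duration, Memory Size, and Max Memory Used.
    aws lambda invoke --function-name "$FUNCTION" \
      --payload file://event.json --cli-binary-format raw-in-base64-out \
      --log-type Tail --query 'LogResult' --output text /dev/null \
      | base64 --decode | grep REPORT    # use 'base64 -D' on older macOS
  done
done
```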
The results from the three test runs in this lab also highlight the fact that you need to know what you’re prioritizing. Is it maximum performance? Or is it low cost? Looking at the results from our three test runs helps to illustrate the possible trade-offs. For the results shown below, we’re using the duration-based pricing at the time of writing, which is US$0.0000166667 per GB-second, and assuming a monthly traffic volume of 34,560,000 requests (or 800 per minute). The arithmetic behind these figures is worked through after the tables.
Baseline memory configuration
Item | Value |
---|---|
Memory | 1.024 GB |
Average Duration | 0.0139 seconds |
Monthly GB-seconds | 491913.216 |
Monthly Cost | $8.20 |
Minimum memory configuration
Item | Value |
---|---|
Memory | 0.256 GB |
Average Duration | 0.0148 seconds |
Monthly GB-seconds | 130940.928 |
Monthly Cost | $2.18 |
Large memory configuration
Item | Value |
---|---|
Memory | 2.048 GB |
Average Duration | 0.00734 seconds |
Monthly GB-seconds | 519516.9792 |
Monthly Cost | $8.66 |
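For reference, here’s the arithmetic behind those tables, worked through in a short script: memory (GB) × average duration (seconds) × monthly requests gives monthly GB-seconds, which is then multiplied by the per-GB-second price. The printed figures should match the tables, allowing for rounding:

```zsh
awk 'BEGIN {
  price = 0.0000166667              # USD per GB-second
  requests = 34560000               # 800 requests/minute for a month
  gbs = 1.024 * 0.0139 * requests   # baseline: 1024MB at 13.9ms
  printf "Baseline: %.3f GB-seconds, $%.2f/month\n", gbs, gbs * price
  gbs = 0.256 * 0.0148 * requests   # minimum: 256MB at 14.8ms
  printf "Minimum:  %.3f GB-seconds, $%.2f/month\n", gbs, gbs * price
  gbs = 2.048 * 0.00734 * requests  # large: 2048MB at 7.34ms
  printf "Large:    %.3f GB-seconds, $%.2f/month\n", gbs, gbs * price
}'
```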
As you can see from these three sets of results, the minimum memory configuration delivers the lowest performance, with the Lambda’s average processing time at 14.8 milliseconds. But it also offers the lowest cost, at only $2.18 per month. At the other end of the spectrum, the large memory configuration processes requests roughly twice as quickly, but at almost four times the cost. The interesting result is the baseline, which essentially offers the low performance of the minimum memory configuration while costing nearly as much as the high-performance, large memory configuration. The takeaway is that you very much need to do your own testing to find the optimum memory size for your Lambdas.