Minimizing Cold Starts with Provisioned Concurrency
This lab references the code in the aws-connectedcar-dotnet-serverless repository. If you're new to this course, see the introduction for information about setting up your workstation and getting the sample code.
One of the solutions that AWS provides for the cold start problem is what it calls “provisioned concurrency”. With this feature, you can keep a specified number of pre-provisioned and pre-initialized instances ready for each of your Lambdas. Let’s see how this feature performs for the .NET version of the sample code.
Updating the Lambda Code & Configuration
Step 1: Add the ProvisionedConcurrencyConfig property to the GetDealers resource
To specify the number of instances you want pre-provisioned, you simply add the elements shown below to your target Lambda resource. For our test here, you can set the ProvisionedConcurrentExecutions to a value of one:
ProvisionedConcurrencyConfig:
  ProvisionedConcurrentExecutions: 1
You can add these elements to the bottom of the GetDealers Lambda resource in the admin.yaml template, as shown below:
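For orientation, here is a minimal sketch of how the resource might look once the property is in place. Only the GetDealers resource name and the ProvisionedConcurrencyConfig block come from this lab. The handler string, runtime, and alias name are placeholders assumed for illustration, so keep whatever values your copy of admin.yaml already has. Note that SAM only accepts ProvisionedConcurrencyConfig on a function that also sets AutoPublishAlias, because provisioned concurrency is attached to an alias or version rather than to the unpublished function.

```yaml
# Sketch only: everything except ProvisionedConcurrencyConfig is a placeholder,
# not copied from the course's admin.yaml.
GetDealers:
  Type: AWS::Serverless::Function
  Properties:
    Handler: <assembly>::<namespace>.AdminFunctions::GetDealers  # hypothetical handler string
    Runtime: dotnet8               # assumption; use the runtime already in your template
    AutoPublishAlias: live         # hypothetical alias name; an alias is required
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 1
```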
Step 2: Add a service call to DynamoDB in the Lambda default constructor
With the provisioned concurrency feature, Lambda instances are not only provisioned but also initialized when they’re deployed. You therefore want to add a service call to a static initializer or the default constructor so that all the profiling, JIT compilation, and DynamoDB client initialization happens at this stage. As we saw in the previous lab, the initialization step can take as much time as the provisioning step, if not more.
To make this happen, add a call to the dealer service in the default constructor, as shown below on lines 16-18 of the AdminFunctions.cs file:
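The pattern can be sketched in isolation as follows. This is not the course's AdminFunctions.cs: the IDealerService interface, its GetDealers signature, the stub implementation, and the "ZZ" argument are all hypothetical stand-ins so that the example compiles on its own. The point it illustrates is that a blocking service call made from the constructor runs during the initialization phase, which provisioned concurrency performs ahead of the first request.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the course's dealer service interface.
public interface IDealerService
{
    Task<IReadOnlyList<string>> GetDealers(string stateCode);
}

// Hypothetical stub so the sketch compiles on its own; the real service
// would query DynamoDB here.
public class StubDealerService : IDealerService
{
    public int Calls { get; private set; }

    public Task<IReadOnlyList<string>> GetDealers(string stateCode)
    {
        Calls++;
        IReadOnlyList<string> dealers = new List<string>();
        return Task.FromResult(dealers);
    }
}

public class AdminFunctions
{
    private readonly IDealerService _dealerService;

    // Default constructor: the Lambda runtime calls this during initialization.
    // The stub is used here only to keep the sketch self-contained.
    public AdminFunctions() : this(new StubDealerService()) { }

    public AdminFunctions(IDealerService dealerService)
    {
        _dealerService = dealerService;
        WarmUp();
    }

    private void WarmUp()
    {
        try
        {
            // Block until the call completes so that JIT compilation and
            // client setup happen during initialization, not on a request.
            _dealerService.GetDealers("ZZ").GetAwaiter().GetResult();
        }
        catch (Exception ex)
        {
            // A failed warm-up should not stop the function from loading.
            Console.WriteLine($"Warm-up call failed: {ex.Message}");
        }
    }
}
```

Wrapping the warm-up in a try/catch is a design choice rather than a requirement: it keeps a transient DynamoDB error during initialization from failing the whole instance, at the cost of that instance serving its first request without the benefit of the warm-up.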
Step 3: Deploy the code and configuration updates
With these code and configuration updates in place, run the deploy.zsh script again. Once deployed, you should see the updated value under the “Provisioned concurrency” column, associated with the alias, in the console as shown below:
Testing the Lambda with Provisioned Concurrency
Step 4: Send two requests for the Get Dealers Lambda in Postman
As you did in the previous lab, send two consecutive “Get Dealers” requests from Postman. As you can see below, the cold start duration measured in Postman is now down to less than 400ms:
As you would expect, the second request is in line with warm requests seen previously:
Step 5: Review the X-Ray traces for the two Get Dealers requests
Open the trace for the first request. As you can see, there’s still a slight delay compared to a fully warmed up Lambda, which may be the result of a first JIT compile pass versus subsequent passes, or operation-specific caching of DynamoDB metadata:
The second request matches up with previously seen warm requests:
Summarizing the Results
The good news is that this feature nearly eliminates cold starts for the configured Lambdas. The bad news is that it does so only by provisioning capacity, and avoiding provisioned capacity is one of the benefits of going serverless. As this is written, the duration charge with provisioned concurrency is about half that of un-provisioned Lambdas, but you also pay about $10 per GB-month for the provisioned capacity itself, whether or not it’s used. This extra cost adds up when applied at higher concurrency levels across a large number of multi-GB Lambdas. For example, at that rate, 20 Lambdas of 2 GB each with a provisioned concurrency of five would add roughly 20 × 2 × 5 × $10 = $2,000 per month.
Of course, you can still stand up nearly free low-volume environments using Lambdas, and then tactically apply this feature only where needed for production. Note as well that the setting used in this lab is a fixed number: on its own, it doesn’t scale with load or follow a schedule. AWS documents support for adjusting provisioned concurrency through Application Auto Scaling, but that takes additional configuration that isn’t covered here. Without it, you have to choose a number of provisioned instances that works for all times and conditions.