Debugging Services with CloudWatch Logs
This lab references the scripts in the aws-connectedcar-common repository. If you're new to this course, see the introduction for information about setting up your workstation and getting the sample code.
The labs in this section cover the use of the three core CloudWatch services. In this first lab we’ll work with the CloudWatch Logs service, showing how you can view log entries in the console, and how to query and tail logs from the command line.
Note that we’re continuing with the code deployment and Postman setup that we used in the labs in the previous section. If you tore down that deployment, then circle back to the first lab in the CloudFormation section and run through steps 1 to 6 to redeploy the sample code. Then, from the second lab, run through steps 1 to 3 again to configure Postman.
Setting a CloudWatch Logs Service Role for API Gateway
By default, API Gateway doesn't have permission to write to CloudWatch Logs. So our first step in this lab is to create a role that the service can assume to get these permissions, and then apply this role at the account level.
Step 1: Create and apply a service role for API Gateway in the console
First, open the AWS console and navigate to the IAM service, then select the “Roles” option on the left. You should see a page like the one shown below that lists all the available IAM roles for your account:
On this page, click the "Create role" button at the top left. You'll be taken to the page shown below:
On this page, select "AWS Service" as the trusted entity type, then select "API Gateway" in the drop-down for the service. Click the "Next" button at the bottom of the page, which will direct you to the "Add Permissions" page, shown below:
By default this page will show a compatible built-in access policy, which is the "AmazonAPIGatewayPushToCloudWatchLogs" policy. This is the policy that you want, so you can click "Next" again. This will take you to the page shown below, where you can enter a name for the new role:
Once named, you can save this new role. Then, from the IAM roles page, click to select this new role.
On the properties page for this role, copy the ARN that's shown at the top right. Then navigate to the API Gateway service and select the "Settings" option at the bottom left. Note that this option is only shown when there is a selected API, even though these are account-level settings.
The settings page should show that there's no CloudWatch log role ARN applied. Click the "Edit" button at the top right.
Now paste the ARN for the new service role you previously created, and save the changes. You will now have a service role assigned to API Gateway that grants permissions for logging.
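For reference, the same role setup can be sketched with the AWS CLI. This is a minimal sketch rather than part of the course scripts: the role name and the trust policy file path are hypothetical placeholders, and the functions assume AWS CLI credentials with IAM and API Gateway permissions. The block only defines functions, so nothing is created until you call apply_logging_role yourself.

```shell
# Sketch only: the role name and file path are hypothetical placeholders.
ROLE_NAME="${ROLE_NAME:-ApiGatewayCloudWatchLogsRole}"
TRUST_FILE="${TRUST_FILE:-./apigateway-trust-policy.json}"

# Write a trust policy that lets the API Gateway service assume the role.
write_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "apigateway.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role, attach the managed logging policy, then set the role
# ARN in the account-level API Gateway settings.
apply_logging_role() {
  write_trust_policy "$TRUST_FILE" || return 1
  role_arn=$(aws iam create-role \
    --role-name "$ROLE_NAME" \
    --assume-role-policy-document "file://$TRUST_FILE" \
    --query 'Role.Arn' --output text) || return 1
  aws iam attach-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs" || return 1
  aws apigateway update-account \
    --patch-operations "op=replace,path=/cloudwatchRoleArn,value=$role_arn"
}
```

If the final update-account call is rejected right after the role is created, waiting a few seconds and retrying may help, since new IAM roles can take a moment to become visible to other services.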
Checking the CloudWatch Configuration for the Admin API
Next, let's check the CloudWatch configuration for the Admin API that we've seen in the CloudFormation templates.
Step 2: Confirm that logging, metrics and tracing are all enabled for the stage
Select the “Stages” option on the left, then select the “api” stage followed by the “Logs/Tracing” tab. You should see the “CloudWatch Logs” dropdown set to “Full Request and Response logs”, and the “EnableDetailedCloudWatchMetrics” checkbox set to true. The “Enable X-Ray Tracing” checkbox should also be selected:
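The same stage settings can also be checked from the command line. The sketch below assumes you already have the Admin API ID (the function name and its arguments are illustrative, not part of the course scripts) and that the stage is named "api" as above. In the output, the entries under methodSettings should include the loggingLevel, dataTraceEnabled and metricsEnabled values, and tracingEnabled reflects the X-Ray setting.

```shell
# Sketch only: show the logging, metrics and tracing settings for a stage.
# $1 = REST API ID, $2 = stage name (defaults to "api").
show_stage_settings() {
  aws apigateway get-stage \
    --rest-api-id "$1" \
    --stage-name "${2:-api}" \
    --query '{tracingEnabled: tracingEnabled, methodSettings: methodSettings}' \
    --output json
}
```

A call would look like show_stage_settings followed by your API ID.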
Generating Sample Logs Data
Now you’re ready to generate some sample logs data. You’ll use Postman to send a valid “Create Dealer” request, followed by an invalid “Create Dealer” request, and then a couple of invalid “Get Dealers” requests. These invalid requests will generate sample entries in the CloudWatch logs for API Gateway, as well as the second Lambda, that we can then search for.
Step 3: Send a valid Create Dealer request from Postman
Open Postman, select your workspace, open the “Admin_API” folder, then the “Dealers” folder, and then select the “Create Dealer” test. Lastly, select the “Body” tab. You should see the JSON-formatted request body, like this:
Click the “Send” button at the top-right. Following this, you should get a “201 Created” response, as shown below:
Step 4: Send an invalid Create Dealer request from Postman
Staying with the same “Create Dealer” test, remove the “streetAddress” field on line 4 of the request body, and send the request again. This time, you should see a “400 Bad Request” response, like you see below:
Once you’ve sent this request, press Ctrl-Z or Cmd-Z to undo the deletion of the “streetAddress”: “123 Main Street” field from this test.
Step 5: Send two invalid Get Dealers requests from Postman
Next, we’ll try sending some more invalid requests. We’ll do this with the “Get Dealers” test in Postman. First, send a valid request, which should return a “200 OK” response, as shown below:
Now, change the “stateCode” query parameter value in the request URL to something that’s invalid, such as “XX”. Send the request again, which should result in another “400 Bad Request” result, like this:
Finally, remove the “stateCode” query parameter entirely, and send the request one more time. As before, you should see a “400 Bad Request” response, as you can see below:
Before we look at the logs data from these requests, take a moment to set the Get Dealers request URL back to its original value, by adding back the “?stateCode=OR” query parameter.
Viewing Logs Data in the Console
Now, let’s view the error data in the logs that we’ve generated, starting with the Console.
Step 6: View API Gateway logs in the console
First, find the API ID for the Admin API, either from the console, or by using the query-outputs.zsh script we worked with in the previous section. Once you have this value, go to the CloudWatch service in the console, open the “Logs” menu on the left, and select the “Log groups” option. Then paste the API ID into the “Filter log groups” text box. You should then see the log group for the API in the results, as shown below:
Click the log group, scroll down the page slightly, and you should see one or more log streams. You can drill down into these streams, and from there into individual log entries. Here’s an example log stream for the above log group with the default “Display” option. As you can see, at the logging level that we have configured, there are a lot of discrete log entries for API Gateway:
If you click the “Display” drop-down at the top-right and select “View in plain text”, you’ll see the same log stream in a slightly more readable format:
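As background for Step 6, API Gateway execution logs are written to a log group whose name is built from the API ID and the stage name, which is why filtering on the API ID finds it. The sketch below builds that name and lists matching groups from the command line. The function names are illustrative, and the API ID in the usage note is a made-up placeholder.

```shell
# Build the execution log group name for a REST API stage.
# $1 = REST API ID, $2 = stage name (defaults to "api").
api_log_group() {
  printf 'API-Gateway-Execution-Logs_%s/%s\n' "$1" "${2:-api}"
}

# List the execution log groups that exist for a given REST API ID.
list_api_log_groups() {
  aws logs describe-log-groups \
    --log-group-name-prefix "API-Gateway-Execution-Logs_$1" \
    --query 'logGroups[].logGroupName' \
    --output text
}
```

For example, api_log_group a1b2c3d4e5 api prints API-Gateway-Execution-Logs_a1b2c3d4e5/api.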
Step 7: Search for text across the log streams of a log group in the console
The log events page shown above has a “Filter events” text field at the top. This lets you search the logs, but only for the selected log stream. A better technique is to search for a text phrase across all the log streams for a group. To do this, navigate back to the page for the log group, then click the orange “Search log group” button at the top-right. You’ll be directed to a search page, where you can enter your text phrase. Since the results are not necessarily sorted by time, it sometimes helps to also narrow the time span of the search.
Here’s a search that captures two “Create Dealer” request bodies from the previous steps. The search function finds these request bodies by filtering log entries that contain the “request body before transformations” phrase:
In the two JSON request bodies shown above, you can quickly determine that the second request is missing the “streetAddress” field. This shows that when you have API Gateway logging configured to record full requests and responses, the CloudWatch logs can be a helpful way to see the actual data that’s sent to an endpoint by a client application (at least in a non-prod environment where you’re not worried about logging sensitive data).
Hint: the reason we’re searching for the phrase “request body before transformations” is that API Gateway transforms these external REST API requests into internal Lambda invocation requests before invoking the target Lambda. You can also search for “request body after transformations” to see what the Lambda invocation requests look like.
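The same phrase search can be sketched from the command line. In CloudWatch Logs filter pattern syntax, a multi-word phrase needs to be wrapped in double quotes to match as a single phrase, otherwise each word is treated as a separate term. The function names here are illustrative and not part of the course scripts.

```shell
# Wrap a phrase in double quotes so CloudWatch Logs matches it as one phrase.
phrase_pattern() {
  printf '"%s"' "$1"
}

# Search every stream in a log group for a phrase.
# $1 = log group name, $2 = phrase to search for.
search_log_group() {
  aws logs filter-log-events \
    --log-group-name "$1" \
    --filter-pattern "$(phrase_pattern "$2")" \
    --query 'events[].message' \
    --output text
}
```

Calling search_log_group with the API Gateway log group name and the phrase request body before transformations should return the same kind of entries as the console search.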
Step 8: Search for a Lambda exception stack trace in the console
In Step 5, we deliberately sent a couple of invalid requests using the “Get Dealers” test. Both of these invalid requests will have resulted in exceptions being thrown in the GetDealers Lambda for the Admin API, then caught and logged. Let’s search for these log entries.
This time, back on the main “Log groups” page, search for “GetDealers”. This should return a single log group, like this:
Next, click on the log group, and then click on the orange “Search log group” button. Enter “stack trace” in the search field and press enter. Then click the “Display” button and select “View in plain text”. You should see results that look something like this:
The stack traces in these log entries are truncated, but there’s enough information here to quickly debug the problems. If you go back to VS Code and open the BaseRequestFunctions class, you’ll see what’s on line 143 from the first log entry:
Viewing Logs Data from the Command Line
The console works fine for ad-hoc log searches. If you're developing a new service, however, and you need to repeatedly dig into specific log groups to see what’s going on, you'll probably find the command line option to be more effective.
Step 9: Run the get-logs.zsh script in the terminal
What you see below is the get-logs.zsh script from the CloudWatch sample scripts folder. It uses techniques that we covered in the previous section to query the stack outputs for the Admin API ID, and then constructs the arguments needed for the “logs filter-log-events” command. The other thing to point out is that the date command used on line 5 doesn’t return milliseconds on macOS, while the “--start-time” argument expects milliseconds since the epoch. Hence the addition of the three zeros in the “--start-time” argument on line 17.
The AWS documentation for this command can be found here: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/logs/filter-log-events.html
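To make the milliseconds point concrete, here is a minimal sketch of the timestamp arithmetic. It is not the get-logs.zsh script itself, and the function names are illustrative. It takes the whole-second epoch time from date, subtracts the look-back period, and multiplies by 1000, which has the same effect as appending three zeros.

```shell
# Print the epoch time in milliseconds for N seconds ago.
# `date +%s` returns whole seconds, so multiply by 1000.
# $1 = look-back period in seconds (defaults to one day).
start_time_ms() {
  secs=${1:-86400}
  now=$(date +%s)
  echo $(( (now - secs) * 1000 ))
}

# Fetch recent events from a log group.
# $1 = log group name, $2 = look-back period in seconds (defaults to one day).
recent_log_events() {
  aws logs filter-log-events \
    --log-group-name "$1" \
    --start-time "$(start_time_ms "${2:-86400}")" \
    --query 'events[].message' \
    --output text
}
```

Doing the arithmetic in the shell avoids relying on date format options that differ between macOS and Linux.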
For comparison, search the API Gateway log group in the console with a time range that returns the results you want. Here’s an example, with a one-day time range to match the script, which is returning four log entries:
Now run the get-logs.zsh script, and you should see the same results as in the console, as shown below:
Step 10: Run the tail-logs.zsh script in the terminal
The last AWS logs-related topic to cover here is the “logs tail” command. To demonstrate, here’s the tail-logs.zsh script, also from the CloudWatch sample scripts folder:
This script is written to tail the CloudWatch log group for the CreateDealer Lambda, filtering for the “stack trace” phrase. To see this in action, open a terminal window and run the script. Next, open the “Create Dealer” request again in the “Admin_API” collection in Postman, and once more, remove the “streetAddress” field. After sending the request, the terminal will almost immediately display the stack trace for the exception in the Lambda.
Here’s an example terminal output, after the invalid request was submitted:
Note that, unusually, there doesn’t appear to be an equivalent PowerShell command for this. As a result there isn’t a corresponding tail-logs.ps1 script.
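The tailing described in Step 10 can be sketched as follows. This is not the tail-logs.zsh script itself. It assumes AWS CLI version 2, that the Lambda writes to the default /aws/lambda/ log group for its function name, and that you pass in the deployed function name yourself, since the name generated by the deployment isn't shown here.

```shell
# Build the default log group name for a Lambda function.
lambda_log_group() {
  printf '/aws/lambda/%s\n' "$1"
}

# Follow a Lambda's log group, showing only entries that contain a phrase.
# $1 = function name, $2 = phrase (defaults to "stack trace"),
# $3 = how far back to start (defaults to 10m).
tail_lambda_logs() {
  aws logs tail "$(lambda_log_group "$1")" \
    --follow \
    --since "${3:-10m}" \
    --filter-pattern "\"${2:-stack trace}\""
}
```

Press Ctrl-C to stop following the logs.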