Serverless Computing: Challenges with Log Monitoring & Analysis
7.25.18
Serverless computing is a relatively new trend with big implications for software companies. Teams can now deploy code directly to a platform without having to manage the underlying infrastructure, operating system, or software. While this is great for developers, it introduces significant challenges in monitoring and logging applications. This post explores the challenges of logging serverless applications, along with techniques for effective cloud logging.
What is Serverless Computing, and How is it Different?
In many ways, serverless computing is the next logical step in cloud computing. As containers have shown, splitting applications into lightweight units helps DevOps teams build and deploy code faster. Serverless takes this a step further by abstracting away the underlying infrastructure, allowing DevOps teams to deploy code without having to configure the environment it runs on. Unlike with containers or virtual machines, the platform provider manages the environment, provisions resources, and starts or stops the application as needed.

For this post, we'll focus on a type of serverless computing called Functions as a Service (FaaS). In FaaS, applications consist of individual functions that perform specific tasks. Like containers, functions are independent, quick to start and stop, and can run in a distributed environment. Functions can be replaced, scaled, or removed without impacting the rest of the application. And unlike a virtual machine or even a container, functions don't have to be running to respond to a request. In many ways, they behave more like event handlers than a continuously running service. A minimal example of such a function appears after the list below.

With so many moving pieces, how do you log information in a meaningful way? The challenges include:
- Collecting logs
- Contextualizing logs
- Centralizing logs
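Before looking at each of these, it helps to see what a FaaS function actually looks like. The following is a minimal sketch in the style of an AWS Lambda Python handler; the handler name and the "name" event field are illustrative assumptions rather than anything the platform requires.

import json

def handler(event, context):
    # The platform calls this function in response to an event (for example,
    # an HTTP request routed through a gateway); it runs only for that event.
    name = event.get("name", "world")  # "name" is a hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello, " + name})
    }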
Collecting Logs
Serverless platforms such as AWS Lambda and Google Cloud Functions provide two types of logs: request logs and application logs.

Request logs contain information about each request that accesses your serverless application. They can also contain information about the platform itself, such as execution time and resource usage. In App Engine, Google's earliest serverless product, this includes information about the client who initiated the request as well as information about the function itself. Request logs also contain unique identifiers for the function instance that handled the request, which is important for adding context to logs.

Application logs are those generated by your application code. Any messages written to stdout or stderr are automatically collected by the serverless computing platform. Depending on your platform, these messages are then streamed to a logging service where you can store them or forward them to another service.

Although it may seem unnecessary to collect both types of logs, doing so will give you a complete view of your application. Request logs provide a high-level view of each request over the course of its lifetime, while application logs provide a low-level view of the request during each function invocation. This makes it easier to trace events and troubleshoot problems across your application.
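As a quick illustration of application logs, the sketch below writes messages from a Lambda-style Python handler; both the standard logging module and plain print() end up on stdout/stderr, where the platform collects them. The handler name and log messages are illustrative assumptions.

import logging

# Output sent to stdout/stderr is captured by the platform and forwarded
# to its logging service as application logs.
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    logger.info("Received event: %s", event)
    print("print() output is also collected as an application log")
    return {"status": "ok"}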
Contextualizing Events
Context is especially important for serverless applications. Because each function is its own independent unit, logging just from the application won't give you all the information necessary to resolve a problem. For example, if a single request spans multiple functions, filtering through logs to find messages related to that request can quickly become difficult and cumbersome.

Request logs already store unique identifiers for new requests, but application logs likely won't contain this information. Lambda allows functions to access information about the platform at runtime using a context object. This object lets you access information such as the current process instance and the request that invoked it directly from your code.

For example, this script uses the context object to include the current request ID in a log:

import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def my_function(event, context):
    # context.aws_request_id uniquely identifies this invocation; a log
    # formatter that references %(request_id)s will include it in the output
    logger.info("Function invoked.",
                extra={"request_id": context.aws_request_id})

In addition, gateway services such as Amazon API Gateway often create separate request IDs for each request that enters the platform. This lets you correlate log messages not only by function instance, but across the entire call to your application, which makes it much easier to trace requests that involve multiple functions or even other platform services.
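As a rough sketch of that idea: when a function is invoked through an API Gateway proxy integration, the gateway's request ID is typically available in the incoming event under requestContext.requestId. Treat that field path as an assumption and verify it against your own event payloads; the handler name is illustrative.

import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def my_function(event, context):
    # Gateway-level request ID (assumption: API Gateway proxy event format)
    gateway_request_id = event.get("requestContext", {}).get("requestId", "unknown")
    # Logging both IDs in the message keeps them visible regardless of formatter
    logger.info(
        "Function invoked. aws_request_id=%s gateway_request_id=%s",
        context.aws_request_id,   # Lambda's own invocation ID
        gateway_request_id,       # ID shared across the whole call
    )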
Centralizing Logs
The decentralized nature of serverless applications makes collecting and centralizing logs all the more important. Centralized log management makes it easier to aggregate, parse, filter, and sort through log data. However, serverless applications have two very important limitations:
- Functions generally have read-only filesystems, making it impossible to store or even cache logs locally
- Using a logging framework to send logs over the network could add latency and incur additional usage charges
Because of this, many serverless computing platforms automatically ingest logs and forward them to a logging service on the same platform. Lambda and Cloud Functions both detect logs written to stdout and stderr and forward them to AWS CloudWatch Logs and Stackdriver Logging respectively, without any additional configuration. You can then stream these logs to another service such as LogDNA for more advanced searching, filtering, alerting, and graphing. This eliminates the need for complex logging configurations or separate log forwarding services.
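One common way to stream Lambda logs out of CloudWatch is a subscription filter that forwards a log group to another destination, such as a log-forwarding function provided by your log management service. The sketch below creates one with boto3; the log group name, filter name, and destination ARN are placeholders, and this is only one of several ways to wire up forwarding.

import boto3

logs = boto3.client("logs")

# Forward every log event from a function's log group to a destination
# (for example, a forwarder Lambda managed by your log service).
# All names and ARNs below are placeholders.
logs.put_subscription_filter(
    logGroupName="/aws/lambda/my-function",
    filterName="forward-to-log-management",
    filterPattern="",  # an empty pattern matches every log event
    destinationArn="arn:aws:lambda:us-east-1:123456789012:function:log-forwarder",
)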
Conclusion
Although serverless computing is very different from the models that preceded it, most of our current logging best practices still apply. The main difference is in how logs are collected and the information they contain about the underlying platform. As the technology matures, we will likely see new best practices emerge for logging serverless applications.