Logging vs. APM
- Understand the purpose of APM
- Explore the reasons for logging
- Discover why APM and logging are important
Application monitoring is a must for any organization that relies on its technology for revenue. Application performance monitoring (APM) and logging are related, but they are two different functions. Logging is a part of APM, but logging for general monitoring is a separate function from APM. Monitoring lets you know what’s going on with the application, and logs are the backend elements that support monitoring and metrics. Monitoring gives you analytical information on the health of your system, while logging provides the data needed to analyze that health, including possible compromises, performance, uptime, error handling, and many other metrics critical to user experience and application performance.
What is APM?
Performance is a huge factor in a good user experience. User experience is largely responsible for customer satisfaction, acquisition, and retention. With so many options on the web, customers can simply move on to another site with better performance if your site is not up to par. Because of this potential customer loss, it’s common for organizations to rely heavily on application performance monitoring to determine if something needs to be done with code or server resources.
For administrators responsible for the reliability and performance of a system, pinpointing why a server runs slowly can be much more difficult than debugging an error is for a developer. Slowness can have many causes. The server might need more resources to support traffic. The web application could be the target of a coordinated distributed denial-of-service (DDoS) attack, or the cause could be an unoptimized database query or code function. Applications that use APIs could suffer from performance degradation if the API is slow. The issue could even stem from poor internal network performance and bandwidth exhaustion. Because so many causes are unrelated to the application itself, root-cause analysis of slowness is difficult. This is why monitoring performance helps with mitigation and remediation of performance issues.
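One small way to narrow down a slow code path is to time individual calls and log the ones that exceed a threshold. The following is a minimal sketch, not how any particular APM product works; the logger name `apm_sketch`, the `timed` decorator, the 50 ms threshold, and `run_report_query` are all hypothetical names chosen for illustration.

```python
import functools
import logging
import time

# Hypothetical logger name used only for this sketch.
logger = logging.getLogger("apm_sketch")


def timed(threshold_seconds):
    """Log a warning when the wrapped call runs longer than threshold_seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Measure even when the call raises, so failures are timed too.
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    logger.warning(
                        "slow call: %s took %.3fs", func.__name__, elapsed
                    )
        return wrapper
    return decorator


@timed(threshold_seconds=0.05)
def run_report_query():
    """Stand-in for an unoptimized database query (assumed, not a real query)."""
    time.sleep(0.1)
    return "done"
```

Calling `run_report_query()` returns its normal result and also emits a warning naming the slow function, which a monitoring tool could then count or alert on.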
What is logging and log management?
To determine the performance health of your application, you need logs and a log management solution that collects information from the various servers and services that host and support the application. The health of your application is more than just performance, which is why logging is critical to monitoring every aspect of your application.
Logging involves writing an event to a file or database to create a record of application or server activity. Some examples of events are failed authentication attempts, errors thrown by the application, timeouts, spikes in server resources, changed server environment variables, or general information about the application such as POST or GET requests. These examples are just a few logged events that could be contained in files.
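As a rough illustration of writing such events, here is a minimal sketch using Python's standard `logging` module. The logger name `app_events`, the helper `build_logger`, and the message fields are assumptions made for this example. An in-memory buffer stands in for a log file; in practice a `logging.FileHandler` would usually take its place.

```python
import io
import logging


def build_logger(stream):
    """Return a logger that writes timestamped events to the given stream."""
    logger = logging.getLogger("app_events")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep events out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# In-memory stand-in for a log file on disk.
buffer = io.StringIO()
log = build_logger(buffer)

# Example events of the kinds described above (all values are made up).
log.info("request method=GET path=/products status=200")
log.warning("auth_failed user_id=1842 reason=bad_password")
log.error("timeout upstream=inventory-api after_ms=5000")
```

Each call appends one line with a timestamp, a severity level, and the event text, which is the raw material a log management tool later parses.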
Most organizations have more than one application and network resource, so there could be dozens of log files from dozens of different applications and hardware devices. To perform analytics, you need a log management solution that aggregates these files and centralizes them so that analytics software can read and process them in one place. For example, a security information and event management (SIEM) application collects these logs and analyzes them for any anomalies so that security analysts can identify potential threats or ongoing attacks.
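The aggregation step can be pictured as merging several time-ordered logs into one timeline. The sketch below assumes each line starts with an ISO 8601 timestamp followed by a space, and that each source is already sorted by time; the function names `parse_line` and `merge_logs` and the sample data are hypothetical, and real log management products do considerably more than this.

```python
import heapq
from datetime import datetime


def parse_line(source, line):
    """Split 'TIMESTAMP message' into (datetime, source, message)."""
    timestamp_text, message = line.split(" ", 1)
    return datetime.fromisoformat(timestamp_text), source, message


def merge_logs(sources):
    """Merge per-source lists of lines into one list ordered by timestamp.

    Assumes the lines within each source are already in time order.
    """
    streams = [
        [parse_line(name, line) for line in lines]
        for name, lines in sources.items()
    ]
    return list(heapq.merge(*streams))


# Made-up sample data from two hypothetical sources.
sources = {
    "web": [
        "2024-05-01T10:00:00 GET /cart 200",
        "2024-05-01T10:00:05 GET /checkout 500",
    ],
    "db": [
        "2024-05-01T10:00:03 slow query took 4200ms",
    ],
}
merged = merge_logs(sources)
```

In the merged timeline, the slow database query sits between the two web requests, which makes it easier to see that it preceded the failed checkout.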
Security and organization of logs are components in good log management. The data contained in log files can be a wealth of information for attackers. It’s also important to note that logs should not contain sensitive information such as passwords or personally identifiable information (PII). They should log a unique identifier that represents the customer, but never contain sensitive data that could be used in a data breach.
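One way to reduce the risk of sensitive values reaching a log file is to scrub messages before they are written. The following is a minimal sketch using a `logging.Filter`; the regular expressions, the `redact` helper, and the `RedactingFilter` class are illustrative assumptions and will not catch every form of PII.

```python
import logging
import re

# Illustrative patterns only; real PII detection needs broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PASSWORD = re.compile(r"(password=)\S+", re.IGNORECASE)


def redact(message):
    """Mask password values and email addresses in a log message."""
    message = PASSWORD.sub(r"\1[REDACTED]", message)
    return EMAIL.sub("[REDACTED_EMAIL]", message)


class RedactingFilter(logging.Filter):
    """Rewrite each record's message so sensitive values are masked."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already fully formatted
        return True       # keep the record, just with a cleaned message
```

A filter like this would typically be attached to the handler that writes the log (for example with `handler.addFilter(RedactingFilter())`), while the customer is still identified by a non-sensitive value such as `customer_id=1842`.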
Logs do more than just give you current information about your application. They also provide a historical archive for audits and investigations. Several compliance regulations require an audit trail of actions taken within the application, particularly when private data is accessed. For example, HIPAA regulations require that any user access to a patient record be logged as part of an audit trail on protected health information (PHI).
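An audit trail entry usually answers who did what to which record, and when. The sketch below builds one such entry as a JSON line. The function `audit_record` and all field names are hypothetical, and this is only an illustration of the idea, not a statement of what HIPAA or any other regulation specifically requires.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id, action, record_id):
    """Build one JSON audit line; every field name here is illustrative."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,      # who performed the action
        "action": action,        # what they did, e.g. "read"
        "record_id": record_id,  # which record was touched
    })


# Made-up identifiers; note the entry holds IDs, not the record's contents.
line = audit_record(user_id="u-1842", action="read", record_id="patient-77")
```

Storing identifiers rather than the data itself keeps the audit trail useful for investigations without turning it into another copy of the private data.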
Why are APM and logging important?
Most software development lifecycles (SDLC) have a testing phase where the application is deployed to a staging environment. Automated tests and human reviewers work on the staging environment to find any anomalies in the application’s performance and user experience. However, these tests often don’t fully represent a heavy-traffic production environment. Logging for the purpose of monitoring performance is necessary so that any issues can be detected quickly and remediated before they affect revenue.
Without APM, the application could be slow without the developer’s knowledge. For example, maintainers of applications with a worldwide user base could be unaware of performance issues experienced by users in a geographic location distant from the developers, while local developers experience no latency at all. APM tells developers that users experience slowness in these geolocations so that they can remediate the issue.
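To make the geographic case concrete, the sketch below groups response-time samples by region and reports the regions whose median latency exceeds a threshold. The function `slow_regions`, the region names, the latency values, and the 300 ms threshold are all made up for illustration; APM tools typically use richer statistics such as percentiles.

```python
from collections import defaultdict
from statistics import median


def slow_regions(samples, threshold_ms):
    """Return {region: median latency} for regions above threshold_ms.

    samples is an iterable of (region, latency_ms) pairs.
    """
    by_region = defaultdict(list)
    for region, latency_ms in samples:
        by_region[region].append(latency_ms)

    medians = {region: median(values) for region, values in by_region.items()}
    return {
        region: value
        for region, value in medians.items()
        if value > threshold_ms
    }


# Hypothetical measurements: (region, response time in milliseconds).
samples = [
    ("us-east", 80), ("us-east", 95), ("us-east", 110),
    ("ap-south", 640), ("ap-south", 720), ("ap-south", 910),
]
flagged = slow_regions(samples, threshold_ms=300)
```

With this sample data only the distant region is flagged, which is exactly the situation local developers would not notice from their own experience of the application.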
The logs themselves are also important for several reasons. They are the foundation for monitoring and identifying anomalies. They can be used in an investigation during incident response, but more importantly they are necessary for compliance. Any organization that must adhere to specific compliance requirements needs a logging solution that provides information during an investigation. Using a single logging solution provides an easy way for organizations to stay compliant and avoid hefty fines for non-compliance.
Here are several reasons why APM and logging are important:
- Identify if your server requires additional resources to handle application traffic.
- Identify external issues that could be causing performance degradation such as poorly optimized SQL queries or network latency.
- Determine if specific geolocated users experience slowness.
- Identify peak seasonal traffic to scale resources dynamically to handle busy times.
- Stay compliant with audit trails on data access.
- Use events to find application errors for remediation.
- Track server resource usage and spikes that could cause performance degradation.
- Track user activity including authentication requests and behavior patterns.