What Is a Tail Log?

Learning Objectives

• Understand what a tail log is

• Learn the benefits of a tail log

• Learn when to use tail log backups

• Understand how to make the most of tail log backups

If you're a responsible and diligent engineer, you back up your databases regularly.

But what happens if you need to restore a database, but your most recent backup for that database no longer reflects the current database state? Do you have to accept some data loss and move on?

Not necessarily. With the help of tail log backups, you can potentially restore data to its most current state, even if your latest backups don't reflect that state. This article explains how tail logs and tail log backups work, and why you should incorporate them into your logging and backup strategy.

What Is a Tail Log?

A tail log is a log that contains the most recent entries from a log file.

There's no specific definition of how recent that data needs to be, or how many events or transactions a tail log needs to include. Instead, think of tail logs as a high-level concept that centers on the latest events in a log.

Note, too, that tail logs are different from log tailing. The latter refers to pulling the most recent entries from a log file using a tool like the Linux CLI utility "tail." Engineers often use log tailing as a quick way of viewing the most recent activity on a system. Tail logs contain this same data, but simply running "tail [your-log-file]" in the CLI doesn't produce a tail log.
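To make the idea of log tailing concrete, here is a minimal Python sketch that keeps only the last few lines of a log, roughly what "tail -n" does. The function name tail_lines and the sample events are hypothetical and exist only for illustration. As with the CLI utility, the result is a view of recent entries, not a tail log backup.

```python
from collections import deque


def tail_lines(lines, count=10):
    """Return the last `count` lines from an iterable of log lines."""
    # A bounded deque discards older lines as newer ones arrive,
    # so only the newest `count` lines are held in memory.
    return [line.rstrip("\n") for line in deque(lines, maxlen=count)]


# Hypothetical sample data standing in for a real log file.
sample = [f"event {n}\n" for n in range(1, 26)]
print(tail_lines(sample, 3))  # ['event 23', 'event 24', 'event 25']
```

Because an open file is an iterable of lines, the same function can be called as tail_lines(open_file_handle, 20) on a real log file.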

The Benefits of Tail Logs

The main reason tail logs are helpful is that you can use them to reconstruct what has happened in a system since the last backup.

By tracing the series of events recorded in a tail log, you can make an equivalent set of changes to your backup data. The result is that the updated data will reflect the most current state of the system (or, at least, the most recent state recorded in your log files) rather than the state captured in the most recent backup.

Tail Log Backup Example

For example, imagine that you have a database last backed up at 3 p.m. At 3:20 p.m., your database server goes down suddenly, but the log files remain available because you were streaming the database logs to an external log collector (like Mezmo, formerly LogDNA).

If you restored the database from the most recent backup alone, you'd permanently lose 20 minutes' worth of data, which could create significant problems for your business or users. You could lose crucial financial transaction records, changes to user account data, and so on.

An alternative approach is to use tail logs to reconstruct the state of the database at the moment it failed. To do this, you'd take the database backup from 3 p.m., then use the log files to determine which transactions took place on the database between 3 p.m. and 3:20 p.m. By applying those transactions to the restored data, you avoid losing the changes they represent.
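The recovery described in this example can be sketched in a few lines of Python. This is a simplified illustration, not how any particular database implements recovery: the key-value "database", the entry format (time, op, key, value), the dates, and the function name replay_tail are all assumptions made for the sketch.

```python
from datetime import datetime


def replay_tail(backup_state, backup_time, log_entries):
    """Apply logged changes made after `backup_time` to a copy of the backup."""
    state = dict(backup_state)  # work on a copy; leave the backup untouched
    # Replay in chronological order so that later writes win.
    for entry in sorted(log_entries, key=lambda e: e["time"]):
        if entry["time"] <= backup_time:
            continue  # this change is already captured by the backup
        if entry["op"] == "set":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
    return state


# Hypothetical data: a 3 p.m. backup plus log entries up to 3:18 p.m.
backup_time = datetime(2024, 1, 1, 15, 0)
backup_state = {"alice": 100, "bob": 50}
log_entries = [
    {"time": datetime(2024, 1, 1, 14, 55), "op": "set", "key": "bob", "value": 50},
    {"time": datetime(2024, 1, 1, 15, 5), "op": "set", "key": "alice", "value": 80},
    {"time": datetime(2024, 1, 1, 15, 12), "op": "set", "key": "carol", "value": 20},
    {"time": datetime(2024, 1, 1, 15, 18), "op": "delete", "key": "bob"},
]

restored = replay_tail(backup_state, backup_time, log_entries)
print(restored)  # {'alice': 80, 'carol': 20}
```

The 2:55 p.m. entry is skipped because the backup already includes it. Only the three changes logged after 3 p.m. are replayed.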

When to Use Tail Log Backups

Tail log backup is most closely associated with Microsoft SQL Server, which introduced tail log backups as a feature in 2005. However, other SQL databases offer similar features, like MySQL point-in-time recovery and Oracle redo logs. If you are running a database that supports features like these, it's wise to take advantage of them as a means of minimizing the risk of data loss.
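For SQL Server specifically, a tail log backup and the restore that follows are issued as T-SQL statements. The Python sketch below only builds those statements as strings; it does not connect to a server. The database name, file paths, and the helper name tail_log_statements are hypothetical, and the sequence is deliberately simplified: a real restore chain may also include differential and intermediate log backups, and Microsoft documents additional options such as NO_TRUNCATE or CONTINUE_AFTER_ERROR for backing up the log of a damaged database.

```python
def tail_log_statements(database, full_backup_path, tail_backup_path):
    """Build T-SQL for a tail log backup followed by a basic restore sequence.

    Illustrative only: names and paths are inserted without escaping.
    """
    return [
        # Back up the log records written since the last log backup and
        # leave the database in a restoring state.
        f"BACKUP LOG [{database}] TO DISK = N'{tail_backup_path}' WITH NORECOVERY;",
        # Restore the last full backup without bringing the database online.
        f"RESTORE DATABASE [{database}] FROM DISK = N'{full_backup_path}' WITH NORECOVERY;",
        # Apply the tail of the log, then recover the database.
        f"RESTORE LOG [{database}] FROM DISK = N'{tail_backup_path}' WITH RECOVERY;",
    ]


# Hypothetical database name and backup locations.
statements = tail_log_statements(
    "Sales", r"D:\backups\sales_full.bak", r"D:\backups\sales_tail.trn"
)
for statement in statements:
    print(statement)
```

Keeping the tail log backup as the first step matters: it captures the most recent transactions before the restore overwrites the database.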

More generally, it's theoretically possible to perform a tail log-based recovery of any system for which you have logs and sufficient data within the logs to reconstruct all system state changes that have taken place since the last backup happened. Even if you can't rebuild a system's state, tail log data could help you identify significant changes that don't appear in your latest backup.

For example, you could use the tail of an authentication log to determine which users authenticated after the last backup. Likewise, the tail of a server log could indicate which processes started or stopped shortly before a system failure.
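As a rough sketch of the authentication log case, the Python snippet below lists the users with a successful login recorded after the last backup. The log line format (an ISO timestamp, an event name, and a user=NAME field) is an assumption invented for this example, as are the event names and the function name users_authenticated_since. Real authentication logs vary widely and would need their own parsing.

```python
from datetime import datetime


def users_authenticated_since(lines, backup_time):
    """Return users with a successful login recorded after `backup_time`."""
    users = []
    for line in lines:
        parts = line.split()
        # Assumed format: "<ISO timestamp> <EVENT> user=<name>"
        if len(parts) != 3 or parts[1] != "LOGIN_OK":
            continue
        when = datetime.fromisoformat(parts[0])
        _, _, user = parts[2].partition("=")
        if when > backup_time and user not in users:
            users.append(user)  # keep first-seen order, no duplicates
    return users


# Hypothetical log lines around a 3 p.m. backup.
auth_log = [
    "2024-01-01T14:50:00 LOGIN_OK user=alice",
    "2024-01-01T15:04:10 LOGIN_OK user=bob",
    "2024-01-01T15:07:30 LOGIN_FAILED user=mallory",
    "2024-01-01T15:15:45 LOGIN_OK user=carol",
    "2024-01-01T15:16:02 LOGIN_OK user=bob",
]
recent_users = users_authenticated_since(auth_log, datetime(2024, 1, 1, 15, 0))
print(recent_users)  # ['bob', 'carol']
```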

Making the Most of Tail Log Backups

To use tail log backups reliably and efficiently, you need two things:

  1. A way to aggregate logs in a separate location from your production system so that log data will remain available if the production system fails.
  2. Ideally, a tool that can automatically use tail logs to reconstruct the system state. Databases like Microsoft SQL Server have built-in support for backing up the tail of the log and restoring from it.

Continuous log aggregation should be a default component of your logging strategy. The more log data you aggregate, the greater your ability to recover critical data based on your logs if your backups won't suffice.

Conclusion

For systems that support it, tail log backup is a handy technique for doing what can feel almost magical: recovering data that doesn't exist in your most recent backup files. Be sure to aggregate your logs into secure, reliable storage so that they can drive tail log backups if it becomes necessary.
