Creating JSON Formatted Logs in Java
If you are using Splunk for logfile analysis, you should also be using a structured output format that allows reliable transmission of key/value pairs from your application. JSON is perfect for this, and Logstash has created a convenient Logback encoder for us to use. This blog will show you how to:
- Configure Maven and Logback
- Use MDC to log key/value pairs as JSON
- Use try-with-resources to manage the cleanup of key/value pairs in MDC
- Use StructuredArguments to log key/value pairs as JSON
Logstash Logback JSON Encoder
It’s notable that this is the format favored by the ELK community. Representing the middle letter of that acronym, Logstash provides open source Logback encoders and layouts that are perfectly suited for JSON-based logging from Java. This is what we’ll use.
Here’s the dependency you’ll need to add to your Maven pom.xml:
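The coordinates below are for the Logstash Logback encoder on Maven Central; the version shown is only an example, so check for the current release. The encoder also expects logback-classic to be on your classpath.

```xml
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <!-- Example version; check Maven Central for the current release -->
    <version>7.4</version>
</dependency>
```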
Configuring Logback JSON Encoder
After trying a few different configurations, I settled on one using
net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder as the primary encoder for Logback.
Here is the basic configuration I used for testing:
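A minimal logback.xml along these lines is sketched below; the file names, rolling policy, and the exact set of providers are illustrative choices, not the only valid ones.

```xml
<configuration>
  <appender name="JSON_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>output.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <!-- Roll the file daily, keeping a week of history -->
      <fileNamePattern>output.%d{yyyy-MM-dd}.log</fileNamePattern>
      <maxHistory>7</maxHistory>
    </rollingPolicy>
    <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
      <providers>
        <!-- Each provider renders one piece of the event as JSON,
             in the order listed here -->
        <timestamp/>
        <logLevel/>
        <threadName/>
        <loggerName/>
        <mdc/>
        <arguments/>
        <message/>
        <stackTrace/>
      </providers>
    </encoder>
  </appender>

  <root level="INFO">
    <appender-ref ref="JSON_FILE"/>
  </root>
</configuration>
```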
The basic idea here is that we are using a
RollingFileAppender to manage the output.log file. The
LoggingEventCompositeJsonEncoder delegates most of the work to different “providers”, which are responsible for rendering the available information as JSON. The order of these provider tags determines the order of the key/value pairs in the JSON output.
Running the above configuration has produced output similar to:
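A log line shaped roughly like the following, with the default field names for each provider; the timestamp and logger values here are invented for illustration:

```json
{"@timestamp":"2019-03-01T10:23:45.123-05:00","level":"INFO","thread_name":"main","logger_name":"com.example.Demo","message":"Application started"}
```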
Using MDC for Reporting Thread-specific Context
One of the most important goals for JSON-based logging is the ability to associate user- and network-specific information with our typical auditing and tracing output. Although Logback provides us with information about our Java execution environment, other information can be introduced into the Mapped Diagnostic Context (or MDC), where it remains associated with the currently executing thread until removed.
This means a few things:
- Values in the MDC are automatically added to the JSON as key/value pairs. This is a huge win because we get JSON without changing our existing logging code.
- Values in the MDC need to be explicitly removed when they are no longer valid; otherwise they leak into unrelated work when threads are reused from a thread pool. There are strategies for this.
- Values in the MDC are thread-local but not otherwise protected: other code running on the same thread can replace or remove them if you are not careful. Let’s not overuse this very helpful feature.
Here is a basic example of MDC in action:
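A minimal sketch of the pattern, assuming SLF4J backed by the Logback configuration above (the class and key names are mine, for illustration):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class MdcExample {
    private static final Logger log = LoggerFactory.getLogger(MdcExample.class);

    public static void main(String[] args) {
        // Add a key/value pair to the current thread's diagnostic context.
        MDC.put("foo", "FOO");
        // Every log statement on this thread now carries the "foo" pair
        // in its JSON output, courtesy of the <mdc/> provider.
        log.info("Something worth auditing happened");
        // Remove the pair when it is no longer valid for this thread.
        MDC.remove("foo");
    }
}
```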
There’s nothing special we had to do other than make sure the key/value pair for “foo” is added to the MDC and then later removed. It’s the
LoggingEventCompositeJsonEncoder that automatically outputs the contents of the MDC as JSON, tagging the subsequent log entries with this information.
Useful MDC Cleanup Trick
You might not trust that your MDC-introduced key/value pairs will be cleaned up promptly once you are through with them. Starting with JDK 1.7, we can use a try-with-resources block to declare the values we are adding to the MDC so that they are automatically removed when the block exits.
Here’s a quick example:
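SLF4J’s MDC.putCloseable returns an MDC.MDCCloseable that removes the key when closed, which lets it slot straight into try-with-resources. A sketch (class and key names are illustrative):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class MdcCloseableExample {
    private static final Logger log = LoggerFactory.getLogger(MdcCloseableExample.class);

    public static void main(String[] args) {
        // putCloseable adds the pair and hands back an AutoCloseable.
        try (MDC.MDCCloseable ignored = MDC.putCloseable("foo", "FOO")) {
            // "foo" is attached to every log call inside this block.
            log.info("Inside the block");
        }
        // On exit, close() has removed "foo" from the MDC.
        log.info("Outside the block");
    }
}
```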
In this example, the try-with-resources syntax allows us to both declare and add key/value pairs to the MDC. The code inside the block then logs with those values attached. When the block exits, the MDC values we added are removed.
Logging Structured Arguments
Included with the dependency on logstash-logback-encoder are Structured Arguments. Although SLF4J already has log message formatting using an argument list, this subtle extension provides some important benefits:
- Logger arguments are associated with a named key and automatically formatted as “key=value” in the log message.
- Logger arguments that are unused by the message format are still output to the JSON.
- You don’t need to manage the MDC to temporarily insert key/value pairs into the logging output.
Logstash provides the static method
StructuredArguments.kv, which converts a key/value pair into an Object that is passed as a parameter to an SLF4J logger method. This effectively allows us to assign unique keys to the parameters we pass, and to have them correctly output in context.
Here’s an example of how it’s done:
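A sketch of the call (the message text and key names mirror the output discussed below; the class name is mine):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import static net.logstash.logback.argument.StructuredArguments.kv;

public class StructuredArgsExample {
    private static final Logger log = LoggerFactory.getLogger(StructuredArgsExample.class);

    public static void main(String[] args) {
        // Two placeholders in the format string, three structured arguments:
        // foo and bar render into the message text; bang appears only
        // as a key/value pair in the JSON output.
        log.info("Things happened: {}, {}",
                kv("foo", "FOO"),
                kv("bar", "BAR"),
                kv("bang", "BANG"));
    }
}
```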
Because Logstash now sees these arguments as StructuredArguments, it knows to add them as key/values in the JSON. The output looks like this:
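With the configuration from earlier, each such call produces a line shaped roughly like this; the timestamp is invented and the exact fields depend on which providers you configured:

```json
{"@timestamp":"2019-03-01T10:23:45.123-05:00","level":"INFO","thread_name":"main","message":"Things happened: foo=FOO, bar=BAR","foo":"FOO","bar":"BAR","bang":"BANG"}
```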
You can see that the first two arguments, foo and bar, are output in the message format as
Things happened: foo=FOO, bar=BAR, but the value for
bang is not used. However, all three values are represented as separate JSON key/value pairs:
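That is, alongside the message field, the JSON carries a fragment along these lines:

```json
"foo": "FOO",
"bar": "BAR",
"bang": "BANG"
```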
A convenient side effect is that the ordering of the arguments matters less, because the key names don’t need to appear in order in the message format.