Creating JSON Formatted Logs in Java

If you are using Splunk for logfile analysis, you should also be using an output format that allows for the reliable transmission of key/value pairs from your application. JSON is perfect for this, and Logstash has created a convenient Logback encoder for us to use. This blog will show you how to

  • Configure Maven and Logback
  • Use MDC to log key/value pairs as JSON
  • Use try-with-resources to manage the cleanup of key/value pairs in MDC
  • Use StructuredArguments to log key/value pairs as JSON

Logstash Logback JSON Encoder

All of our logging is ultimately fed to Splunk, which isn’t too finicky about what it eats. Splunk’s own guidelines recommend, “Use clear key-value pairs,” while citing an example that clearly doesn’t use any kind of quoting strategy. They also say, “Developer-friendly formats like JavaScript Object Notation (JSON) are readable by humans and machines. Searches through structured data are even easier with the spath search command.” The case for JSON-based logs is easily made.

It’s notable that this is the format favored by the ELK community. Logstash, the middle letter of that acronym, provides open source Logback encoders and layouts that are perfectly suited to JSON-based logging from Java. This is what we’ll use.

Here’s the dependency you’ll need to add to your Maven pom.xml:

<dependency>
   <groupId>net.logstash.logback</groupId>
   <artifactId>logstash-logback-encoder</artifactId>
   <version>4.11</version>
</dependency>
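
The encoder builds on Logback itself, so logback-classic must also be on the classpath; depending on your dependency tree it may already arrive transitively. If it doesn’t, add something like the following (the version shown is illustrative, not prescriptive):

<!-- Logback core implementation; logstash-logback-encoder plugs into it. -->
<!-- 1.2.3 is an example version; use whatever release fits your project. -->
<dependency>
   <groupId>ch.qos.logback</groupId>
   <artifactId>logback-classic</artifactId>
   <version>1.2.3</version>
</dependency>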

Configuring Logback JSON Encoder

After trying a few different configurations, I settled on one that uses net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder as the primary encoder for Logback.

Here is the basic configuration I used for testing:

<configuration>

  <appender name="STDOUT" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <File>output.log</File>

    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
      <maxIndex>1</maxIndex>
      <fileNamePattern>output.log.%i</fileNamePattern>
    </rollingPolicy>

    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <MaxFileSize>1MB</MaxFileSize>
    </triggeringPolicy>

    <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
      <providers>
        <timestamp>
          <timeZone>UTC</timeZone>
        </timestamp>
        <callerData>
          <classFieldName>classname</classFieldName>
          <methodFieldName>method</methodFieldName>
          <fileFieldName>file</fileFieldName>
          <lineFieldName>line</lineFieldName>
        </callerData>
        <threadName>
          <fieldName>thread</fieldName>
        </threadName>
        <loggerName/>
        <logLevel/>
        <message/>
        <mdc/>
        <arguments>
          <includeNonStructuredArguments>true</includeNonStructuredArguments>
        </arguments>
        <logstashMarkers/>
        <stackTrace/>
      </providers>
    </encoder>
  </appender>

  <root level="TRACE">
    <appender-ref ref="STDOUT" />
  </root>

</configuration>

The basic idea here is that we are using a RollingFileAppender to manage the output.log file. The LoggingEventCompositeJsonEncoder delegates most of the work to different “providers,” each responsible for rendering one piece of the available information as JSON. The order of these provider tags corresponds to the order of the JSON key/value pairs in the logging output. One caveat: the callerData provider forces Logback to capture caller information for every event, which is relatively expensive, so you may want to omit it from performance-sensitive configurations.

Running with the above configuration produces output similar to:

{"@timestamp":"2017-06-13T18:41:55.559+00:00","classname":"SplunkingAgent","method":"test","file":"SplunkingAgent.java","line":77,"thread":"main","logger_name":"TestWithResources","level":"DEBUG","message":"Debugging 1, 2, 3","beta":"BBB","foo":"bar","alpha":"AAA"}
{"@timestamp":"2017-06-13T18:41:55.559+00:00","classname":"SplunkingAgent","method":"test","file":"SplunkingAgent.java","line":78,"thread":"main","logger_name":"TestWithResources","level":"TRACE","message":"Tracing 1, 2, 3","beta":"BBB","foo":"bar","alpha":"AAA"}
{"@timestamp":"2017-06-13T18:41:55.559+00:00","classname":"SplunkingAgent","method":"test","file":"SplunkingAgent.java","line":79,"thread":"main","logger_name":"TestWithResources","level":"WARN","message":"Warning 1, 2, 3","beta":"BBB","foo":"bar","alpha":"AAA"}
{"@timestamp":"2017-06-13T18:41:55.559+00:00","classname":"SplunkingAgent","method":"test","file":"SplunkingAgent.java","line":80,"thread":"main","logger_name":"TestWithResources","level":"ERROR","message":"Error 1, 2, 3","beta":"BBB","foo":"bar","alpha":"AAA"}

Using MDC for Reporting Thread-specific Context

One of the most important goals for JSON-based logging is the ability to associate user- and network-specific information with our typical auditing and tracing output. Although Logback provides us with information about our Java execution environment, other information can be introduced into the Mapped Diagnostic Context (or MDC), where it remains associated with the currently executing thread until removed.

This means a few things:

  • Values in the MDC are automatically added to the JSON as key/value pairs. This is a huge win because we get JSON without changing our existing logging code.
  • Values in the MDC need to be explicitly removed when they are no longer valid, so they aren’t perpetuated within a thread pool. There are strategies for this; one sketch appears after the basic example below.
  • Values in the MDC are keyed by plain strings, so unrelated code on the same thread can replace or remove them if you are not careful. Let’s not overuse this very helpful feature.

Here is a basic example of MDC in action:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class MDCTest
{
	public static void main(String[] args)
	{
		// Everything this thread logs from here on carries foo=bar
		MDC.put("foo", "bar");

		testLogging();

		// Clean up so the value doesn't outlive its usefulness
		MDC.remove("foo");
	}

	private static void testLogging()
	{
		Logger log = LoggerFactory.getLogger("testLogging");

		log.info("Info 1, 2, 3");
	}
}

There’s nothing special we had to do other than make sure the key/value pair for “foo” is added to the MDC and later removed. It’s the LoggingEventCompositeJsonEncoder that automatically outputs the contents of the MDC as JSON, tagging the subsequent logs with this information.
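
As for the thread pool strategy promised above, here is one common approach as a minimal sketch of my own (the class name MdcAwareRunnable is not from any library): copy the submitter’s MDC into the worker thread, and guarantee the pooled thread’s MDC is cleared when the task finishes, even if it throws.

import java.util.Map;

import org.slf4j.MDC;

public final class MdcAwareRunnable implements Runnable
{
	private final Runnable delegate;
	private final Map<String, String> context;

	public MdcAwareRunnable(Runnable delegate)
	{
		this.delegate = delegate;
		// Snapshot the submitting thread's MDC (may be null if empty)
		this.context = MDC.getCopyOfContextMap();
	}

	@Override
	public void run()
	{
		if (context != null) {
			MDC.setContextMap(context);
		}
		try {
			delegate.run();
		} finally {
			// Never let values leak into the next task on this pooled thread
			MDC.clear();
		}
	}
}

Submit work as executor.submit(new MdcAwareRunnable(task)) and the JSON output from pooled threads stays consistent with the thread that queued the work.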

Useful MDC Cleanup Trick

You might not trust that your MDC-introduced key/value pairs will be cleaned up in a timely manner once you are through with them. Starting with JDK 1.7, we can use a try-with-resources block to declare the values we are adding to the MDC and have them auto-released when the block exits.

Here’s a quick example:

private void testTryWithResources()
{
	Logger log = LoggerFactory.getLogger("testTryWithResources");
	
	try (
		MDC.MDCCloseable alpha = MDC.putCloseable("alpha", "AAA");
		MDC.MDCCloseable beta = MDC.putCloseable("beta", "BBB");
	) {
		log.info("Info 1, 2, 3");
		log.debug("Debugging 1, 2, 3");
		log.trace("Tracing 1, 2, 3");
		log.warn("Warning 1, 2, 3");
		log.error("Error 1, 2, 3");
	}
}

In this example, the try-with-resources syntax allows us to both declare the key/value pairs and add them to the MDC. The code executed within the block automatically logs the values we added, and when the block exits, those values are removed. One caveat: MDCCloseable.close() simply removes the key from the MDC; it does not restore any value that was previously stored under it.

Logging Structured Arguments

Also included with the logstash-logback-encoder dependency are StructuredArguments. Although SLF4J already offers log message formatting with an argument list, this subtle extension provides some important benefits:

  • Logger arguments are associated with a named key and automatically formatted as “key=value” in the log message.
  • Logger arguments that are unused by the message format are still output to the JSON.
  • You don’t need to manage the MDC to temporarily insert key/value pairs into the logging output.

Logstash provides the static method StructuredArguments.kv, which wraps a key/value pair in an Object that can be passed as a parameter to SLF4J. This effectively allows us to assign unique keys to the parameters passed to SLF4J logger methods and have them correctly output in context.

Here’s an example of how it’s done:

import static net.logstash.logback.argument.StructuredArguments.kv;
...
private void testArguments()
{
	Logger log = LoggerFactory.getLogger("TestArguments");

	String foo = "FOO";
	String bar = "BAR";
	String bang = "BANG";

	log.info("Things happened: {}, {}", kv("foo", foo), kv("bar", bar), kv("bang", bang));
}
...

Because Logstash sees these arguments as StructuredArguments, it knows to add them as key/value pairs in the JSON. The output looks like this:

{"@timestamp":"2017-06-13T20:35:45.881+00:00","classname":"SplunkingAgent","method":"testArguments","file":"SplunkingAgent.java","line":112,"thread":"main","logger_name":"TestArguments","level":"INFO","message":"Things happened: foo=FOO, bar=BAR","foo":"FOO","bar":"BAR","bang":"BANG"}

You can see that the first two arguments, foo and bar, are rendered into the message as Things happened: foo=FOO, bar=BAR, while the value for bang is not used there. However, all three values are represented as separate JSON key/value pairs: "foo":"FOO","bar":"BAR","bang":"BANG"

A convenient side effect is that the ordering of the arguments matters less, because the key names don’t need to appear in order in the message format.
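
If the key=value rendering clutters your message text, the same class also provides value (alias v), which prints only the value in the message while still emitting the full key/value pair in the JSON. Here’s a quick sketch (the method name and sample data are mine, just for illustration):

import static net.logstash.logback.argument.StructuredArguments.v;
...
private void testValueArguments()
{
	Logger log = LoggerFactory.getLogger("TestArguments");

	// The rendered message reads "user jdoe logged in", while the JSON
	// still contains a separate "user":"jdoe" pair.
	log.info("user {} logged in", v("user", "jdoe"));
}
...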
