Best practice for high-performance JSON processing with Jackson


If you serialize POJOs to JSON in Java, or deserialize them from JSON, Jackson is the de facto standard today. But have you been using it as efficiently as you could?

This time, let's look at how to use Jackson more efficiently in terms of speed.



1. Reuse objects that are expensive to create
2. Close what needs to be closed when you are done with it
3. Handle input/output objects in as close to “raw” a form as possible
4. Change the default settings only if you really need to
5. Even when processing the same JSON several times, parse it only once
6. Use ObjectReader#readValues() to read a sequence of POJOs of the same type
7. Use ObjectReader / ObjectWriter rather than ObjectMapper


Basic

1. Reuse objects that are expensive to create

ObjectMapper, used for data binding, and JsonFactory, used for streaming, are the objects most worth reusing. One reason is that creating them is expensive.

ObjectMapper internally caches serializer and deserializer instances for each type. A serializer or deserializer is created and cached the first time its target type is handled, and the cached instance is used from the second time on.

Therefore, if you create a new ObjectMapper for every JSON <-> POJO conversion, the serializers and deserializers are rebuilt every time and the cache is never put to use.

Fortunately, both ObjectMapper and JsonFactory are thread-safe, provided they are fully configured before they are shared. So reuse these objects without fear.

ObjectReader and ObjectWriter objects can also be reused, but compared with the two above, the benefit of reusing them is small.
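As a minimal sketch of this advice, the class below keeps a single ObjectMapper in a static field and routes every conversion through it. The class and method names (SharedMapperDemo, parse, render) are made up for illustration and are not part of Jackson.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public final class SharedMapperDemo {
    // One mapper for the whole application. Configure it here, before first use.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Each call reuses the deserializers the shared mapper has already cached.
    public static Map<?, ?> parse(byte[] json) throws Exception {
        return MAPPER.readValue(json, Map.class);
    }

    // Each call reuses the cached serializers as well.
    public static String render(Object value) throws Exception {
        return MAPPER.writeValueAsString(value);
    }

    public static void main(String[] args) throws Exception {
        byte[] json = "{\"id\":1,\"name\":\"jackson\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(render(parse(json)));
    }
}
```

The counter-example would be calling new ObjectMapper() inside parse() or render(), which throws the cache away on every call.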


2. Close what needs to be closed when you are done with it

As their implementation of the Closeable interface suggests, JsonParser and JsonGenerator objects should have their close() method called once they are no longer needed.

Why should we call the close() method?

These objects internally hold reusable buffers and symbol tables. Calling close() releases them so that they can be reused.
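One simple way to make sure close() is always called is try-with-resources. The sketch below counts the field names in a document with the streaming API; the class name CloseParserDemo and the method countFieldNames are hypothetical helpers invented for this example.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.nio.charset.StandardCharsets;

public final class CloseParserDemo {
    // The factory is shared and reused, as recommended in section 1.
    private static final JsonFactory FACTORY = new JsonFactory();

    // Counts every field name in the document, including nested ones.
    public static int countFieldNames(byte[] json) throws Exception {
        int count = 0;
        // try-with-resources calls parser.close() even if parsing fails.
        try (JsonParser parser = FACTORY.createParser(json)) {
            JsonToken token;
            while ((token = parser.nextToken()) != null) {
                if (token == JsonToken.FIELD_NAME) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        byte[] json = "{\"a\":1,\"b\":{\"c\":2}}".getBytes(StandardCharsets.UTF_8);
        System.out.println(countFieldNames(json));
    }
}
```

The same pattern applies to JsonGenerator, which is also Closeable.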


3. Handle input/output objects in as close to “raw” a form as possible

Jackson provides a variety of interfaces that accept different kinds of I/O objects. When pursuing speed, however, it is best to handle I/O objects in as close to a “raw” form as possible.

(i) For deserialization (JSON -> POJO), the input types below are listed in order of efficiency, most efficient first.

1. byte[]
2. InputStream
3. Reader
4. String

It is probably rare to have the JSON you want to deserialize already at hand as a byte[]. But if you are processing a file that contains JSON, for example, it is best to pass the FileInputStream object straight to Jackson (the ObjectMapper#readValue(InputStream) method).

    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.FileInputStream;
    import java.util.Map;

    public class JsonDeserializerDemo {
        public static void main(String[] args) throws Exception {
            // Pass the raw stream to Jackson; no Reader or buffering wrapper is needed.
            try (FileInputStream is = new FileInputStream("path/to/file.json")) {
                new ObjectMapper().readValue(is, Map.class);
            }
        }
    }

The point here is that the FileInputStream object is not wrapped in an InputStreamReader or a BufferedReader. Jackson's internal character decoding and buffering are more efficient than the JDK's, so such wrappers only add overhead. Reading the whole file into a string before passing it to ObjectMapper is worse still.

(ii) For serialization (POJO -> JSON), the output types below are likewise listed in order of efficiency, most efficient first.

1. OutputStream
2. Writer
3. String

Again, receiving the serialization result as a string is the least efficient choice. That said, in cases such as writing the serialized result to a log file through an interface like SLF4J, there is probably no alternative to producing a string.
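To mirror the deserialization example, here is a rough sketch of writing straight to an OutputStream instead of building a String first. StreamWriteDemo and its write method are illustrative names only, and the temporary file simply stands in for whatever destination you actually have.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

public final class StreamWriteDemo {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Jackson encodes directly into the stream; no intermediate String is built.
    // With default settings Jackson also closes the stream when it finishes writing.
    public static void write(OutputStream out, Object value) throws Exception {
        MAPPER.writeValue(out, value);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("demo", ".json");
        // The stream is not wrapped in an OutputStreamWriter or BufferedWriter.
        try (OutputStream out = Files.newOutputStream(file)) {
            write(out, Collections.singletonMap("ok", true));
        }
        System.out.println(new String(Files.readAllBytes(file), "UTF-8"));
    }
}
```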


4. Change the default settings only if you really need to

Jackson offers a large number of options that can be turned on or off, but the defaults are chosen so that performance is good out of the box. In particular, options that reduce performance have to be enabled explicitly.
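As an illustration, the sketch below leaves the shared mapper at its defaults and turns on one opt-in feature only where it is wanted. Pretty-printing (SerializationFeature.INDENT_OUTPUT) is my own choice of example, on the assumption that it is a typical option that adds work and output size; the article itself does not name specific options. The class and method names are hypothetical.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectWriter;
import com.fasterxml.jackson.databind.SerializationFeature;
import java.util.Collections;

public final class OptInFeatureDemo {
    // The shared mapper keeps Jackson's default settings.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // A separate writer with indentation enabled, used only for human-readable output.
    private static final ObjectWriter PRETTY = MAPPER.writer(SerializationFeature.INDENT_OUTPUT);

    // Default path: compact output, nothing extra enabled.
    public static String compact(Object value) throws Exception {
        return MAPPER.writeValueAsString(value);
    }

    // Opt-in path: indented output, for example for debugging.
    public static String pretty(Object value) throws Exception {
        return PRETTY.writeValueAsString(value);
    }

    public static void main(String[] args) throws Exception {
        Object value = Collections.singletonMap("a", 1);
        System.out.println(compact(value));
        System.out.println(pretty(value));
    }
}
```

Deriving a writer this way leaves the mapper itself untouched, so the hot path keeps the default behavior.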


5. Even when processing the same JSON several times, parse it only once

Consider a situation where you want to process a single JSON document in several stages. For example, suppose the JSON consists of two sections: one that defines the format of the data and one that holds the data itself.

In this case you might process in two stages: first only the section that defines the format, then the data, deserialized with reference to that format information. It is wasteful, however, to deserialize the same JSON in each stage, or to serialize the part left over from the previous stage back into JSON as intermediate data.

Jackson does not only map JSON to POJOs; it can also represent JSON as intermediate data. If you can process the JSON as a stream of tokens, the TokenBuffer class is the better choice. If you want to manipulate the JSON as a tree, use the JsonNode class. JsonNode appears to be somewhat slower than TokenBuffer, but it still performs better than deserializing the same JSON two or three times.
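The JsonNode variant of this idea might look like the sketch below. The document layout (a "format" section with a "scale" value and a "data" array of integers) is invented purely for illustration, as are the class and method names; the example assumes both sections are present.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class ParseOnceDemo {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static int[] scaledData(byte[] json) throws Exception {
        // The JSON text is parsed exactly once, into a tree.
        JsonNode root = MAPPER.readTree(json);

        // Stage 1: read the (hypothetical) format section from the tree.
        int scale = root.path("format").path("scale").asInt(1);

        // Stage 2: bind the data section from the same tree, without re-parsing text.
        int[] data = MAPPER.treeToValue(root.get("data"), int[].class);
        for (int i = 0; i < data.length; i++) {
            data[i] *= scale;
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        byte[] json = "{\"format\":{\"scale\":10},\"data\":[1,2,3]}".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(scaledData(json))); // prints [10, 20, 30]
    }
}
```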


6. Use ObjectReader#readValues() to read a sequence of POJOs of the same type

This is more efficient than calling the ObjectReader#readValue() method repeatedly.
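A minimal sketch of readValues(), assuming the input is a sequence of whitespace-separated JSON objects of one type. The Point class and the readAll helper are hypothetical and exist only for this example.

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public final class ReadValuesDemo {
    // Hypothetical POJO used only for this example.
    public static class Point {
        public int x;
        public int y;
    }

    private static final ObjectReader POINT_READER = new ObjectMapper().readerFor(Point.class);

    // Reads every Point from the stream through a single MappingIterator.
    public static List<Point> readAll(InputStream in) throws Exception {
        List<Point> points = new ArrayList<>();
        // MappingIterator is Closeable, so close it when done (see section 2).
        try (MappingIterator<Point> it = POINT_READER.readValues(in)) {
            while (it.hasNext()) {
                points.add(it.next());
            }
        }
        return points;
    }

    public static void main(String[] args) throws Exception {
        byte[] json = "{\"x\":1,\"y\":2} {\"x\":3,\"y\":4}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(json)).size()); // prints 2
    }
}
```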


7. Use ObjectReader / ObjectWriter rather than ObjectMapper

Both ObjectReader and ObjectWriter are thread-safe, and they are slightly more efficient than ObjectMapper because they skip some of the lookup work that ObjectMapper performs internally.
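In practice this could mean creating one reader and one writer per type up front and sharing them, roughly as sketched below. The User class and the toJson/fromJson helpers are hypothetical names chosen for the example.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.ObjectWriter;

public final class ReaderWriterDemo {
    // Hypothetical POJO used only for this example.
    public static class User {
        public String name;
        public int age;
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Created once per type and shared; both objects are thread-safe.
    private static final ObjectReader USER_READER = MAPPER.readerFor(User.class);
    private static final ObjectWriter USER_WRITER = MAPPER.writerFor(User.class);

    public static byte[] toJson(User user) throws Exception {
        return USER_WRITER.writeValueAsBytes(user);
    }

    public static User fromJson(byte[] json) throws Exception {
        return USER_READER.readValue(json);
    }

    public static void main(String[] args) throws Exception {
        User user = new User();
        user.name = "alice";
        user.age = 30;
        User copy = fromJson(toJson(user));
        System.out.println(copy.name + " " + copy.age); // prints alice 30
    }
}
```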