A Fast and Minimal JSON Parser for Java

In the RAP project, reading and writing JSON are critical operations, since the server processes and creates JSON messages for a large number of clients at a high rate. For this reason, we need something fast for this job. When we switched to JSON, we included the org.json parser, which is reasonably small but not famous for its performance.

There are many better JSON libraries out there, but most do much more than we need. We really only need a bare-bones parser that can read JSON into a simple Java representation and generate JSON from Java. As we like to keep the core library self-contained, we don’t want a dependency on an external JSON library.

One winter Sunday, I started to write a JSON parser just for the fun of it, and was surprised at how simple it is to parse JSON.


Why is JSON parsing so easy? Because the first character of every token uniquely defines its type ('[' for an array, '"' for a string, 't' or 'f' for a boolean, and so forth), so there’s no backtracking involved. It went so well that I decided to continue and create a JSON parser tailored to our needs, which are:

  • Fast – we read and create so much JSON that the parser directly affects the server performance
  • Lightweight – it should use memory sparingly, as we handle lots of messages
  • Minimal – the less code the better, as we have to maintain it
  • Simple to use – we’ll expose the API for custom component developers, so it should be simple and clear
  • No dependencies – only Java 5
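
To illustrate why no backtracking is needed, here is a minimal sketch of first-character dispatch. It is not the minimal-json source: the class TokenTypeSketch, its Type enum, and the typeOf() method are names made up for this example, and a real parser would go on to read the rest of the value.

```java
final class TokenTypeSketch {

  // Hypothetical value types, for illustration only
  enum Type { OBJECT, ARRAY, STRING, NUMBER, TRUE, FALSE, NULL }

  // Decides the type of the next JSON value from its first character alone
  static Type typeOf( char first ) {
    switch( first ) {
      case '{': return Type.OBJECT;
      case '[': return Type.ARRAY;
      case '"': return Type.STRING;
      case 't': return Type.TRUE;
      case 'f': return Type.FALSE;
      case 'n': return Type.NULL;
      case '-': return Type.NUMBER;
      default:
        if( first >= '0' && first <= '9' ) {
          return Type.NUMBER;
        }
        throw new IllegalArgumentException( "Unexpected character: " + first );
    }
  }

  public static void main( String[] args ) {
    String[] samples = { "[1,2]", "\"text\"", "true", "-12.5", "null" };
    for( String json : samples ) {
      System.out.println( json + " -> " + typeOf( json.charAt( 0 ) ) );
    }
  }
}
```

Because every branch is selected by a single character, the parser never has to read ahead and then undo its decision.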

The result is called minimal-json and it’s already included in RAP. It is fast and lightweight, consists of only 10 classes, and is, I hope, simple to use:

Usage

You can read a JSON object or array from a Reader or from a String:

JsonObject jsonObject = JsonObject.readFrom( reader );
JsonArray jsonArray = JsonArray.readFrom( string );

Once you have a JsonObject, you can access its contents using the get() method:

String name = jsonObject.get( "name" ).asString();
int age = jsonObject.get( "age" ).asInt(); // asLong(), asDouble(), ...

The elements of a JSON array can be accessed in a similar way:

String name = jsonArray.get( 0 ).asString();
int age = jsonArray.get( 1 ).asInt(); // asLong(), asDouble(), ...

As you can see, the get() method always returns an instance of JsonValue, which can then be transformed to the target type using asString(), asInt(), asDouble(), etc. There’s no automatic conversion to Java types, no instanceof needed. If you’re not sure about the type of a value you can check it using isString(), isNumber(), etc.

Nested arrays and objects can be accessed using asArray() and asObject():

JsonArray nestedArray = jsonObject.get( "items" ).asArray();

You can also iterate over the elements of a JsonArray and the names of a JsonObject, e.g.:

for( String name : jsonObject.names() ) {
  JsonValue value = jsonObject.get( name );
  ...
}

Writing JSON

A JsonObject or JsonArray can output JSON to a Writer or as a string using the toString() method. The JSON is currently not pretty-printed; formatting support might be added later.

jsonObject.writeTo( writer );
String json = jsonArray.toString();

To create a JsonObject or a JsonArray, use the add() methods that exist for the relevant types. These methods return the object instance to allow method chaining:

jsonObject = new JsonObject().add( "name", "John" ).add( "age", 23 );
jsonArray = new JsonArray().add( "John" ).add( 23 );

You may have noticed that the object, too, has an add() method instead of a put() or set(). That’s because the JsonObject stores and writes its members in the order they are added. This lets you define the output order. It even lets you add the same key twice, which is discouraged but not forbidden by the JSON RFC.
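
The effect of an order-preserving add() can be shown with a small stand-alone sketch. It does not use minimal-json itself: OrderedObjectSketch is a hypothetical class that only handles string and int members and skips string escaping, which a real writer must not do.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderedObjectSketch {

  private final List<String> names = new ArrayList<String>();
  private final List<String> values = new ArrayList<String>(); // already serialized

  // Appends a string member and returns this instance to allow chaining
  public OrderedObjectSketch add( String name, String value ) {
    names.add( name );
    values.add( "\"" + value + "\"" ); // no escaping, for brevity only
    return this;
  }

  // Appends an int member and returns this instance to allow chaining
  public OrderedObjectSketch add( String name, int value ) {
    names.add( name );
    values.add( Integer.toString( value ) );
    return this;
  }

  // Writes the members in exactly the order they were added
  @Override
  public String toString() {
    StringBuilder buffer = new StringBuilder( "{" );
    for( int i = 0; i < names.size(); i++ ) {
      if( i > 0 ) {
        buffer.append( ',' );
      }
      buffer.append( '"' ).append( names.get( i ) ).append( "\":" ).append( values.get( i ) );
    }
    return buffer.append( '}' ).toString();
  }

  public static void main( String[] args ) {
    // prints {"name":"John","age":23,"name":"Jane"}
    System.out.println( new OrderedObjectSketch()
      .add( "name", "John" ).add( "age", 23 ).add( "name", "Jane" ) );
  }
}
```

Since add() only ever appends, the duplicate "name" member is kept and written in place, which a map-style put() would not allow.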

To replace an element in an array or object, you first have to remove() the old value and then add() the new one. A replace() method may be added later. However, JsonArray and JsonObject are designed to be containers for reading and writing JSON, not general purpose data structures.

Performance

I’ve compared the time required to read and write a typical RAP message with other popular parser implementations, namely org.json (20091211), Gson (2.2.2), Jackson (1.9.12), and JSON.simple (1.1). Disclaimer: This benchmark is restricted to our use case and my limited knowledge of the other libraries. It may be unfair as it ignores other use cases and perhaps better ways to use these libraries. However, I think these results show that minimal-json can stand comparison with state-of-the-art parsers.


[Chart: time to read and write a typical RAP message, by parser]

I also ran a number of micro-benchmarks using Google caliper while optimizing the implementation. One interesting detail was the choice of the data structure for the JsonObject. Since JSON objects are key-value maps, a HashMap seems like an obvious choice. However, after some experiments, I ended up with two separate ArrayLists for names and values.

Of course, looking up a key in a list requires a linear search while a hash lookup is much quicker. However, creating a HashMap and adding elements has a considerable overhead. It turns out that this overhead outweighs the benefits of a HashMap for very small numbers of items (fewer than 10). Since JSON messages in RAP typically contain many objects with only a few items, a HashMap would even impair the overall performance. Moreover, the two ArrayLists require less than a third of the footprint of a HashMap.


[Chart: lookup performance of HashMap, ArrayList with the small hash structure, and plain ArrayList]

I was able to improve the lookup performance for small item counts by adding a very small hash structure to the names list. This structure consists of a 32-element byte array. It does not handle collisions but resorts to indexOf() in case of a miss. This version (shown in the middle of the chart above) seems to be a good compromise between HashMap and plain ArrayList. Optimizations for bigger JSON objects are possible, but not needed at the moment.
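
The idea of two parallel lists plus a tiny hash table can be sketched as follows. This is an illustration written for this post, not a copy of the minimal-json source: the class IndexedNamesSketch and its members are hypothetical names, and details such as storing "list index + 1" in each slot are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexedNamesSketch {

  private final List<String> names = new ArrayList<String>();
  private final List<Object> values = new ArrayList<Object>();
  // One byte per hash slot: 0 means empty, otherwise (list index + 1)
  private final byte[] slots = new byte[ 32 ]; // length must be a power of two

  public void add( String name, Object value ) {
    int index = names.size();
    names.add( name );
    values.add( value );
    int slot = slotFor( name );
    if( index < 0xff ) {
      slots[ slot ] = (byte)( index + 1 ); // a later name simply overwrites the slot
    } else {
      slots[ slot ] = 0; // index does not fit into a byte, leave it to indexOf()
    }
  }

  public Object get( String name ) {
    int index = indexOf( name );
    return index == -1 ? null : values.get( index );
  }

  public int indexOf( String name ) {
    int cached = ( slots[ slotFor( name ) ] & 0xff ) - 1;
    if( cached != -1 && name.equals( names.get( cached ) ) ) {
      return cached; // hit: no list scan needed
    }
    return names.indexOf( name ); // empty slot or collision: linear search
  }

  private int slotFor( String name ) {
    return name.hashCode() & ( slots.length - 1 );
  }

  public static void main( String[] args ) {
    IndexedNamesSketch object = new IndexedNamesSketch();
    object.add( "Aa", "first" );
    object.add( "BB", "second" ); // "Aa" and "BB" share a hash code, so they collide
    System.out.println( object.get( "Aa" ) + ", " + object.get( "BB" ) );
  }
}
```

A collision never produces a wrong answer here, because the cached index is checked against the stored name before it is trusted; it only costs the linear search that a plain list would have needed anyway.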

How to Use it

If you’re looking for a bare-bones JSON parser with zero dependencies, you are welcome to use minimal-json. It goes without saying that it’s developed test-driven and complies with the RFC. The code lives on GitHub and is EPL-licensed. It is also included in RAP.

I didn’t set up a build. If you would like to use the code, I suggest that you simply copy these 10 Java files to your project.

Let me know if you find this useful, fork it on GitHub, and feel free to open an issue if you find a problem.

12 Responses to “A Fast and Minimal JSON Parser for Java”

  1. Matthias says:

    Sounds awesome for parsing small JSON stuff. Thanks.
    I suppose converting to/from custom objects would break the “minimal” description.

  2. Jilles van Gurp says:

    Interesting findings. I’ve been doing my own json project on github on top of a json simple content handler.

    I currently use a LinkedHashMap for my object representation. Additionally, I’ve experimented with some memory efficient ways of storing strings. Basically, I store the utf-8 bytearray instead of the 16 bit java chars that a String holds. Additionally, I reuse instances for object keys. There’s a price for looking them up but it really cuts down on memory for scenarios where you only have a handful of different keys. I have some use cases where I cache millions of json objects in memory and cutting down on memory really helps reducing footprint.

    Your approach for using double array lists seems very interesting and I might give that a try. Also, your findings with different parsers suggest I might want to try something else than json simple.

    If you are interested, the project is here: https://github.com/jillesvangurp/jsonj

  3. Cowtowncoder says:

    It looks like this library cannot read the usual InputStream (or even byte[]) input. This means that users will have to detect the encoding using external means, as well as take a bit of an additional performance hit, which is not accounted for by the performance tests.
    The same is true for writing JSON, assuming that one can only use a Writer or produce a String; most real use cases need to deal with byte streams.

    This is a common oversight, but I hope it can be resolved to make this library more useful. Even if it just means convenience methods for reading/writing UTF-8 (default encoding for JSON) encoded JSON.

    On using two lists: that is one option, as would be use of immutable Maps for small number of entries (functional style). But for real storage savings this is not nearly as compact as using POJOs and actual databinding. So while it is good to advertise compact _Tree_ representation, it would be good to mention that Trees are rather inefficient model for JSON content; similar to how Java POJOs consume much less memory (and are much faster to operate on) than java.util.Maps and Lists.

  4. Ralf Sternberg says:

    @Matthias Yes, actually the problem with converting types is that the corresponding Java type is not always obvious. For example, would you convert “2147483647” into Integer, but “2147483648” into Long? What about “1.2e500”, which is valid JSON but doesn’t fit into a Double? I don’t want to make these decisions. There are good JSON libraries out there that convert JSON into a given Java model. My approach requires the developer to know about the JSON format and expected types.

    Another issue with conversion is that you’d end up with Object as the least common denominator, and a lot of instanceof checks and typecasts in your code.

  5. Ralf Sternberg says:

    @Jilles thanks for the link, that’s interesting. I’ll surely have a look at the UTF-8 byte arrays. When non-ASCII characters are rare this method should cut the footprint of the strings into half. However, since for our use case, time is much more critical than memory, it probably doesn’t make sense for us.

    @Cowtowncoder We could add read/write methods for streams, but since the parser reads and writes characters, these would create an InputStreamReader / OutputStreamWriter anyway. Adding those methods would probably mean adding one with a charset parameter and one that defaults to UTF-8. But then again, wouldn’t this be confusing, since the monadic Reader constructors in the class library fall back to the default system encoding, not to UTF-8? I wonder if this is worth the effort, but I’m happy to accept pull requests ;-)

    I don’t get your suggestion with the “immutable Maps for small number of entries (functional style)”. Could you elaborate on this?

    You are certainly right about trees being a rather inefficient model for JSON compared to POJOs. However, that’s our use case. It’s not my intention to compete with other JSON parsers, only to create a good and simple parser for this simple use case.

  6. Aaron Digulla says:

    Suggestion for an improvement: I hate String ids. In my own code, I do this:

    StringOption NAME = new StringOption( "name" );
    String name = jsonObject.get( NAME );
    IntOption AGE = new IntOption( "age" );
    int age = jsonObject.get( AGE );

    This works with a mix of generics and method overrides. In your case, method overrides should be enough. It also allows type checking when setting values.

    Since most IDs are constants, I can define them in a central place. As an additional bonus, I can easily find all places in the code where I’m accessing certain values.

  7. Ralf Sternberg says:

    Thanks, that’s an interesting suggestion! This approach makes it possible to keep the exception handling in a central place. However, since I intend to keep the parser really minimal, I’d prefer to keep these classes completely separate from the parser. For example, the XxxOption types could have a method to retrieve the value from a given JsonObject:

    int age = AGE.getFrom( jsonObject );

    For primitive types, how would you distinguish missing values from default ones? For example, what would be returned by

    int age = jsonObject.get( AGE );

    in case the object does not contain a member “age”?

  8. Aaron Digulla says:

    Re keeping the parser minimal: That makes *your* life easier and makes it more horrible for thousands of people … :-) Creating an easy-to-use API is more important than having a technically perfect API since consumers never care about perfection. They don’t feel pain if your code is a mess, they only feel the pains of their own code.

    That said, separating this API out will encourage ignorants to ignore it. Why bother creating accessor objects when you can use a String literal?

    Re primitives: I have two getters; one throws a MissingKeyException, the other returns the supplied default value.

    I also do this for reference types (like String). Returning null when a value is missing is convenient until someone refactors the code and suddenly, null values creep out of the local scope.

  9. Libor Jelínek says:

    Great RAP side-effect :-) Hungry to incorporate it in my next project! Thanks Ralf!

  10. parttimenerd says:

    Cool library. I like your approach of keeping it simple because I need a simple JSON library for a pet project of mine that I’m able to modify easily, as I’m going to use different Java classes for integers and strings.

  11. Simon Mayerhofer says:

    Hey,

    looks nice and I’m going to use it for my actual project.
    But I really miss a ‘change’ method, because I have to change a value in an object, and it has to stay at the same position as before rather than being put at the end of the array.
    So I would really love this library if you could add this method.
    I’m still trying to solve my problem another way.

    regards

  12. Ralf Sternberg says:

    Thanks for your feedback!

    @Simon I hesitated to provide put() methods next to add(), because users may be tempted to always use put() instead of add() when constructing JSON objects. This practice would result in degraded performance. Moreover, the existence of two methods to add a member to an object might be confusing.

    However, I agree that there should be a method to modify a value in an object without modifying the order of the members. Would you mind opening a GitHub issue?

    Regards, Ralf


Published Apr 18th, 2013 in categories: EclipseSource News, Editors choice, Planet Eclipse