wissel.net


From XML to JSON and back


In the beginning there was CSV, and the world of application-neutral, (almost) human-readable data formats was good. Then unrest grew, and with it the demand for more structure and contextual information. This gave birth to SGML (1986), adopted only by a few initiated.
Only more than a decade later (1998) SGML's offspring XML took centre stage. With broad support for schemas, transformation and tooling, the de facto standard for application-neutral, (almost) human-readable data formats was established, and the world was good.
But bracket phobia and the heavy toll of processing, especially on browsers, led to the rise of JSON (starting ca. 2002), which rapidly became the standard for browser-server communication. It is native to JavaScript, very light compared to XML and fast to process. However, at the time of writing it lacks an approved schema language and a transformation language (like XSLT).
This leads to the common scenario that you need both: XML for server to server communication and JSON for in-process and browser to server communication. While they look very similar, XML and JSON are different enough to make transition difficult. XML knows elements and attributes, while JSON only knows key/value pairs. A JSON snippet like this:
{ "name" : "Peter Pan",
  "passport" : "none",
  "girlfriend" : "Tinkerbell",
  "followers" : [{"name" : "Frank"},{"name" : "Paul"}]
}

can be expressed in XML in various ways:
<?xml version="1.0" encoding="UTF-8"?>
<person name="Peter Pan" passport="none">
    <girlfriend name="Tinkerbell" />
    <followers>
        <person name="Frank" />
        <person name="Paul" />
    </followers>
</person>

.or.
<?xml version="1.0" encoding="UTF-8"?>
<person>
    <name>Peter Pan</name>
    <passport>none</passport>
    <girlfriend>
        <person>
            <name>Tinkerbell</name>
        </person>
    </girlfriend>
    <followers>
        <person>
            <name>Frank</name>
        </person>
        <person>
            <name>Paul</name>
        </person>
    </followers>
</person>

.or.
<?xml version="1.0" encoding="UTF-8"?>
<person name="Peter Pan" passport="none">
    <person name="Tinkerbell" role="girlfriend" />
    <person name="Frank" role="follower" />
    <person name="Paul" role="follower" />
</person>
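
To make the attribute-versus-element choice tangible, here is a minimal sketch that writes the same key/value pairs once as attributes and once as child elements. It uses only the DOM and transformer APIs that ship with the JDK; the class `PersonXml` and its method names are made up for this illustration and are not part of the solution discussed in this post.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only
class PersonXml {

    public static void main(String[] args) {
        Map<String, String> peter = new LinkedHashMap<>();
        peter.put("name", "Peter Pan");
        peter.put("passport", "none");
        // JSON has no root name, so the caller has to supply one for XML
        System.out.println(asAttributes("person", peter));
        System.out.println(asElements("person", peter));
    }

    /** Every key/value pair becomes an attribute of the root element. */
    static String asAttributes(String rootName, Map<String, String> values) {
        Document doc = newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);
        for (Map.Entry<String, String> entry : values.entrySet()) {
            root.setAttribute(entry.getKey(), entry.getValue());
        }
        return serialize(doc);
    }

    /** Every key/value pair becomes a child element with text content. */
    static String asElements(String rootName, Map<String, String> values) {
        Document doc = newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);
        for (Map.Entry<String, String> entry : values.entrySet()) {
            Element child = doc.createElement(entry.getKey());
            child.setTextContent(entry.getValue());
            root.appendChild(child);
        }
        return serialize(doc);
    }

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Turns the DOM tree into a string, without the XML declaration. */
    private static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both methods start from identical data. The only decisions the code has to make are the root name and where each value goes, which is exactly the information a plain JSON object does not carry.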

Many other variants are possible. The JSON object doesn't need a "root name", the XML does. Going from XML to JSON is easier: each attribute simply becomes a key/value pair. Some XML purists see attributes as evil; I think they do have their place to make relations clearer (is-a vs. has-a) and XML less verbose.
So transforming back and forth between XML and JSON needs a "neutral" format. In my XML session at Entwicklercamp 2014 I demoed how to use a Java class as this neutral format. With the help of the right libraries, that's flexible and efficient. Using my favorite fruit class as an example, the solution is easy to comprehend:
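// Imports used by the two classes below (each class goes into its own file)
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Collection;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

@XmlRootElement(name = "FruitBasket")
public class FruitBasket {

	// Reads a whole basket from XML; the result is a FruitBasket, not a single Fruit
	public static FruitBasket fromXML(InputStream in) throws Exception {
		JAXBContext context = JAXBContext.newInstance(FruitBasket.class);
		Unmarshaller um = context.createUnmarshaller();
		return (FruitBasket) um.unmarshal(in);
	}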

	@XmlElement(name = "Fruit")
	private final Collection<Fruit>	allFruits	= new ArrayList<Fruit>();

	public boolean add(Fruit fruit) {
		return this.allFruits.add(fruit);
	}

	public boolean addAll(Collection<Fruit> allFruits) {
		return this.allFruits.addAll(allFruits);
	}

	public void saveXML(OutputStream out) throws Exception {
		JAXBContext context = JAXBContext.newInstance(FruitBasket.class);
		Marshaller m = context.createMarshaller();
		m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
		m.marshal(this, out);
	}

	public void saveJSON(OutputStream out) throws IOException {
		GsonBuilder gb = new GsonBuilder();
		gb.setPrettyPrinting();
		gb.disableHtmlEscaping();
		Gson gson = gb.create();
		PrintWriter writer = new PrintWriter(out);
		gson.toJson(this, writer);
		writer.flush();
		writer.close();
	}

	@Override
	public String toString() {
		ByteArrayOutputStream out = new ByteArrayOutputStream();
		try {
			this.saveJSON(out);
			//this.saveXML(out);
		} catch (IOException e) {
			return super.toString();
		}
		return out.toString();
	}
}

@XmlRootElement(name = "Fruit")
public class Fruit {

	public static Fruit fromJson(InputStream in) {
		Gson gson = new GsonBuilder().create();
		Fruit result = gson.fromJson(new InputStreamReader(in), Fruit.class);
		return result;
	}

	public static Fruit fromXML(InputStream in) throws Exception {
		JAXBContext context = JAXBContext.newInstance(Fruit.class);
		Unmarshaller um = context.createUnmarshaller();
		return (Fruit) um.unmarshal(in);
	}

	private String	name;
	private String	color;
	private String	taste;

	public Fruit() {
		// Default constructor
	}

	public Fruit(final String name, final String color, final String taste) {
		this.name = name;
		this.color = color;
		this.taste = taste;
	}

	public final String getColor() {
		return this.color;
	}

	public final String getName() {
		return this.name;
	}

	public final String getTaste() {
		return this.taste;
	}

	public void saveJSON(OutputStream out) throws IOException {
		GsonBuilder gb = new GsonBuilder();
		gb.setPrettyPrinting();
		gb.disableHtmlEscaping();
		Gson gson = gb.create();
		PrintWriter writer = new PrintWriter(out);
		gson.toJson(this, writer);
		writer.flush();
		writer.close();
	}

	public void saveXML(OutputStream out) throws Exception {
		JAXBContext context = JAXBContext.newInstance(Fruit.class);
		Marshaller m = context.createMarshaller();
		m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
		m.marshal(this, out);
	}

	public final void setColor(String color) {
		this.color = color;
	}

	public final void setName(String name) {
		this.name = name;
	}

	public final void setTaste(String taste) {
		this.taste = taste;
	}

	public String toJSON() throws IOException {
		ByteArrayOutputStream out = new ByteArrayOutputStream();
		this.saveJSON(out);
		return out.toString();
	}

	public String toXML() throws Exception {
		ByteArrayOutputStream out = new ByteArrayOutputStream();
		this.saveXML(out);
		return out.toString();
	}
}

The trick here is to use JAXB and Google GSON to make your life easier. Let's have a closer look:
Under the hood both GSON and JAXB use reflection to turn a class into JSON/XML and back. Both can be steered with annotations, which are placed in front of a class, field or method and, like any other Java source construct, are defined in a package. By default GSON doesn't need any annotations, but it provides some in the package com.google.gson.annotations. At the time of writing there are just 4 of them:
  • @SerializedName("someName"): Sets the name to be used when creating the JSON. The defaults are set in the com.google.gson.FieldNamingPolicy enumeration
  • @Since(1.0) and @Until(2.1): Define the validity of the JSON element. To activate them you need to call GsonBuilder.setVersion(double), otherwise all items are processed
  • @Expose or @Expose (serialize = false, deserialize = false): You need to call GsonBuilder().excludeFieldsWithoutExposeAnnotation() to activate it. Once done, only fields with an Expose annotation will show up, unless the annotation has serialize/deserialize set to false
Notably absent is a possibility to set a "root name". This isn't surprising, since JSON usually just starts/ends with { } and doesn't use a root name.
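
To illustrate the mechanism (reflection plus annotations), here is a deliberately tiny sketch. It is not how GSON is implemented: `TinyJson`, `@JsonName` and `Pet` are hypothetical names invented for this example, with `@JsonName` playing the role that `@SerializedName` plays in GSON. Like GSON's default behaviour, the sketch skips static and transient fields.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.StringJoiner;

// Hypothetical mini serializer, for illustration only
class TinyJson {

    public static void main(String[] args) {
        System.out.println(toJson(new Pet("Nana", "dog")));
    }

    /** Writes all instance fields of the object as a flat JSON object of strings. */
    static String toJson(Object source) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Field field : source.getClass().getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers) || field.isSynthetic()) {
                continue; // same default exclusions as GSON
            }
            field.setAccessible(true); // private fields are read via reflection
            JsonName alias = field.getAnnotation(JsonName.class);
            String key = (alias != null) ? alias.value() : field.getName();
            try {
                // Simplification: values are written as strings, without escaping
                json.add("\"" + key + "\":\"" + field.get(source) + "\"");
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return json.toString();
    }
}

// Hypothetical annotation; must be retained at runtime to be visible to reflection
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface JsonName {
    String value();
}

class Pet {
    @JsonName("pet_name")
    private final String name;
    private final String species;
    private transient String cacheKey = "never serialized";

    Pet(String name, String species) {
        this.name = name;
        this.species = species;
    }
}
```

The annotation only carries metadata. The serializer decides what to do with it at runtime, which is why such annotations need `RetentionPolicy.RUNTIME` and why neither `Pet` nor the Fruit classes need to implement any interface.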
The annotations in JAXB are much more complex, reflecting the extended possibilities of XML structure using elements and attributes. The annotations can be found in the javax.xml.bind.annotation package and are part of core Java (Java SE 6 through 10; from Java 11 on JAXB has to be added as a separate dependency). Some of the interesting annotations:
  • @XmlRootElement(name = "FruitBasket"): Defines the root name of your XML, used as a class-level annotation
  • @XmlElement(name = "Fruit"): Sets the name of a variable to be serialized as XML element. Can contain a namespace="someuri" definition
  • @XmlAttribute(name="id"): Defines a variable to be saved as XML attribute
  • @XmlTransient: Ignore this variable (fields declared with the Java keyword transient are ignored as well)
  • @XmlAccessorType(XmlAccessType): Defines what parts of the class get serialized. The default is PUBLIC_MEMBER, which covers all public fields and public properties (getter/setter pairs). Picking up both fields and properties can lead to duplication, since String bla and String getBla() would result in the same XML element. The options are
    • XmlAccessType.FIELD: Only fields, but no properties (getSomething()) are serialized
    • XmlAccessType.NONE: Only properties and fields that have explicit annotations get serialized
    • XmlAccessType.PROPERTY: Properties are serialized. I got errors if I had fields with the same name as properties, so I'm not sure how useful that is
    • XmlAccessType.PUBLIC_MEMBER: Only fields and properties that are publicly visible get serialized, so no inner workings of the class get exposed
    While it is a little more work, I found that NONE is the safe option: you retain full control of the serialization process. I also got fond of @XmlTransient, since it documents that you deliberately excluded that variable
  • @XmlSchema: Defines the schema and namespace for a package to be used in a class. Goes into the package-info.java file (an unsung hero of the Java package system).
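
The duplication mentioned under @XmlAccessorType comes from the fact that a field and a getter can map to the same name. The following sketch uses plain reflection to show that name clash. It only mimics the naming convention (strip "get", lower-case the first letter) and is not what JAXB does internally; `AccessorNames` and `Sample` are hypothetical names for this illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only
class AccessorNames {

    public static void main(String[] args) {
        System.out.println("fields:     " + fieldNames(Sample.class));
        System.out.println("properties: " + propertyNames(Sample.class));
        System.out.println("collisions: " + collisions(Sample.class));
    }

    /** Names of all instance fields declared by the class. */
    static List<String> fieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    /** Property names derived from public getters: getBla() yields "bla". */
    static List<String> propertyNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            String name = method.getName();
            boolean isGetter = Modifier.isPublic(method.getModifiers())
                    && name.startsWith("get") && name.length() > 3
                    && method.getParameterCount() == 0
                    && method.getReturnType() != void.class;
            if (isGetter) {
                names.add(Character.toLowerCase(name.charAt(3)) + name.substring(4));
            }
        }
        return names;
    }

    /** Names that would be produced twice if fields AND properties were serialized. */
    static List<String> collisions(Class<?> type) {
        List<String> both = new ArrayList<>(fieldNames(type));
        both.retainAll(propertyNames(type));
        return both;
    }
}

class Sample {
    String bla = "from the field";
    String internal = "has no getter";

    public String getBla() {
        return bla;
    }
}
```

Choosing FIELD, PROPERTY or NONE is essentially a way of telling the serializer which of the two lists to read, so that a name like bla is only picked up once.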
The really big difference between JSON and XML are the namespaces, which have not arrived in JSON land yet. While they are difficult to grasp, they are extremely powerful to qualify what you are looking at. Think about "bank". Is it Financial_Institution:bank, Edge_between_river_and_land:bank or Aircraft_turning_instruction:bank? Qualification is important, especially when talking between independent systems (a browser usually gets its JSON from one place, so ambiguity can be sorted out by the sender).
There is lots of information around, some of it a decade old, some very current. The Oracle tutorials make a good read, as does the blog of Blaise Doughan. There is also an excellent tutorial by Lars Vogel. Finally, have a look at the official JavaDocs.
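
The "bank" ambiguity can be shown with a few lines of JDK-only code. In this sketch the two namespace URIs and the class `BankNamespaces` are invented for the example; the point is merely that a namespace-aware parser can tell the two kinds of bank apart even though the local element name is identical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical example class for illustration only
class BankNamespaces {

    // Made-up namespace URIs; any unique URI would do
    static final String FINANCE = "urn:example:finance";
    static final String GEOGRAPHY = "urn:example:geography";

    static final String XML =
            "<places xmlns:fin=\"" + FINANCE + "\" xmlns:geo=\"" + GEOGRAPHY + "\">"
          + "<fin:bank>First National</fin:bank>"
          + "<geo:bank>South shore</geo:bank>"
          + "<fin:bank>Credit Union</fin:bank>"
          + "</places>";

    public static void main(String[] args) {
        System.out.println("financial banks: " + countBanks(XML, FINANCE));
        System.out.println("river banks:     " + countBanks(XML, GEOGRAPHY));
    }

    /** Counts the bank elements that belong to the given namespace. */
    static int countBanks(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default, required for namespace lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(namespaceUri, "bank").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A JSON document with three "bank" keys would leave that distinction to naming conventions agreed between sender and receiver.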
My take: JAXB allows fine-tuning and complexity, covering any thinkable use case. I would keep that in the back of my mind, locked away, and keep it simple. The most likely case where you might need these fancy features is when you are given an existing schema you can't tweak for simplicity.

As usual: YMMV

Posted on 12 July 2014 | Comments (1) | categories: Software

Comments

  1. posted by Ursus on Saturday 12 July 2014 AD:
    Hi Stephan

    there seems to be a little problem with your blog entry (do we even call them that anymore?) - After "The trick here is to use JAXB and Google GSON to make your life easier. Lets have a closer look:" there is nothing :o(

    Thanks for the interesting entry... now if we could just have the juicy bit

    Greetings from Austria
    Ursus