Usability - Productivity - Business - The web - Singapore & Twins

Oracle broke the Java forums #fail (and how to use SAX to create XML Documents in Java)

Oracle seems to be overzealous in removing SUN from the face of the IT landscape. SUN used to have a very comprehensive Java forum with tons of Java-related knowledge. Now all links to the SUN forum, regardless of how deep, are redirected to the Oracle forum homepage. Yes, all of them. So every cross reference linking to forum entries broke. I once contributed a code snippet showing how to create XML documents using SAX (since most people think SAX is a read-only API, which is not the case) and that link now points to the homepage. Must be some vendetta against a certain ex-SUN employee who stated in 1998 "Any URL that has ever been exposed to the Internet should live forever" and even has the W3C on his side. On the other hand: "Never ascribe to malice that which is adequately explained by incompetence". I dug around in the Oracle forum and managed to locate my post, but obviously it was too hard for the database champion to maintain authorship, so the entry is now attributed to: SunForumsGuest. No wonder a lot of people are, let's say, "not fully happy" with Oracle.

Anyway, lesson learned: contributions I make somewhere need mirroring here, so here we go:

A common misconception about SAX: "SAX is a parser, not a generator." As a matter of fact, SAX does just fine generating your XML document, especially when it gets rather large. I've seen countless implementations of String-based construction of XML that all break at some point, since there is one extra " in an attribute, or a new line, or a double-byte character etc. Using SAX, all of these issues are taken care of. Your responsibility is to get the tag nesting right; the rest is handled by SAX, including processing instructions, text content and attributes. Here's a piece of sample code (nota bene: it has a stylesheet instruction, so when you open the resulting file in a browser you get an error since the sheet won't be there):
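// Imports needed by this snippet
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// out comes from outside and is an OutputStream.
// The writer is created with an explicit UTF-8 charset so the bytes match the declared encoding below
PrintWriter pw = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
StreamResult streamResult = new StreamResult(pw);
// Factory pattern at work
SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
// SAX2.0 ContentHandler that provides the append point and access to serializing options
TransformerHandler hd = tf.newTransformerHandler();
Transformer serializer = hd.getTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8"); // Suitable for all languages
serializer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "myschema.xsd"); // Replace this with something useful
// Only one system identifier can be set: this second call replaces the value above, so keep whichever you need
serializer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "http://schema.notessensei.com/myschema/1.0");
serializer.setOutputProperty(OutputKeys.METHOD, "xml");
serializer.setOutputProperty(OutputKeys.INDENT, "yes"); // So it looks pretty in VI
hd.setResult(streamResult);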
// This creates the empty document
hd.startDocument();

// Add a processing instruction
hd.processingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"mystyle.xsl\""); // That file needs to exist, or comment out this line

// This creates attributes that go inside the element, all encoding is taken care of
AttributesImpl atts = new AttributesImpl();
atts.addAttribute("", "", "someattribute", "CDATA", "test");
atts.addAttribute("", "", "moreattributes", "CDATA", "test2");

// This creates the element with the previously defined attributes
hd.startElement("", "", "MyTag", atts);

// Now we write out some text, but it could be another tag too
// Make sure there is only ONE root tag
String curTitle = "Something inside a tag";
hd.characters(curTitle.toCharArray(), 0, curTitle.length());

// End the top element
hd.endElement("", "", "MyTag");

// Close the document
hd.endDocument();
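To see the "all encoding is taken care of" part in action, here is a minimal sketch that feeds awkward characters (quotes, ampersand, angle bracket) into an attribute and into text content. The class name `SaxEscapingDemo`, the method `render` and the element name `note` are made up for this illustration, and it assumes the default serializer that ships with the JDK; it writes to a StringWriter so you can inspect the result:

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class, names are illustrative only
final class SaxEscapingDemo {

    // Serializes <note title="...">text</note> and returns the XML as a String
    static String render(String title, String text) {
        try {
            StringWriter out = new StringWriter();
            SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler hd = tf.newTransformerHandler();
            // Leave out the XML declaration to keep the demo output short
            hd.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            hd.setResult(new StreamResult(out));

            // The raw, unescaped value goes in; the serializer escapes it on output
            AttributesImpl atts = new AttributesImpl();
            atts.addAttribute("", "", "title", "CDATA", title);

            hd.startDocument();
            hd.startElement("", "", "note", atts);
            hd.characters(text.toCharArray(), 0, text.length());
            hd.endElement("", "", "note");
            hd.endDocument();
            return out.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            // Wrapped only to keep the demo signature simple
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // Input that would break naive String concatenation
        System.out.println(render("a \"quoted\" title", "Tom & Jerry <3"));
    }
}
```

The point of the sketch: the caller never escapes anything by hand, yet the output stays well-formed.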
The bonus tip from the original discussion: to keep track of your tag nesting, use a Stack. Whenever you open an element, you push the matching closing tag onto a stack. When you are done, you pop the stack until it is empty, so your nesting will at least be XML compliant.
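The stack tip could look roughly like the sketch below. It is not the code from the original forum discussion; `NestingTracker` and its methods `open`, `close` and `closeAll` are hypothetical names, and it uses an ArrayDeque as the stack rather than java.util.Stack. It wraps any SAX ContentHandler, including the TransformerHandler used above:

```java
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper, names are illustrative only
final class NestingTracker {
    private final ContentHandler handler;
    // Holds the names of all currently open elements, innermost on top
    private final Deque<String> openTags = new ArrayDeque<>();

    NestingTracker(ContentHandler handler) {
        this.handler = handler;
    }

    // Opens an element without attributes and remembers it for closing
    void open(String tag) throws SAXException {
        handler.startElement("", "", tag, new AttributesImpl());
        openTags.push(tag);
    }

    // Closes the innermost open element; throws NoSuchElementException if none is open
    void close() throws SAXException {
        handler.endElement("", "", openTags.pop());
    }

    // Pops the stack empty, closing everything in the right order
    void closeAll() throws SAXException {
        while (!openTags.isEmpty()) {
            close();
        }
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler hd = tf.newTransformerHandler();
        hd.setResult(new StreamResult(out));

        NestingTracker tracker = new NestingTracker(hd);
        hd.startDocument();
        tracker.open("library");
        tracker.open("shelf");
        tracker.open("book");
        tracker.closeAll(); // closes book, shelf, library in that order
        hd.endDocument();
        System.out.println(out);
    }
}
```

Calling `closeAll()` just before `endDocument()` is a cheap safety net: even if some code path forgot a closing tag, the document still ends up properly nested.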

Posted by on 10 November 2010 | Comments (0) | categories: Software

