Alexander The Great

December 31, 2007

XML Purification

Filed under: Science — alexanderthegreatest @ 7:19 pm

This will be of limited interest to most readers, but it’s still worth covering in purely academic terms. Brian Keever has written a post on XML Purification: turning non-compliant XML into well-formed XML.

Extensible Markup Language is the meta language of the computing gods. Oracle and SQL Server have proprietary binary formats that are clearly superior in raw speed and, in some instances, file size, but they’re a nightmare for compatibility. It’s no surprise that different technologies have different pros and cons, and indeed different design goals.

The law of the land, as laid down by the XML specification and enforced by Microsoft’s parsers, is that imperfect XML is unreadable. Not even a single byte of data may be taken from an XML document with even the slightest flaw. The reasoning makes enough sense: “software cannot be responsible for guessing at a developer’s intentions.” Part of the hype of XML is universal compatibility, though. In fact, XML is built on Unicode (every parser must accept both UTF-8 and UTF-16) to allow for internationalization. Yet something as innocent as an accented vowel saved in an encoding other than the one the document declares, or a stray control character, can destroy the ability to read the document.
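
To see how unforgiving that rule is, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`, which is not one of the Microsoft parsers discussed above but applies the same all-or-nothing rule. The helper name `try_parse` and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(text):
    """Return the root tag if text is well-formed XML, otherwise None."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        # The parser gives up entirely; no partial data is handed back.
        return None


# Hypothetical sample documents.
good = "<order><item>widget</item></order>"
bad_markup = "<order><item>widget</order>"              # <item> is never closed
bad_char = "<order><item>wid\x01get</item></order>"     # control character U+0001

for label, doc in [("good", good), ("bad markup", bad_markup), ("bad char", bad_char)]:
    print(label, "->", try_parse(doc))
```

Only the first document yields anything. The other two are rejected outright, even though a human could read the data in them without trouble.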

Thus, Mr Keever has delivered unto us a way to fix such XML that has valid markup but illegal characters.
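
I won't reproduce Mr Keever's code here. As a rough sketch of the general idea only, and not his method, the Python below strips every character that falls outside the `Char` range allowed by the XML 1.0 specification before handing the text to a parser. The names `purify` and `_ILLEGAL_XML_CHARS` are my own, and the approach assumes the markup itself is already valid.

```python
import re
import xml.etree.ElementTree as ET

# Anything NOT in the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def purify(text):
    """Drop characters that XML 1.0 forbids; leave everything else alone."""
    return _ILLEGAL_XML_CHARS.sub("", text)


# Hypothetical input: valid markup, a legal accented vowel, two illegal control characters.
dirty = "<note>caf\u00e9 \x01\x0bmenu</note>"
clean = purify(dirty)

root = ET.fromstring(clean)   # parses now that the control characters are gone
print(root.text)
```

Note that the accented vowel survives, because it is a perfectly legal character once the text has been decoded correctly. Only the control characters are removed. Silently deleting characters does change the data, so whether that is acceptable depends on what the document is for.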



  1. Sounds like ethnic cleansing.

    Comment by vaticanism — December 31, 2007 @ 9:12 pm | Reply

  2. Huh? Are you smoking crack? How do you figure genocide and XML document processing are related in any way?

    Comment by alexanderthegreatest — December 31, 2007 @ 11:11 pm | Reply
