Sean O'Donnells Weblog
I've been working on an application that needs to convert old-fashioned HTML to XHTML. Almost all of the libraries I have come across that do a good job are just wrappers around HTMLTidy, and very few of them can be installed with a simple apt-get. As a result, I did not feel particularly good about requiring them. I hate to think of someone tearing their hair out trying to get my stuff installed before they can even use it.
Check out the following example:
>>> from twisted.web import microdom
>>> x = microdom.parseString("<div>hello<br>world</DIV>", beExtremelyLenient=1)
>>> x.toxml()
'<?xml version="1.0"?><div>hello<br />world</div>'