WordPress WXR and Valid XML

WXR (WordPress eXtended RSS), the WordPress export format, is one of the best features WordPress has. But there are some issues with it. At the moment (using WordPress 2.1.2) the export is not really valid XML. If you’re picky, it’s not even well-formed …

A WXR document starts directly with the <rss> tag, where it should start with an XML declaration (<?xml version="1.0"?>). But that’s not the only problem. The famous &raquo; entity (») is often used in WordPress posts, but it’s not declared anywhere in the export, so most XML parsers fail on it. And don’t get me started on comments: their content is not escaped, so any links in a comment can trip up your parser.
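To make the first two problems concrete, here is a minimal Python sketch of the kind of cleanup I mean. It assumes the export is already loaded as a string; the function names (replace_html_entities, add_xml_declaration, clean_wxr) are made up for illustration and are not part of WordPress. It rewrites HTML-only named entities such as &raquo; into numeric character references, which need no declaration, and prepends an XML declaration if one is missing. It does not attempt to fix unescaped comment content.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML predefines; these must be left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(text):
    """Turn HTML-only named entities (e.g. &raquo;) into numeric references."""
    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: leave as-is
        return "&#%d;" % name2codepoint[name]
    return ENTITY_RE.sub(substitute, text)


def add_xml_declaration(text):
    """Prepend an XML declaration unless the document already has one."""
    stripped = text.lstrip()
    if stripped.startswith("<?xml"):
        return stripped
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + stripped


def clean_wxr(text):
    """Hypothetical helper combining both fixes."""
    return add_xml_declaration(replace_html_entities(text))


# Tiny stand-in for a real export file.
sample = "<rss><channel><title>Blog &raquo; Archive &amp; more</title></channel></rss>"
cleaned = clean_wxr(sample)
root = ET.fromstring(cleaned.encode("utf-8"))
print(root.find("channel/title").text)
```

Numeric references are used here instead of declaring the entities in a DOCTYPE because they work with any XML parser without extra configuration.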

That said, it’s a really cool thing. I was able to consolidate two blogs into a new one. The next step was to export it all to static HTML. I had to edit the export file manually for the reasons mentioned above. Once it produces clean XML, WXR will be a powerful tool for moving and reusing blog content.
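For the static HTML step, something along these lines would do as a starting point. This is only a sketch, not the script I actually used: it assumes the export has already been cleaned up so that it parses, that each post is an <item> with a <title> and a <content:encoded> body (the RSS content module), and the slugify and render_pages names as well as the page template are invented for the example.

```python
import html
import re
import xml.etree.ElementTree as ET

# Namespace of the RSS content module, which carries the full post body.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}


def slugify(title):
    """Hypothetical file-naming rule: lowercase, other characters become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"


def render_pages(wxr_bytes):
    """Return {filename: html_page} for every <item> in the export."""
    root = ET.fromstring(wxr_bytes)
    pages = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="Untitled")
        body = item.findtext("content:encoded", default="", namespaces=NS)
        safe_title = html.escape(title)
        # The post body is already HTML, so it is inserted unescaped.
        pages[slugify(title) + ".html"] = (
            '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
            "<title>%s</title></head>\n<body><h1>%s</h1>\n%s\n</body></html>\n"
            % (safe_title, safe_title, body)
        )
    return pages


# Tiny stand-in for a cleaned export file.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel><item><title>Hello World</title>
<content:encoded><![CDATA[<p>First post.</p>]]></content:encoded>
</item></channel></rss>"""

pages = render_pages(sample)
for name in pages:
    print(name)
```

Writing each entry of pages to disk is then a plain loop over the dictionary. Note that two posts with the same title would collide on the same filename in this sketch.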



