The WXR export format is one of the best features WordPress has, but there are some issues with it. At the moment (using WordPress 2.1.2) the export is not really valid XML. If you’re picky, it’s not even well-formed …
A WXR document starts directly with the <rss> tag, where it should start with an XML declaration (<?xml version="1.0"?>). But that’s not the only problem. The famous &raquo; entity (») is often used in WordPress posts, but it’s not declared anywhere, so most XML parsers have a hard time parsing the file. And don’t get me started on comments: they are not escaped, so any links in them will break your parsing.
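To make the first two problems concrete, here is a minimal sketch in Python of how an export could be cleaned up before parsing. It is not what I actually did (I edited the file by hand), and the helper names `replace_html_entities` and `add_declaration` as well as the tiny sample document are made up for illustration. It assumes the only damage is a missing declaration and undeclared HTML entities; unescaped markup in comments is not handled here.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The only named entities XML defines without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def replace_html_entities(text):
    """Turn HTML-only entities such as &raquo; into numeric references."""
    def repl(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


def add_declaration(text):
    """Prepend an XML declaration if the document lacks one."""
    stripped = text.lstrip()
    if stripped.startswith("<?xml"):
        return stripped
    return '<?xml version="1.0"?>\n' + stripped


# Hypothetical, heavily shortened stand-in for a real export file.
sample = "<rss><channel><title>Blog &raquo; Archive</title></channel></rss>"

try:
    ET.fromstring(sample)
    raw_parses = True
except ET.ParseError as err:
    raw_parses = False
    print("raw export fails:", err)

fixed = add_declaration(replace_html_entities(sample))
root = ET.fromstring(fixed)
print(root.find("channel/title").text)
```

Numeric references are used because they need no declaration at all, so the cleaned file should parse with any XML parser without a DTD.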
That said, it’s a really cool thing. I was able to consolidate two blogs into a new one. The next step was to export it all to static HTML. I had to edit the export file manually for the reasons mentioned above. WordPress WXR will be a powerful tool for blogs, enhancing blogging everywhere.
Tags: valid, wordpress, wxr, xml
November 11th, 2007 at 4:21 pm
[…] So, there’s my really old blog about Australia (in Dutch), my old blog at jroller, and my self-hosted blog, and now there’s this one. I’ve tried to see if it was feasible to merge all that old stuff into this new one, but it’s really too much of a hassle (wordpress doesn’t really do XML very well). […]