WordPress XML Export Converter: New Version 1.03

While preparing this blog’s Table of Contents page I noticed that my WordPress tool wpxslgui would not process WP’s Export XML as it did before.

I noticed that the export format had changed from

xmlns:wp="http://wordpress.org/export/1.1/"

to

xmlns:wp="http://wordpress.org/export/1.2/"

Well, if this introduction sounds familiar, you are right.

Unfortunately, this time the change of the wp namespace meant that the wp:category has been dropped as a defined element.

For the Table of Contents xslt, this change means that the categories have to be extracted from the posts’ (<item> elements) <category> child elements. (The other two Xslt file used for generating a Single HTML and a Word HTML file are not affected by this change)

Since the categories occur several times throughout the blog’s XML file, they need to be collected and sorted before outputting them in the result file using the so-called Muenchian method.

To collect the categories, the <xsl:key> element is used:

<xsl:key name="categories" match="/rss/channel/item/category[@domain='category']" use="@nicename" />

The categories are selected and sorted using <xsl:applytemplates>

<xsl:apply-templates 
    select="item[wp:post_type = 'post' and wp:status = 'publish']/category" 
    mode="foo" >
  <xsl:sort order="ascending" select="text()"/>
</xsl:apply-templates>

Only the first of each set of category duplicates is output:

<xsl:template 
    match="item[wp:post_type = 'post' and wp:status = 'publish']/category
       [ generate-id() = generate-id(key('categories', @nicename)[1]) ]" 
    mode="foo">
  <xsl:value-of select="text()"/>
  <xsl:text>
</xsl:text>
</xsl:template>

To avoid outputting the categories’ text() property repeatedly, we need to prevent evaluating the inner text:

  <xsl:template match="text()" mode="foo"></xsl:template>

The main features of wpxslgui remained the same:

  • Convert WordPress XML to HTML Table of Contents with links to the original blog
  • Convert WordPress XML to a single HTML file allowing filter by category (JavaScript)
  • Convert WordPress XML to Word HTML document (can be saved as .doc or .docx in Word)

After downloading the latest version of wpxslgui, export your WordPress blog to XML (select “All content”), and convert the file into any of the supported output formats.

1 thought on “WordPress XML Export Converter: New Version 1.03

  1. Pingback: On Converting Data | devioblog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.