This plugin creates a sitemap for your WordPress site (http://example.com/sitemap.xml). It was originally written for a multi-site setup, but it works fine on a single-site install as well. The plugin has no user interface to speak of; just drop it in place.
You may also download this code via its GitHub repository:
- Version: 1.0
First, create a file named sitemap-generator.php in your mu-plugins directory (or your plugins directory if you're using a single-install WordPress setup) and paste the code below into it.
<?php
/*
Plugin Name: Sitemap Generator
Plugin URI: http://technology.mattrude.com/2011/10/07/wordpress-sitemap-generator-plugin/
Description: Automatically generates a standard XML sitemap (http://example.com/sitemap.xml) that supports the protocol including Google, Yahoo, MSN, Ask.com, and others. No files are stored on your disk; the sitemap.xml file is generated as needed, like your feeds.
Version: 1.0
Author: Matt Rude
Author URI: http://mattrude.com/
*/

// Flush the rewrite rules on every request so the sitemap rule is always registered.
// Simple, but flushing on each page load is relatively expensive.
function sitemap_flush_rules() {
    global $wp_rewrite;
    $wp_rewrite->flush_rules();
}
add_action('init', 'sitemap_flush_rules');

// Route any request ending in sitemap.xml to the custom "sitemap" feed.
function xml_feed_rewrite($wp_rewrite) {
    $feed_rules = array(
        '.*sitemap.xml$' => 'index.php?feed=sitemap'
    );
    $wp_rewrite->rules = $feed_rules + $wp_rewrite->rules;
}
add_filter('generate_rewrite_rules', 'xml_feed_rewrite');

// Render the "sitemap" feed with the template from the templates directory.
function do_feed_sitemap() {
    $template_dir = dirname(__FILE__) . '/templates';
    load_template( $template_dir . '/feed-sitemap.php' );
}
add_action('do_feed_sitemap', 'do_feed_sitemap', 10, 1);
?>
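To get a feel for what the rewrite rule will and will not catch, the pattern can be tried outside WordPress. This is only a sketch: `sitemap_rule_matches()` is a hypothetical helper, and it assumes WordPress tests each rule roughly as `preg_match('#^' . $rule . '#', $path)` against the request path without its leading slash.

```php
<?php
// Hypothetical helper, not part of the plugin: approximates how a rewrite
// rule pattern is tested against a request path (assumed '#^...#' wrapping).
function sitemap_rule_matches($pattern, $request_path) {
    return preg_match('#^' . $pattern . '#', $request_path) === 1;
}

$rule    = '.*sitemap.xml$';   // the pattern used in the plugin
$escaped = '.*sitemap\.xml$';  // stricter variant with the dot escaped

var_dump(sitemap_rule_matches($rule, 'sitemap.xml'));       // bool(true)
var_dump(sitemap_rule_matches($rule, 'blog/sitemap.xml'));  // bool(true)
var_dump(sitemap_rule_matches($rule, 'sitemap.xml/feed'));  // bool(false)
var_dump(sitemap_rule_matches($rule, 'sitemap-xml'));       // bool(true), "." matches any character
var_dump(sitemap_rule_matches($escaped, 'sitemap-xml'));    // bool(false)
```

The leading `.*` presumably exists so the same rule also works for sub-directory sites in a multi-site setup; escaping the dot would only tighten the match.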
Next, create a new directory named templates next to sitemap-generator.php and paste the code below into a file named feed-sitemap.php inside it.
<?php
/**
 * XML Sitemap Feed Template for displaying XML Sitemap Posts feed.
 */

//header('Content-Type: text/xml; charset=' . get_option('blog_charset'), true);
echo '<?xml version="1.0" encoding="'.get_option('blog_charset').'"?'.'>';
?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">

<!-- Main Site -->
<url>
    <loc><?php bloginfo_rss('url') ?></loc>
    <lastmod><?php echo mysql2date('Y-m-d\TH:i:s\Z', get_lastpostmodified('GMT'), false); ?></lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
</url>

<!-- Site Pages -->
<?php
// Up to 100 published pages, newest first.
$args = array(
    'post_type'   => 'page',
    'numberposts' => 100,
    'post_status' => 'publish',
    'orderby'     => 'date',
    'order'       => 'DESC'
);
$post_ids = get_posts($args);
if ($post_ids) {
    foreach ($post_ids as $post) { ?>
<url>
    <loc><?php the_permalink_rss() ?></loc>
    <lastmod><?php echo mysql2date('Y-m-d\TH:i:s\Z', get_post_time('Y-m-d H:i:s', true), false); ?></lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
</url>
<?php }
} ?>

<!-- Site Posts -->
<?php
// Up to 100 published posts, newest first.
$args = array(
    'post_type'   => 'post',
    'numberposts' => 100,
    'post_status' => 'publish',
    'orderby'     => 'date',
    'order'       => 'DESC'
);
$post_ids = get_posts($args);
if ($post_ids) {
    foreach ($post_ids as $post) { ?>
<url>
    <loc><?php the_permalink_rss() ?></loc>
    <lastmod><?php echo mysql2date('Y-m-d\TH:i:s\Z', get_post_time('Y-m-d H:i:s', true), false); ?></lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
<?php
        // Image attachments belonging to this post.
        $args2 = array(
            'post_type'      => 'attachment',
            'numberposts'    => 200,
            'post_parent'    => $post->ID,
            'post_mime_type' => 'image',
            'orderby'        => 'date',
            'order'          => 'DESC'
        );
        $images = get_posts($args2);
        if ($images) {
            foreach ($images as $post) { ?>
    <image:image>
        <image:loc><?php echo wp_get_attachment_url( $post->ID ); ?></image:loc>
        <?php if ( !empty($post->post_excerpt) ) echo '<image:caption>' . esc_html($post->post_excerpt) . '</image:caption>'; ?>
        <image:title><?php echo esc_html($post->post_title) ?></image:title>
    </image:image>
<?php       }
        } ?>
</url>
<?php }
} ?>
</urlset>
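The shape of each entry in the template can also be seen in isolation. The following is a minimal sketch, not part of the plugin: `sitemap_url_entry()` is a hypothetical helper that builds one `<url>` element from plain strings and XML-escapes each value, which matters for permalinks that contain `&`.

```php
<?php
// Hypothetical helper, not part of the plugin: builds one <url> entry
// and XML-escapes every value before it is written out.
function sitemap_url_entry($loc, $lastmod, $changefreq, $priority) {
    $esc = function ($value) {
        return htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };
    return "<url>\n"
        . '  <loc>' . $esc($loc) . "</loc>\n"
        . '  <lastmod>' . $esc($lastmod) . "</lastmod>\n"
        . '  <changefreq>' . $esc($changefreq) . "</changefreq>\n"
        . '  <priority>' . $esc($priority) . "</priority>\n"
        . "</url>\n";
}

// Example values are made up for illustration.
echo sitemap_url_entry('http://example.com/?p=1&preview=true', '2011-10-07T00:00:00Z', 'monthly', '0.5');
```

In the example call the `&` in the query string comes out as `&amp;` inside `<loc>`, which keeps the sitemap well-formed.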
A nice and simple approach, but 3 questions:
1. Why is this there: remove_filter('pre_get_posts','category_excluder_exclude_categories');
2. Why do you use the_permalink_rss()? It does not seem to be documented; why not get_permalink()?
3. Is it possible to avoid the 301 redirect to site.com/sitemap.xml/ – note the trailing slash…
And a small hint:
Maybe it should be / with a trailing slash.
Thanks for looking over the code.
1. This code was taken from a live site, and that filter was left in by mistake. It has since been removed.
2. The sitemap.xml is basically an RSS feed, just in a sitemap format. I really don’t remember what the difference is, but I believe it allows you to display the URL without the post ID.
3. As far as I know, no, it’s not possible to avoid this redirect. Well, maybe with some clever Apache redirecting, but I have no clue how you would do it.
Thanks
-Matt
Thanks for the quick reply.
2. In the meantime I found ‘the_permalink_rss()’ in ‘wp-includes/feed.php’; it uses get_permalink() internally.
3. Add this to file ‘sitemap-generator.php’:
4. The small hint was about the bloginfo_rss('url') line; the tags were stripped from the comment…
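The snippet for point 3 is not reproduced here. Purely as an illustration of one way the trailing-slash redirect might be avoided, and not necessarily what was originally posted, the sketch below hooks WordPress's `redirect_canonical` filter and cancels the redirect for sitemap requests. The function name `sitemap_skip_canonical_redirect()` is hypothetical, and the `function_exists()` guard is only there so the file can also be run outside WordPress.

```php
<?php
// Hypothetical sketch: return false for sitemap.xml requests so that
// WordPress skips the canonical (trailing slash) redirect for them.
function sitemap_skip_canonical_redirect($redirect_url, $requested_url) {
    $path = parse_url($requested_url, PHP_URL_PATH);
    if (is_string($path) && preg_match('#(^|/)sitemap\.xml$#', $path) === 1) {
        return false; // a false return value cancels the redirect
    }
    return $redirect_url; // leave every other redirect untouched
}

// Register the filter only when WordPress is loaded.
if (function_exists('add_filter')) {
    add_filter('redirect_canonical', 'sitemap_skip_canonical_redirect', 10, 2);
}
```

Matching on the requested path rather than on the feed query variable keeps the function testable on its own; inside WordPress, checking `get_query_var('feed')` would be an alternative.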
Another small suggestion: it may be better to use get_post_modified_time() instead of get_post_time().
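A rough sketch of that suggestion, written so it does not depend on WordPress being loaded: the modification time WordPress stores in a post's `post_modified_gmt` field is a MySQL-style GMT string, which can be converted to the W3C format used in `<lastmod>`. The helper name `sitemap_w3c_date()` is hypothetical.

```php
<?php
// Hypothetical helper: converts a MySQL-style GMT datetime string
// ("Y-m-d H:i:s") into the W3C datetime format used by <lastmod>.
function sitemap_w3c_date($mysql_gmt) {
    $utc  = new DateTimeZone('UTC');
    $date = DateTime::createFromFormat('Y-m-d H:i:s', $mysql_gmt, $utc);
    if ($date === false) {
        return ''; // unparseable input: emit nothing rather than a bad date
    }
    return $date->format('Y-m-d\TH:i:s\Z');
}

echo sitemap_w3c_date('2011-10-07 14:30:00'); // 2011-10-07T14:30:00Z
```

Inside the template this could be called as `sitemap_w3c_date($post->post_modified_gmt)`, so that `<lastmod>` reflects the last edit rather than the publication date.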
While researching an HTTP 404 error on the sitemap XML feed, I found an interesting cause that I would like to share:
WordPress adds a 404 header to all feeds if no posts exist in a site.
More details and a hotfix here:
http://wordpress.org/support/topic/sitemap-xml-feed-is-shown-but-404-header-added-by-wordpress-if-site-has-no-posts