During the development of TYPO3 9LTS, the SEO initiative streamlined and consolidated all functionality related to Search Engine Optimization (SEO) in a new system extension, seo. Under the competent lead of Richard Haeser (Twitter), new features were introduced to cover the most important SEO tasks. These features include a new page title API, a metatag API, support for social media metatags in the page properties, and XML sitemap generation. Most of the (bigger) TYPO3 projects out there have had those things in place for many years, but only with the help of 3rd party extensions. That is why there is such a wide variety of SEO extensions. To name only a few: seo_basics, metaseo, cs_seo and yoast_seo. There are also extensions that only provide an XML sitemap, like just_sitemap, inm_googlesitemap, dd_googlesitemap, sitemap_generator and many more.

Of course you can still use an extension for all the SEO stuff, but with 9LTS the TYPO3 core finally comes with a tool set sufficient for most use cases. And since Richard already wrote a very in-depth article about the page title API, we will have a look at the sitemap in this article. By the way: the sitemap can be seen in action on this very blog at usetypo3.com/sitemap.xml. Since the implementation is very straightforward, this will be a rather short post. Let's dive into it anyway!
 

Concept and Implementation

The approach chosen by the SEO initiative for implementing a sitemap is the recommended way of handling large sitemaps: splitting them into smaller chunks and providing an index sitemap with links to the smaller sitemaps (see the recommendation "Split up your large sitemaps" by Google). Of course TYPO3's new default sitemap can be configured and extended, but the general idea is that it provides one sitemap per record type. For example, you could have one sitemap for pages, another sitemap for all detail pages of products and a third one for news records, all of which are listed in your index sitemap. This approach keeps the sitemaps clear and simple, and makes them easier to "debug".

The implementation in TYPO3 (Patch, Feature RST) relies on a configuration in TypoScript. The sitemap is rendered through a dedicated page type (1533906435). Let's have a look at the page type definition:

seo_sitemap = PAGE
seo_sitemap {
  typeNum = 1533906435

  config {
    cache_period = 900
    disableAllHeaderCode = 1
    admPanel = 0
    removeDefaultJS = 1
    removeDefaultCss = 1
    removePageCss = 1
    additionalHeaders.10 {
      header = Content-Type:application/xml;charset=utf-8
    }
  }

  10 = USER
  10.userFunc = TYPO3\CMS\Seo\XmlSitemap\XmlSitemapRenderer->render
}
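
Since seo_sitemap is an ordinary PAGE object, single settings of this definition can be adjusted from your own TypoScript, as long as it is loaded after the TypoScript of EXT:seo. The following is only a minimal sketch; the value of one hour is an arbitrary assumption, not a recommendation:

```typoscript
# Assumption: this is included after the TypoScript provided by EXT:seo.
# Cache the rendered sitemap for one hour instead of the default 900 seconds.
seo_sitemap.config.cache_period = 3600
```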

The XmlSitemapRenderer collects all configured sitemaps and renders either the index (a list of all sitemaps) or a single, complete sitemap. So next we need to look at the configuration of the sitemaps. The actual sitemap configuration is processed in DataProviders and then rendered with Fluid.

Let's have a look at the default configuration for a sitemap of pages first, because this is something that every TYPO3 installation needs.


Sitemaps for Pages

If you install EXT:seo (depending on how you set up TYPO3 you might have to add the extension first, because it is not part of the typo3_minimal composer meta package) and include the provided TypoScript, the configuration of your sitemap looks like this:

plugin.tx_seo {
  config {
    xmlSitemap {
      sitemaps {
        pages {
          provider = TYPO3\CMS\Seo\XmlSitemap\PagesXmlSitemapDataProvider
          config {
            excludedDoktypes = 3, 4, 6, 7, 199, 254, 255
            additionalWhere = no_index = 0
          }
        }
      }
    }
  }
}

We see that the provider is specified (TYPO3\CMS\Seo\XmlSitemap\PagesXmlSitemapDataProvider) along with some restrictions on what to render in this sitemap. You should immediately get the idea that this is very flexible and customizable. You could use your own provider, you could use other restrictions, and you can additionally override the paths for the Fluid templates if you want to change the rendering. A page only ends up in the sitemap if it has not been excluded in its page properties. Every page is included by default, as can be seen in this screenshot:

If you need to further restrict the pages included in a single sitemap, add to the config or use your own DataProvider. Also note that the key of the sitemap (pages in this case) is just a string. You could name it whatever you want. The configuration supports the following options:

  • excludedDoktypes: pages with these Doktypes are not included in the sitemap. For example a sysfolder has the Doktype 254. Hint: The default Doktypes of TYPO3 are defined as PHP constants in the PageRepository.
  • additionalWhere: This statement will be added to the database query to find the included pages.
  • rootPage: Only pages in the page tree starting at this page are included in the sitemap. This comes in handy if you want to split a huge page tree into several sitemaps.
  • template: The name of the fluid template to render for this sitemap. By default, it is Sitemap.
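
To illustrate how these options play together, here is a minimal sketch that splits a large page tree into two page sitemaps. The keys shopPages and blogPages as well as the page IDs 42 and 87 are hypothetical and need to be replaced with values from your own installation. The excludedDoktypes and additionalWhere values are simply copied from the default configuration shown above:

```typoscript
plugin.tx_seo.config.xmlSitemap.sitemaps {
  # Optional: remove the default sitemap so pages are not listed twice
  pages >

  # Hypothetical key, any string works
  shopPages {
    provider = TYPO3\CMS\Seo\XmlSitemap\PagesXmlSitemapDataProvider
    config {
      # Assumption: page 42 is the top page of the shop section
      rootPage = 42
      excludedDoktypes = 3, 4, 6, 7, 199, 254, 255
      additionalWhere = no_index = 0
    }
  }

  # Hypothetical key for a second section of the page tree
  blogPages {
    provider = TYPO3\CMS\Seo\XmlSitemap\PagesXmlSitemapDataProvider
    config {
      # Assumption: page 87 is the top page of the blog section
      rootPage = 87
      excludedDoktypes = 3, 4, 6, 7, 199, 254, 255
      additionalWhere = no_index = 0
    }
  }
}
```

Keep in mind that pages outside of these two sections would no longer appear in any sitemap once the default pages sitemap is removed, so only drop it if the remaining sitemaps cover everything you want indexed.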

This pretty much covers everything about a sitemap of pages. Let's move on to other records that need sitemaps, e.g. news records with detail pages.


Sitemaps for other Records

Apart from pages, TYPO3 websites often have other types of records that have dedicated detail pages with static URLs. Examples are products, news or events that are created by (extbase) extensions and are not pages records on the database level. The sitemap of EXT:seo can handle those records as well. It ships a RecordsXmlSitemapDataProvider that takes a more advanced configuration than the pages sitemap from the first example.

Let's take an example from an RST that was added after the 9LTS release, because support for recursive record collection multiple levels down the page tree was added as a missing feature in TYPO3 9.5.2 (Patch, Feature RST):

plugin.tx_seo.config {
  xmlSitemap {
    sitemaps {
      news {
        provider = TYPO3\CMS\Seo\XmlSitemap\RecordsXmlSitemapDataProvider
        config {
          table = tx_news_domain_model_news
          sortField = sorting
          lastModifiedField = tstamp
          pid = 26
          recursive = 2
          url {
            pageId = 25
            fieldToParameterMap {
              uid = tx_news_pi1[news]
            }
            additionalGetParameters {
              tx_news_pi1.controller = News
              tx_news_pi1.action = detail
            }
            useCacheHash = 1
          }
        }
      }
    }
  }
}

Let's have a look at everything we can configure for such a sitemap:

  • table: Name of the database table the records are stored in.
  • sortField: Name of the field for the ORDER BY. Defaults to sorting.
  • lastModifiedField: Name of the field for the information about the last modification of the record. Defaults to tstamp.
  • pid: Page ID where the records are stored. Also the starting point for any recursive collection. Can be a comma separated list of IDs.
  • recursive: Number of levels to descend in the page tree, beginning with pid.
  • additionalWhere: This statement will be added to the database query to find the included records.
  • template: The name of the fluid template to render for this sitemap. By default, it is Sitemap.
  • url: The following sub configuration determines the link that will be generated to each record.
    • pageId: The page ID where the links should point to. Usually a page with some kind of detail view plugin.
    • fieldToParameterMap: If fields of the database row need to be added as GET parameter to the URL we can define a mapping here. Usually the uid of a record is passed along.
    • additionalGetParameters: Other GET parameters that should be added to the URL. E.g. the controller and action for an extbase plugin.
    • useCacheHash: Whether the cHash GET parameter should be calculated and added to the URL. Defaults to 0.
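
As a further illustration of these options, the following sketch configures a sitemap for a product table. Everything specific in it is hypothetical: the key products, the table tx_myshop_domain_model_product, the field show_in_sitemap, the plugin namespace tx_myshop_detail and all page IDs are made up for this example and have to be adapted to your own extension:

```typoscript
plugin.tx_seo.config.xmlSitemap.sitemaps {
  # Hypothetical key, any string works
  products {
    provider = TYPO3\CMS\Seo\XmlSitemap\RecordsXmlSitemapDataProvider
    config {
      # Hypothetical table of a custom extbase extension
      table = tx_myshop_domain_model_product
      # sortField and lastModifiedField are omitted on purpose,
      # so the defaults (sorting and tstamp) apply
      # Comma separated list of storage folders (hypothetical IDs)
      pid = 30,31
      recursive = 1
      # Hypothetical field: only list products flagged for the sitemap
      additionalWhere = show_in_sitemap = 1
      url {
        # Assumption: page 35 holds the detail view plugin
        pageId = 35
        fieldToParameterMap {
          uid = tx_myshop_detail[product]
        }
        additionalGetParameters {
          tx_myshop_detail.controller = Product
          tx_myshop_detail.action = show
        }
        useCacheHash = 1
      }
    }
  }
}
```

Relying on the defaults only works if the table actually has sorting and tstamp columns. If it does not, set sortField and lastModifiedField explicitly to fields that exist.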

Now that we know everything about the configuration of the default DataProviders, let's see how we map the sitemap page type to /sitemap.xml of our project.


Map Page Type to /sitemap.xml with a routeEnhancer

So far we can reach our sitemap only via the defined page type 1533906435, i.e. by appending the GET parameter type=1533906435 to a URL. So the last step is to map this page type to /sitemap.xml. If you do not use the site configuration of TYPO3 9 LTS yet, you are most likely still using realurl and know how to do this.

However, with the revolutionary site handling in TYPO3 9, realurl is no longer required and should be avoided! So if you already have your site config at hand, head over to the YAML file and add a routeEnhancer:

routeEnhancers:
  PageTypeSuffix:
    type: PageType
    map:
      sitemap.xml: 1533906435

If you add these lines to your site config you should be able to access your sitemap through sitemap.xml. However, the routing and route enhancers are a topic for another blog post and another day. If you want to start learning about it, head over to the following Feature RSTs:

With this information you should be able to use the XML sitemap in your TYPO3 projects. If you feel like thanking someone for this goodie, head over to your preferred social media channel and ping Richard and the rest of the SEO initiative.

Thanks for reading and happy sitemapping!


Further Information