Typo3: The Split Site Root Syndrome

Submitted by Hannes Schmidt on Wed, 01/05/2005 - 16:32.

[continued from Typo3: Including the Home Page (Site Root) in Navigation]

The Problem

For months, my client's site was running fine using the simple solution from the previous post. Over time, though, I noticed several oddities. They were caused by the fact that the home page is accessible under two URLs: the canonical root http://www.domain.com/ and the non-canonical root http://www.domain.com/home.0.html, as shown in the figure below.

As you can see, links to the home page fall into two groups. External links point to the canonical root. All internal links point to the non-canonical root.

The oddities that I noticed were:

  • Sometimes Google's index only included the canonical root, sometimes only the non-canonical root and sometimes both. The site is out of the sandbox, but rankings are still artificially low (on the second page), although everything indicates that it should be ranking in the top ten.
  • My favourite PageRank tool, the Firefox extension Search Status, produced strange results for the home page. Most of the time, both URLs were shown with PR0. Sometimes only one had PR, but I can't remember which one. The Google Toolbar in IE shows the same PR5 for both URLs.
  • On MSN Beta Search, the site jumps on and off the SERPs wildly.

At that point I started getting a little paranoid. What if I'm suffering from Google's infamous duplicate content penalty? To the Google spider, I have two identical versions of my home page, which falls under the definition of duplicate content. It's just like the / vs. /index.html problem that Google fixed some time ago. The Google spider is now clever enough to see that / and /index.html are two synonymous URL paths for the same page, but it could be that it's not clever enough to figure out that / and /home.0.html are synonymous. Anyway, I decided to get rid of this split-URL problem on my client's site.

Note that this split root page syndrome is not only

The Solution

As far as I know, there is no out-of-the-box Typo3 solution, so I developed my own. It employs a PHP IProcFunc that scans every generated menu item for links to the non-canonical root and changes their href attribute to point to the canonical root.

This is the necessary TypoScript code in the root template:

includeLibs.tweakMenu = fileadmin/ts/tweakMenu.php
 .
   t.menu = HMENU
   t.menu.1 = TMENU
   t.menu.1 {
     IProcFunc = user_tweakMenu->replace
     IProcFunc.searchFor = home.0.html
     IProcFunc.replaceWith = /
 .
   }

This is the content of fileadmin/ts/tweakMenu.php:

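<?php
   // Menu post-processor that Typo3 calls through the IProcFunc property.
   class user_tweakMenu {
     // $I is the menu item being rendered; $conf holds the IProcFunc.* properties from TypoScript.
     function replace( $I, $conf ) {
         // Rewrite the opening <a> tag so that links to the non-canonical root point to the canonical root.
         $I[ 'parts' ][ 'ATag_begin' ] = str_replace( 
             $conf['searchFor'], 
             $conf['replaceWith'] ? $conf['replaceWith'] : "",
             $I[ 'parts' ][ 'ATag_begin' ] );
         return $I;
     }
   }
?>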
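
To see what the function does to a single menu item, here is a minimal sketch that runs outside Typo3. The class is repeated so the snippet is self-contained. The $item array is a hypothetical, stripped-down stand-in for the structure Typo3 passes in (the real one carries more keys), and $conf mirrors the two IProcFunc properties from the TypoScript above.

```php
<?php
// Copy of the class from fileadmin/ts/tweakMenu.php so this sketch runs on its own.
class user_tweakMenu {
    function replace($I, $conf) {
        $I['parts']['ATag_begin'] = str_replace(
            $conf['searchFor'],
            $conf['replaceWith'] ? $conf['replaceWith'] : "",
            $I['parts']['ATag_begin']);
        return $I;
    }
}

// Hypothetical menu item: only the key that the function touches is filled in.
$item = array('parts' => array('ATag_begin' => '<a href="home.0.html">'));

// Same values as IProcFunc.searchFor and IProcFunc.replaceWith in the root template.
$conf = array('searchFor' => 'home.0.html', 'replaceWith' => '/');

$tweak = new user_tweakMenu();
$result = $tweak->replace($item, $conf);

echo $result['parts']['ATag_begin'], "\n"; // prints: <a href="/">
```

Note that str_replace matches the search string anywhere in the tag, so a link such as myhome.0.html (a made-up example) would be rewritten too. If that is a concern, a stricter pattern would be needed.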

That's pretty much it. All home links on my client's site now point to the canonical site root. I had a home-grown language menu that needed the same kind of fix and a few hard-coded links to the home page that I fixed manually. Now we shall see what the Google ranking does. Stay tuned.

( categories: Typo3 | Webmaster )
Submitted by Anonymous on Mon, 09/01/2008 - 02:50.
Doesn't seem to work with cooluri :(

IProcFunc = user_tweakMenu->replace
IProcFunc.searchFor = home/
IProcFunc.replaceWith = /
Submitted by Anonymous on Wed, 03/29/2006 - 06:57.

[globalVar = TSFE:id={$PID_HOME}]
temp.menu.alwaysActivePIDlist = {$PID_HOME_SHORTCUT}
[global]

where PID_HOME is the UID of the root level page
and PID_HOME_SHORTCUT the UID of the shortcut pointing to this page.

With RealURL you'll always get the link to /
while the shortcut page is still highlighted if you are on the homepage

(btw: this verification crap is nearly unreadable in 90% of the images)

Submitted by Hannes Schmidt on Tue, 01/18/2005 - 08:30.

Ranking has improved. Site is back in top 10 for Google and back to #2 in Beta MSN Search. Firefox's Search Status extension still reports PR0 (white bar) for root page.

Submitted by Hannes Schmidt on Wed, 01/12/2005 - 11:21.

A couple of weeks ago, I put a <base> tag in my header for some reason. Silly me forgot to take it out again. Due to this <base> tag all section indexes stopped working. Today when I realized this, my heart turned into a cold stone: my client's site has had a major navigation problem for weeks!

Anyway, it's working now. But the <base> tag had side effects on the above split site root workaround. Furthermore, there was a bug in the PHP code. All fixed now.