Typo3: The Split Site Root Syndrome
The Problem
For months, my client's site ran fine using the simple solution above. Over time, though, I noticed several oddities. They were caused by the fact that the home page is accessible under two URLs, the canonical root http://www.domain.com/ and the non-canonical root http://www.domain.com/home.0.html, as shown in the figure below.
As you can see, links to the home page fall into two groups: external links point to the canonical root, while all internal links point to the non-canonical root.
The oddities I noticed were:
- Sometimes Google's index included only the canonical root, sometimes only the non-canonical root, and sometimes both. The site is out of the sandbox, but rankings are still artificially low (on the second page), although everything indicates that it should rank in the top ten.
- My favourite PageRank tool, the Firefox extension Search Status, produced strange results for the home page. Most of the time, both URLs were shown with PR0. Sometimes only one had PR, but I can't remember which one. The Google Toolbar in IE shows the same PR5 for both URLs.
- On MSN Beta Search, the site jumps on and off the SERPs wildly.
At that point I started getting a little paranoid. What if I'm suffering from Google's infamous duplicate content penalty? To the Google spider, I have two identical versions of my home page, which falls under the definition of duplicate content. It's just like the / vs. /index.html problem that Google fixed some time ago. The Google spider is now clever enough to see that / and /index.html are two synonymous URL paths for the same page, but it may not be clever enough to figure out that / and /home.0.html are synonymous. Anyway, I decided to get rid of this split-URL problem on my client's site.
Note that this split root page syndrome is not only
The Solution
As far as I know, there is no out-of-the-box Typo3 solution, so I developed my own. It uses a PHP function, hooked into the menu through IProcFunc, that scans every generated menu item for links to the non-canonical root and changes their href attribute to point to the canonical root.
This is the necessary TypoScript code in the root template:
includeLibs.tweakMenu = fileadmin/ts/tweakMenu.php
# ...
t.menu = HMENU
t.menu.1 = TMENU
t.menu.1 {
  IProcFunc = user_tweakMenu->replace
  IProcFunc.searchFor = home.0.html
  IProcFunc.replaceWith = /
  # ...
}
This is the content of fileadmin/ts/tweakMenu.php:
<?php
class user_tweakMenu {
    // Called through IProcFunc for each menu item; $conf holds the
    // searchFor / replaceWith values set in TypoScript.
    function replace( $I, $conf ) {
        $I['parts']['ATag_begin'] = str_replace(
            $conf['searchFor'],
            $conf['replaceWith'] ? $conf['replaceWith'] : "",
            $I['parts']['ATag_begin']
        );
        return $I;
    }
}
?>
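To see what the function does to a single menu item, it can be run by hand outside Typo3. The sketch below is only an illustration. The $item array is a made-up stand-in that contains just the one key the function touches, and the relative href is an assumption about how the home link is rendered. A real Typo3 menu item carries more fields.

```php
<?php
// Same class as in fileadmin/ts/tweakMenu.php, repeated so the sketch runs alone.
class user_tweakMenu {
    function replace($I, $conf) {
        $I['parts']['ATag_begin'] = str_replace(
            $conf['searchFor'],
            $conf['replaceWith'] ? $conf['replaceWith'] : "",
            $I['parts']['ATag_begin']
        );
        return $I;
    }
}

// Hypothetical menu item: only the opening <a> tag, as an assumed example.
$item = array('parts' => array('ATag_begin' => '<a href="home.0.html">'));

// Mirrors IProcFunc.searchFor and IProcFunc.replaceWith from the TypoScript.
$conf = array('searchFor' => 'home.0.html', 'replaceWith' => '/');

$tweak  = new user_tweakMenu();
$result = $tweak->replace($item, $conf);

echo $result['parts']['ATag_begin'], "\n"; // prints: <a href="/">
```

Because str_replace matches substrings anywhere in the tag, a link to a page whose file name merely ends in home.0.html would be rewritten too, so the search string should be specific enough for the site at hand.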
That's pretty much it. All home links on my client's site now point to the canonical site root. I had a home-grown language menu that needed the same kind of fix and a few hard-coded links to the home page that I fixed manually. Now we shall see what the Google ranking does. Stay tuned.