Tag Cloud Explorer degrades Plone website in Google

Adding a tag cloud portlet from the Tag Cloud Explorer product can severely damage your Plone site’s Google ranking. Read this article to find a simple solution.

Introduction

About three weeks ago I added Tag Cloud Explorer (version 1.1.0) to my website. The goal was to simplify navigation and expose some popular subjects directly on the home page. And of course, it is so “web-two-zero-ish”. But recently I noticed disturbing changes in my Google ranking for various keywords.

Problem

I monitor my website’s position in Google quite regularly using a little tool I wrote (Python rules!). Thanks to Webmaster Tools I can also see which pages are indexed and notice when something new appears.
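For the curious, the idea behind such a tool is trivial. The following is only a minimal sketch, not my actual script; the keyword and domain are placeholders, and Google may throttle or block automated queries, so treat it purely as an illustration of the approach.

    # Hypothetical rank-checker sketch: fetch a Google results page and report
    # the position of the first link on it that points at the monitored domain.
    # Very naive: it scans every link on the page, not only the actual results.
    import re
    import urllib.parse
    import urllib.request

    def google_rank(keyword, domain, num_results=50):
        query = urllib.parse.urlencode({"q": keyword, "num": num_results})
        request = urllib.request.Request(
            "https://www.google.com/search?" + query,
            headers={"User-Agent": "Mozilla/5.0"},  # bare urllib is often rejected
        )
        with urllib.request.urlopen(request) as response:
            html = response.read().decode("utf-8", errors="replace")
        for position, url in enumerate(re.findall(r'href="(https?://[^"]+)"', html), 1):
            if domain in url:
                return position
        return None

    print(google_rank("plone tag cloud", "example.com"))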

First I noticed that the number of indexed pages and the amount of data had dramatically increased. Then I saw lots of new pages in the Internal Links section. But they were not new pages. They were simply links from the Tag Cloud Explorer portlet, generated by the template portlet_tag_cloud_explorer.pt. The search result page (tag_cloud_explorer_results.pt) and the listing of all tags (tag_cloud_explorer.pt) made the situation even worse. Those two pages contain a listing of links to the relevant documents, and below each document there are links to its other assigned keywords. Selecting a keyword from there joins the old keywords with the new one. Each such selection is a new unique URL, which makes Google think: “Oh, I’ve found a new page!” Of course this is not a page with new content, only another search result. The number of possible combinations can be measured in hundreds or thousands, considering that each of my documents was described by a few keywords.
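To get a feeling for the scale, here is a rough back-of-the-envelope sketch. The keyword counts in it are assumptions picked only to illustrate the effect, not real figures from my site.

    # Rough illustration of how keyword-combination URLs multiply.
    # The numbers below are assumptions chosen only to show the effect.
    from itertools import combinations

    distinct_keywords = 40        # assumed size of the whole tag cloud
    keywords_per_document = 5     # assumed keywords assigned to one document

    # Every non-empty subset of a document's keywords can be reached by
    # repeatedly clicking "add this keyword" links, and each subset is a
    # distinct crawlable URL: 2**5 - 1 = 31 for a single document.
    subsets_per_document = sum(
        len(list(combinations(range(keywords_per_document), size)))
        for size in range(1, keywords_per_document + 1)
    )

    print(subsets_per_document)                                   # 31
    print(len(list(combinations(range(distinct_keywords), 2))))   # 780 two-keyword searches site-wide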

Unfortunately for me, Google is pretty smart. It noticed hundreds of new pages, but the content was not new; they just displayed what I already had on my site in another way. The pages were also full of keywords, which may appear irrelevant to the displayed content. And what do Google engineers do when someone seems to be gaming their algorithms? They penalize the website.

At the very beginning I noticed a small increase in my rankings. But three days ago there was a major drop. I think that was when Google took the “new” pages into account and recalculated my position for a large number of keywords. And because I was “cheating”, I was punished with a much, much lower position than before.

Solution

The solution is simple: we need to keep Google, and hopefully other robots, from following those links. What I did was customize the templates mentioned above: portlet_tag_cloud_explorer.pt, tag_cloud_explorer_results.pt and tag_cloud_explorer.pt. For each tag link I added the attribute rel="nofollow". This attribute is, I think, Google’s invention extending the HTML specification. Basically, a robot will not follow a link carrying this attribute, so the generated search-result pages should no longer be crawled and indexed through the tag cloud.
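The rendered markup ends up looking roughly like this. The href below is a placeholder and the real templates use TAL; the only actual change is the rel attribute on every tag link.

    <!-- Rendered-HTML sketch; the href is made up.
         The fix is adding rel="nofollow" to each tag link the templates emit. -->
    <a href="tag_cloud_explorer_results" rel="nofollow">plone</a>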

I bet that the Ingeniweb team will quickly fix this and release a new version.

The size of the problem

Because Google had already indexed all those pages, I also had to block access to them by adding ‘Disallow: /tag_cloud_explorer_results’ to my robots.txt.
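For completeness, a Disallow line only takes effect inside a User-agent group, so the relevant fragment of robots.txt looks like this:

    # Block the Tag Cloud Explorer search-result URLs for all robots.
    User-agent: *
    Disallow: /tag_cloud_explorer_results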

Today I looked at Google Webmaster Tools, and it shows that 9141 links have been blocked this way. My whole website contains maybe 50 pages, which shows the scale of the problem. I can only imagine how Google treats a website that claims to have over 9000 pages when only about 50 of them (roughly 0.5%) have unique content.
