Google is pointing to node number instead of NiceUrl
acl
Posted: Thursday, February 21, 2008 3:14:57 AM
Rank: Devotee

Joined: 12/15/2006
Posts: 40
Hi,
We just noticed that with one of our pages Google thinks the name of the page is the page's node number:
http://www.google.com.au/search?source=ig&hl=en&q=car%20lease&meta=

Our company is called Stratton Finance and should show up a few results down. Notice that Google is linking to 1151.aspx. The page should be /car-finance/options/finance-car-lease.aspx

I've searched everywhere and can't find any pages that link to 1151.aspx. Our usage tracking suggests that everyone hitting 1151.aspx is coming from outside our website.

This is a bit of a problem for numerous reasons. Can anyone imagine how this happened?
mortenbock
Posted: Thursday, February 21, 2008 7:54:49 AM

Rank: Addict

Joined: 7/19/2006
Posts: 791
Location: Århus, Denmark
I've seen this happen when people were using the RSS package, but that doesn't seem to be the case here?

It could also have happened if a node was moved to another parent node; it then has to be published in order to get its new NiceUrl. If it wasn't published right away, maybe Google indexed the 1234.aspx URL.

I think it would be fairly simple to fix the indexing of these pages, either by using a Google sitemap (I think you can tell it which pages _not_ to index), or by writing an HttpModule or a usercontrol that checks whether the URL is of the type 1234.aspx and then redirects (301) to the NiceUrl of the page with that ID.
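
To make the redirect idea concrete, here is a rough sketch of the check in Python rather than as a real .NET HttpModule. The NICE_URLS table and the redirect_for function are made up for illustration; in Umbraco the lookup would go through its own NiceUrl function for the node ID, and the module would set the 301 status and Location header on the response.

```python
import re

# Hypothetical lookup table standing in for Umbraco's node-ID-to-NiceUrl lookup.
NICE_URLS = {
    1151: "/car-finance/options/finance-car-lease.aspx",
}

# Matches paths that consist only of a numeric node ID, e.g. "/1151.aspx".
NODE_ID_PATH = re.compile(r"^/(\d+)\.aspx$", re.IGNORECASE)


def redirect_for(path, nice_urls=NICE_URLS):
    """Return (301, target) for a numeric node URL, or None to let the request through."""
    match = NODE_ID_PATH.match(path)
    if match is None:
        return None  # not of the 1234.aspx type
    target = nice_urls.get(int(match.group(1)))
    if target is None:
        return None  # unknown node ID; leave it to the normal 404 handling
    return (301, target)
```

Requests for the nice URL itself fall through untouched, so only the numeric form gets the permanent redirect.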

Morten Bock - Level 2 certified - MVP 2008/2009 - My danish blog with a few english posts

SoerenS
Posted: Thursday, February 21, 2008 9:25:49 AM

Rank: Fanatic

Joined: 7/25/2006
Posts: 424
Location: Silkeborg, Denmark
Strange, apparently Google has indexed several of those pages at your site.

I know for a fact that this issue arises if you use the RSS package (as Morten points out), and I urge everyone to go vote for a fix on Codeplex.

Have you at some point in time played around with the RSS package?

You can also use the XENU Broken Link checker for locating troublesome, internal links, in case you have an XSLT macro somewhere that doesn't use the NiceURL function properly.

If the hits are coming in from external links that you have no way of identifying, I suggest the following:

1) Have a look at the Umbraco rewrite engine, and rewrite those numeric urls to the "real" ones.

2) Use the Google Webmaster Central to remove those URLs from the Google index.

3) Add the numeric URLs to a robots.txt file, thus telling Google to stop indexing them. You'll lose a bit of link juice (from those external sites that link to the incorrect URLs), but that doesn't seem to be a big concern in your case.
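
A minimal sketch of option 3, assuming the numeric URLs live at the site root as /<id>.aspx. The helper name robots_txt_for is made up for illustration, and the check uses Python's standard urllib.robotparser, which only approximates how Google reads the file.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_for(node_ids):
    """Build robots.txt lines that disallow each numeric node URL for all crawlers."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: /{node_id}.aspx" for node_id in sorted(node_ids)]
    return lines


# 1151 is the node ID from this thread; any further IDs would be added here.
rules = robots_txt_for([1151])
print("\n".join(rules))

# Sanity check with the standard-library parser.
parser = RobotFileParser()
parser.parse(rules)
numeric_ok = parser.can_fetch("Googlebot", "/1151.aspx")
nice_ok = parser.can_fetch("Googlebot", "/car-finance/options/finance-car-lease.aspx")
```

With these rules the numeric URL is refused while the nice URL stays crawlable.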

/SoerenS

Need advice on how to run an effective webshop?
SoerenS
Posted: Thursday, February 21, 2008 12:11:35 PM

Rank: Fanatic

Joined: 7/25/2006
Posts: 424
Location: Silkeborg, Denmark
I forgot to mention, here's the link where you can vote for getting the RSS package fixed.
(http://www.codeplex.com/umbracoext/WorkItem/View.aspx?WorkItemId=9193)

I would love to release a hotfixed package myself (I've already fixed it on my own blog), but I'm quite busy with other projects at the moment.

/SoerenS

Need advice on how to run an effective webshop?
acl
Posted: Friday, February 22, 2008 7:34:20 AM
Rank: Devotee

Joined: 12/15/2006
Posts: 40
Thanks for your replies (and putting in a bit of work too :-) that was awesome).

I don't think anyone was playing with the RSS feed or moving the pages, so I'm not sure what the problem could be.

I'll have to implement the HttpModule to do 301 redirection - I thought this would be something that Umbraco should be doing anyway.

Thanks again
psterling@homax
Posted: Friday, February 22, 2008 7:55:45 PM

Rank: Fanatic

Joined: 10/30/2007
Posts: 215
Location: Bellingham
You can add URL rewriting rules to the .../config/UrlRewriting.config file quite easily to address the immediate concern. You can also add a 404 handler to the .../config/umbracoSettings.config file, as below, to catch any 'broken' links from referrers - just substitute the desired target node ID for 1051:

...
<errors>
  <!-- the id of the page that should be shown if the page is not found -->
  <error404>1051</error404>
</errors>
...

-Paul

motusconnect.com :: level-2 certified :: MVP 2008/2009