Re: Double Slashes and Duplicate Content Pages
Oops, I've lost this thread ... got a reminder from Seoroundtable.com
today:
http://www.webmasterworld.com/forum30/32467.htm
Well, I didn't mean to say that Google's crawlers always do the right
thing, or anything like that ;) I'm not aware of any bug-free software.
I still think none of this would happen without relative links
(especially in framesets), which can get misinterpreted, or, once
mashed up, can produce malformed references (/ vs. ./ and so on is a
nightmare with some bots: / => //, ./ => /./ ...). Once the first
invalid URL is out and 'known' to Google, the result can be a chain of
malformed URLs. Fragment identifiers used in relative on-page links can
also result in invalid URLs if the bot doesn't handle encoding
properly, that is, if it doesn't recognize the # (Unicode in
ASCII/Latin documents and the like).
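To illustrate what I mean by mashed-up references, here's a minimal
Python sketch of my own (the example URLs and the naive_resolve helper
are made up; no real bot works exactly like this) contrasting proper
RFC 3986 resolution with naive string gluing:

    from urllib.parse import urljoin

    base = "http://www.example.com/frames/nav.html"  # hypothetical frameset page

    # A well-behaved bot resolves relative references per RFC 3986:
    print(urljoin(base, "./page.html"))  # http://www.example.com/frames/page.html
    print(urljoin(base, "/page.html"))   # http://www.example.com/page.html
    print(urljoin(base, "#anchor"))      # same document, fragment stays a fragment

    # A naive bot that simply glues directory + href together instead:
    def naive_resolve(base_dir, href):
        # hypothetical broken resolver: no dot-segment removal,
        # no duplicate-slash collapsing, no fragment handling
        return base_dir + href

    base_dir = "http://www.example.com/frames/"
    print(naive_resolve(base_dir, "/page.html"))   # .../frames//page.html  (/ => //)
    print(naive_resolve(base_dir, "./page.html"))  # .../frames/./page.html (./ => /./)

    # The chain effect: even correct resolution against an
    # already-broken base URL keeps the duplicate slash alive:
    broken = "http://www.example.com/frames//page.html"
    print(urljoin(broken, "./other.html"))  # http://www.example.com/frames//other.html

Note the last example: once a double-slash URL is indexed and used as a
base, even correct resolution of relative links against it keeps
spawning double-slash variants, which is the chain I meant.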
Another possibility is a spammer who ran a miserable scrape of your
stuff for a few weeks, crawled by Googlebot only before Slurp etc.
found the crap. Unfortunately, there is absolutely no way to find each
and every linked or unlinked URI on the Web, so I'd be careful with
statements like "there is only one link pointing to ...".
There are just too many possible causes, including a Googlebug ;), and
perhaps you'll never track it down :( Sorry my post wasn't helpful.
Sebastian