Re: Google gets confused by sitemaps and problem with canonical page?
Hehe Phil, I'm not stupid enough to believe GWB ;) but rereading my post,
your point is valid. Google does not rule the Web, and no single company
should be able to do that.
However, because 60% or even more of all referrals come from Google,
and they clearly dominate search ATM, it often seems they can set
standards. I've no problem with sensible standards, they make my life
easier, and Google has introduced a lot of good stuff. Also, Google's
intentions are usually honest; they just sometimes fail to do the
right thing (e.g. the requirement of link condoms on paid links), but
that doesn't make Google evil.
*Not* doing particular things isn't evil or intransigent per se, and
the www vs. non-www thing is a good example to explain why.
In most cases Google can handle sites available under both addresses
quite well, because most webmasters don't confuse the engines with
links in both variants; that is, Google can determine the canonical
server name for a site and stick with it.
If Google's crawlers find both variants in links on the Web, it becomes
tricky, and Google sometimes fails to figure out the intended canonical
name. As a matter of fact, in most cases it's an onsite link that
starts a "canonical problem". That looks like "Google does not /
cannot handle all misconfigured or abused Web sites", where
"misconfigured or abused" means the existence of links in both
variants. I'd say that's *not yet* the majority of sites, and here is why:
Google's procedures to determine canonical server names are primarily
based on standards, and the standards say that www.example.com is not
equal to example.com. Actually, Google does not handle this as set in
stone; that is, Google engineers have developed routines to make an
educated guess when both server names are used in links. This guess at
the Webmaster's intention is often good, and sometimes wrong. It gets
improved constantly; for example, the next algo update is expected to
solve a lot of the remaining issues. That's an everlasting process, and
there will always be some sites where the algo's guess fails, because
the 'thoughtless creativity' of Webmasters, editors and publishers is
infinite in finding new ways to confuse crawlers and indexers.
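
To illustrate the general idea of such an educated guess (a toy sketch
only, NOT Google's actual algorithm; the function name and the simple
majority-vote rule are my own invention for this example), imagine
counting the server names seen in links pointing at a site and siding
with the majority:

from collections import Counter

def guess_canonical_host(link_hosts):
    """Toy guess at the intended server name: tally the host used in
    each observed link to the site and side with the majority."""
    votes = Counter(host.lower() for host in link_hosts)
    # Near-ties are exactly where such a guess can go wrong.
    return votes.most_common(1)[0][0]

# 80 links use the www variant, 20 (e.g. stray onsite links) don't:
links = ["www.example.com"] * 80 + ["example.com"] * 20
print(guess_canonical_host(links))  # -> www.example.com

A few stray links usually won't flip the vote, but a site linked
roughly half-and-half in both variants gives any such guess nothing to
work with, and that's where the canonical problem shows up.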
ISPs are "amazingly dumb" when they serve their customers' content
from different addresses without a 301-redirect from one server name to
the other. The ability to type in a URL without the www prefix is good
for users, but they should then be redirected to the www version, or
vice versa. Google is not "amazingly dumb" just because it doesn't
ignore those crappy server configs; Google's handling of crap is simply
not (yet) perfect.
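
For completeness, here's a minimal sketch of that server-side fix,
assuming Python's standard http.server module and a made-up canonical
host name and port. On a real host you'd typically do the same thing in
the server config (e.g. an Apache rewrite rule), but the principle is
identical: answer every request for the non-canonical name with a 301
pointing at the canonical one.

from http.server import BaseHTTPRequestHandler, HTTPServer

CANONICAL_HOST = "www.example.com"  # the one variant you want indexed

class RedirectToCanonical(BaseHTTPRequestHandler):
    def do_GET(self):
        host = self.headers.get("Host", "").split(":")[0]
        if host != CANONICAL_HOST:
            # 301 = moved permanently: crawlers transfer the URL's
            # standing to the canonical variant and drop the duplicate.
            self.send_response(301)
            self.send_header("Location",
                             "http://%s%s" % (CANONICAL_HOST, self.path))
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"serve the real content here")

HTTPServer(("", 8080), RedirectToCanonical).serve_forever()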
Bottom line is that nobody is forced to follow Google's guidelines.
But if you want traffic from Google, you should play by Google's rules.
By the way, stupid crap causes trouble with other engines too.
This discussion group is dedicated to a Google product, which defines
the context in my book. So when I and others argue in terms of
Google's rules and abilities, that does not mean we agree with Google
in every case. It means that we try to help others understand how one
can play this game successfully.
Sebastian