Technically curious

Google doesn’t agree with itself

[Image: Google searching comic]

It is always surprising when a company doesn’t agree with itself.

You probably know that if you type site:domainname.com into Google, you will see the pages that Google knows about along with a total count of indexed pages. Well, this number varies in two ways: relative to itself over time, and in absolute terms against what Google Search Console reports. What does that mean?

First, the number changes when you run the query at different times during the day. Sometimes it is hundreds higher, and sometimes it is lower, so it is not reliable at all.

Second, it differs in absolute terms from the number of pages/URLs that Google Search Console says are indexed. For example, right now Google Search Console says I have 216 pages indexed. Yet when I search for site:chimac.net in Google, it comes back with 172 results.
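To put a size on that gap, here is a minimal sketch of the arithmetic using the two numbers above. The function name `index_gap` is made up for illustration, and both counts are typed in by hand; nothing here queries Google or Search Console.

```python
def index_gap(console_count: int, site_query_count: int) -> tuple[int, float]:
    """Return (absolute gap, gap as a percentage of the Search Console count).

    Hypothetical helper: both counts are read off the screen and passed in manually.
    """
    if console_count <= 0:
        raise ValueError("console_count must be positive")
    gap = console_count - site_query_count
    return gap, 100.0 * gap / console_count


# Numbers from the post: 216 indexed in Search Console, 172 from the site: query.
gap, pct = index_gap(216, 172)
print(f"{gap} pages missing from the site: results ({pct:.1f}% of the Search Console count)")
```

With these two counts, the site: query is short by 44 pages, or roughly a fifth of what Search Console reports.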

Now at first, I thought this was a caching issue. That is, it takes time for the front-end search engine to catch up with the back end of Google Search Console. However, the more I think about this, the less sense it makes. Google Search Console is lazy and can take days before it updates. That is plenty of time for it to integrate and report its current stats to the search engine.

Then I thought that the variance in the search results was just due to other changes happening in Search Console. Perhaps the removals were random and not bulk processed, but I don’t think that is the case either.

This link suggests that the disparity in the search engine results is intentional, so that search engine manipulation is harder to do. It’s a plausible argument. There are other ideas in that post that make it worth reading.


It is clear that the Google Search Console number is the authoritative one and that it doesn’t help to look at the search engine number. Too bad the site: search is so convenient.