Monday, August 3, 2009

How a Search Engine Crawls a Site

A question that interests every SEO expert is what a search engine actually sees when it visits a website, and in what order it looks at things. Different people have different opinions about it, and the topic is open-ended and very debatable. I will discuss only a few points here, and I will focus on how the Google search engine views a web page.


A search engine runs crawlers, or bots: automated programs that follow fixed rules and instructions to examine web pages. On its first visit, a crawler mainly wants HTML pages and ignores other MIME types. To request only HTML resources, it can send an HTTP HEAD request to determine a resource's MIME type before downloading the entire resource with a GET request. To avoid making numerous HEAD requests, a crawler may instead examine the URL and request the resource only if the URL ends with .html, .htm or a slash. A well-behaved crawler also reads the site's robots.txt file and follows the instructions there, normally before it fetches the other pages. Robots.txt is a hand-written text file that lists the internal files, folders, URLs, images and other items the site owner wants blocked, and the crawler skips whatever is disallowed.
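The checks described above can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not how Google's crawler is actually written: the function names, the "ExampleBot" user agent and the example.com URLs are all made up for the example, and the HEAD request function needs network access to run.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser

# Hypothetical helper names; none of them come from a real crawler.
HTML_SUFFIXES = (".html", ".htm", "/")


def looks_like_html(url: str) -> bool:
    """URL shortcut: accept paths ending in .html, .htm or a slash."""
    path = urlparse(url).path or "/"  # a bare host counts as "/"
    return path.lower().endswith(HTML_SUFFIXES)


def is_html_mime(content_type: str) -> bool:
    """True if a Content-Type header value names an HTML MIME type."""
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in ("text/html", "application/xhtml+xml")


def head_says_html(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and inspect Content-Type (needs network access)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return is_html_mime(response.headers.get("Content-Type", ""))


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    for url in ("http://example.com/index.html",
                "http://example.com/private/notes.html",
                "http://example.com/logo.png"):
        print(url, looks_like_html(url),
              allowed_by_robots(rules, "ExampleBot", url))
```

The URL shortcut is only a guess, since plenty of HTML pages have URLs with no extension at all. That is why a crawler that relies on it alone will miss pages that a HEAD request would have found.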


The crawler then moves freely through the remaining pages and examines each one according to its rules. It reads the page from top to bottom and from left to right, in the order the HTML is written. It first notes the URL, then checks the metadata (title, description, keywords), the total size of the file, all the text on the page, and the total number of words and of distinct words. Then it checks all the links on the page thoroughly. We have to remember that a crawler handles simple text content best. Complex designs, scripts, and images without ALT attributes give it little or nothing to read. So to keep a web page crawlable, page designers, programmers and, of course, SEO experts should follow these rules.
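To make the list above concrete, here is a small Python sketch that collects the same items from a page: title, meta description and keywords, file size, word counts, links, and images with no ALT attribute. It uses the standard html.parser module. The class and function names are invented for this example, and it is only an approximation of what a real search engine records.

```python
import re
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collects the items a crawler is described as checking (hypothetical class)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}                 # description and keywords only
        self.links = []                # href values of <a> tags
        self.images_missing_alt = []   # src values of <img> tags with no alt text
        self.words = []                # visible words, lowercased
        self._in_title = False
        self._skip_depth = 0           # greater than 0 inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt.append(attrs.get("src") or "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._skip_depth == 0:
            self.words.extend(re.findall(r"\w+", data.lower()))


def summarize(html: str) -> dict:
    """Parse an HTML string and return the collected facts as a dict."""
    page = PageSummary()
    page.feed(html)
    page.close()
    return {
        "title": page.title.strip(),
        "meta": page.meta,
        "size_bytes": len(html.encode("utf-8")),
        "total_words": len(page.words),
        "distinct_words": len(set(page.words)),
        "links": page.links,
        "images_missing_alt": page.images_missing_alt,
    }
```

Calling summarize() on the HTML of your own page is a quick way to see how much plain text and how many usable links are left once scripts and unlabeled images are ignored.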

1 comment:

  1. Thanks for the sharing of such information. we will pass it on to our readers. This is a great reading. Thanking you. seo news
