A Beginner's Guide to How Search Engines Work

Ever wondered how web sites get into Search Engines (like Google, Yahoo, Bing, etc.) in the first place?

Search Engines use a computer programme called a crawler (or a spider, robot or 'bot). It lives on the Search Engine's own servers. (A server is a physical computer dedicated to running one or more services, to serve the needs of programs running on other computers on the same network, in this case the internet.) The crawler's only job is to crawl around the Web and save everything it finds.

The crawler starts by visiting web sites it already knows, then follows any links it finds as it moves from page to page and web site to web site.

At each web site it visits, it copies all the HTML code from every page it crawls and saves it to its own servers. (HTML, which stands for HyperText Markup Language, is the predominant markup language for web pages.)
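To make the crawl loop concrete, here is a minimal Python sketch. It is an illustration only, not how any real Search Engine's crawler is written: the `FAKE_WEB` dictionary, the `example` addresses and the names `LinkCollector` and `crawl` are all made up for this example, and the "web" is kept in memory so the code runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web": address -> HTML code of that page.
FAKE_WEB = {
    "http://a.example/": '<a href="/about">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about": '<a href="/">Home</a>',
    "http://b.example/": "<p>No links here</p>",
    "http://hidden.example/": "<p>Nobody links to me</p>",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch=FAKE_WEB.get):
    """Start from known pages, save their HTML, and follow every link found."""
    saved = {}                 # address -> saved copy of the HTML
    queue = deque(seed_urls)   # pages still waiting to be visited
    while queue:
        url = queue.popleft()
        if url in saved:
            continue           # already crawled this page
        html = fetch(url)
        if html is None:
            continue           # page does not exist
        saved[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Turn relative links such as "/about" into full addresses.
            queue.append(urljoin(url, href))
    return saved


pages = crawl(["http://a.example/"])
print(sorted(pages))
```

Starting from the one page it already knows, this sketch reaches three pages. It never reaches `http://hidden.example/`, because no crawled page links to it.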

It's unusual for a web site to be completely crawled in one visit from a Search Engine crawler. It's more common for a web site to be crawled a few pages here and a few pages there. 

Then the Search Engine’s indexing server takes the HTML code, examines it, parses it, filters it and analyses it. Each Search Engine uses different algorithms to do this. (An algorithm is a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.) Next, the crawled pages from the web site are saved into the Search Engine's index.
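As a rough picture of what an index can look like, the Python sketch below builds a simple "inverted index": a lookup table from each word to the pages that contain it. This is a common textbook technique, assumed here purely for illustration. Real Search Engines use their own, far more elaborate algorithms, and the sample pages and the names `TextExtractor`, `build_index` and `search` are invented for this example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keeps the visible text of a page and drops the HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def build_index(pages):
    """Map every word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, html in pages.items():
        extractor = TextExtractor()
        extractor.feed(html)
        text = " ".join(extractor.chunks).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages: address -> saved HTML code.
crawled = {
    "http://a.example/": "<h1>Fresh bread</h1><p>Baked daily</p>",
    "http://b.example/": "<h1>Bread recipes</h1><p>Fresh ideas</p>",
}
index = build_index(crawled)
print(search(index, "daily bread"))
```

With these two sample pages, a search for "daily bread" matches only the first page, while "fresh bread" matches both.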

Now the web site is ready to be served up as a search result.

Total time elapsed? About two minutes!

It’s a rapid process, but only if the Search Engine crawler can find your web site in the first place.

Just because a web site exists doesn't mean a Search Engine can find it or understand what it's really about. Many web sites are made in such a way that they are nearly impossible for Search Engines to find, understand or navigate.

Image Credits: Network image by Tagtraum. Robot image by Sasan. Compiled by me, Pam McCormac.