Crawlers, spiders, and robots

The query interface is the only part of a search engine the user ever sees. Everything else happens behind the scenes, out of view of the people who rely on it every day. That doesn't make it unimportant, however. In fact, the back end is the most important part of the search engine.

If you've spent any time on the Internet, you may have heard a little about spiders, crawlers, and robots. These are programs that traverse the Web, cataloging data so that it can be searched. In the most basic sense, all three names refer to essentially the same kind of program: one that visits web pages, collects information about each URL, and follows the links it finds to discover new pages.
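The "crawling" step above can be sketched in a few lines. This is a simplified illustration, not a production crawler: the sample HTML stands in for a page a real crawler would download over HTTP, and the collected links are what it would queue up to visit next.

```python
# Minimal sketch of the crawl step: parse a page's HTML and collect
# the links it contains. A real crawler fetches pages over HTTP and
# repeats this process for every link it discovers.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Records the href of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Stand-in for a fetched page (invented sample data).
sample_page = """
<html><body>
  <a href="https://example.com/about">About</a>
  <a href="https://example.com/contact">Contact</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(sample_page)
print(collector.links)
```

A real crawler adds a queue of URLs to visit, tracks which pages it has already seen, and respects rules like robots.txt, but the core loop is exactly this: parse a page, harvest its links, move on.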

This information is then cataloged according to the URL on which it was found and stored in a database. Later, when someone uses a search engine to locate something on the Web, the references in that database are searched and the matching results are returned.
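The catalog-and-search flow described above can be illustrated with a toy inverted index: each word maps to the set of URLs whose text contains it, and a query is just a lookup in that mapping. The page contents here are invented sample data, and real search engines use far more sophisticated storage and ranking.

```python
# Toy sketch of cataloging crawled pages and answering queries.
# Real engines add ranking, stemming, and persistent storage.
from collections import defaultdict

# Invented sample data: URL -> text the crawler collected there.
pages = {
    "https://example.com/": "search engines index the web",
    "https://example.com/bots": "robots crawl the web and collect data",
}

# Build the inverted index: word -> set of URLs containing that word.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

def search(word):
    """Return every cataloged URL whose text contains the word."""
    return sorted(index.get(word.lower(), set()))

print(search("web"))     # both pages mention "web"
print(search("robots"))  # only the bots page does
```

Because the index is keyed by word rather than by page, answering a query never requires re-reading the pages themselves, which is what makes searching the whole Web feasible.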