In this post we explain the differences between a crawler, a scraper and a parser.

A crawler is a web bot that visits a set of web pages (one might call them nodes) and accumulates their links (URLs), deriving new URLs from each new web page [HTML] that it visits. A crawler may or may not save the pages' content to a data storage. It does not go deep (e.g. into detail pages) unless explicitly programmed to do so.
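The link-collecting behaviour described above can be sketched with the Python standard library alone. This is a minimal illustration, not a production crawler: the names `LinkExtractor`, `fetch_html` and `crawl`, the same-domain restriction and the `max_pages` limit are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its text (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl limited to the start URL's domain (an assumption).

    Returns the discovered URLs in the order they were found.
    """
    domain = urlparse(start_url).netloc
    found = [start_url]
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        visited += 1
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                found.append(absolute)
                queue.append(absolute)
    return found
```

Passing `fetch` as a parameter keeps the sketch testable without network access: any function that maps a URL to an HTML string will do.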

A scraper is a bot that visits the web pages of a given set of URLs. Unlike a crawler, it does not collect new URLs. Instead, it visits pre-collected URLs and retrieves the relevant data to store in a data storage.
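As a rough sketch of that idea, the function below walks a fixed list of URLs and stores one piece of data per page. The choice of the page title as the "relevant data", the SQLite table `pages`, and the helper names `fetch_html`, `extract_title` and `scrape` are assumptions made for illustration.

```python
import re
import sqlite3
from urllib.request import urlopen


def fetch_html(url):
    """Download a page and return its text (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_title(html):
    """Return the text of the <title> tag, or None if the page has none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def scrape(urls, connection, fetch=fetch_html):
    """Visit each given URL once and store its title.

    No new URLs are collected. Returns the number of pages stored.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
    )
    stored = 0
    for url in urls:
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        connection.execute(
            "INSERT OR REPLACE INTO pages (url, title) VALUES (?, ?)",
            (url, extract_title(html)),
        )
        stored += 1
    connection.commit()
    return stored
```

A call such as `scrape(urls, sqlite3.connect("pages.db"))` would write to a local file; `sqlite3.connect(":memory:")` is handy for experiments.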

A parser is an [offline] robot that processes or analyses given data to turn it into proper data structures. It extracts information from [unstructured] data, whether taken from a data storage or directly from the web (e.g. HTML). Consider the following HTML piece, supposedly scraped from a web page at url="https://battery-store.com/Batteries+Plus+Calcium-f4d67gh":

A parser may turn it into a useful data item:
[{"id":2345609,
"name": "Batteries Plus Calcium 12V 74Ah 680A battery AK-ZP57412",
"sku":"YU23809",
"price_us":"48.08",
"price_ca":"53.00",
"url":"https://battery-store.com/Batteries+Plus+Calcium-f4d67gh"
}]
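To make the parsing step concrete, here is a small sketch that produces a data item of the same shape. The markup in `SAMPLE_HTML` is hypothetical: its tags, class names and `data-id` attribute are invented for the example and are not the store's real HTML. `ProductParser` and `parse_product` are likewise illustrative names.

```python
import json
from html.parser import HTMLParser

# Hypothetical markup, invented for this sketch.
SAMPLE_HTML = """
<div class="product" data-id="2345609">
  <h1 class="name">Batteries Plus Calcium 12V 74Ah 680A battery AK-ZP57412</h1>
  <span class="sku">YU23809</span>
  <span class="price-us">48.08</span>
  <span class="price-ca">53.00</span>
</div>
"""

# Maps the (assumed) CSS class of an element to the output field name.
FIELDS = {"name": "name", "sku": "sku", "price-us": "price_us", "price-ca": "price_ca"}


class ProductParser(HTMLParser):
    """Builds a dict from elements whose class is listed in FIELDS."""

    def __init__(self):
        super().__init__()
        self.item = {}
        self._field = None  # field the current element's text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-id" in attrs:
            self.item["id"] = int(attrs["data-id"])
        self._field = FIELDS.get(attrs.get("class"))

    def handle_data(self, data):
        if self._field and data.strip():
            self.item[self._field] = data.strip()

    def handle_endtag(self, tag):
        self._field = None


def parse_product(html, url):
    """Turn a product HTML fragment into a structured data item."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return {**parser.item, "url": url}


if __name__ == "__main__":
    url = "https://battery-store.com/Batteries+Plus+Calcium-f4d67gh"
    print(json.dumps([parse_product(SAMPLE_HTML, url)], indent=1))
```

Note that the parser itself never touches the network: it only needs the HTML text, which is why it can run offline on stored pages.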

Often a scraper has the parser functionality built in.

See the examples of simple email crawlers (Python, Java) and a scraping project where the scraper and crawler functionality go side by side. In that project a crawler gathers the [domain] URLs and processes each one depending on whether it is a detail page or a search result page.
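How such a project tells the two page types apart is not described here, so the following is only a sketch under assumed URL conventions: a path starting with `/search` or a `q` query parameter marks a search result page, and everything else is treated as a detail page. The names `classify` and `route` are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def classify(url):
    """Label a URL 'search' or 'detail' using assumed URL conventions."""
    parsed = urlparse(url)
    if parsed.path.startswith("/search") or "q" in parse_qs(parsed.query):
        return "search"
    return "detail"


def route(urls):
    """Split URLs into those to crawl further and those to scrape."""
    to_crawl, to_scrape = [], []
    for url in urls:
        # Search result pages yield more links; detail pages yield data.
        (to_crawl if classify(url) == "search" else to_scrape).append(url)
    return to_crawl, to_scrape
```

Search result pages would then go back to the crawler for more links, while detail pages would be handed to the scraper and parser.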