crawling

Data extraction: web crawling vs. web scraping in E-commerce

Nowadays, when we have a question, it comes almost naturally to type it into a search bar and get helpful answers. But we rarely wonder how all that information becomes available and how it appears as soon as we start typing. Search engines provide easy access to information, but web crawling and scraping tools, which are less well-known players, play a crucial role in gathering online content. more…

Make crawling easy with Real Time Crawler of Oxylabs.io

Nowadays, it’s hard to imagine life without search engines. “If you don’t know something, google it!” is one of the most popular maxims of our time. But how many people use Google in an optimal way? Many developers use Google search commands to get the answers they need as fast as possible.

Even this is not enough today! Companies large and small need terabytes of data to keep their business profitable. The search process has to be automated and made reliable in order to supply users with fresh news, updates, or posts. In today’s article we will look at a very helpful tool for collecting fresh data: Real-Time Crawler (RTC). Let’s start! more…

A Simple Email Crawler in Python

I often receive requests asking about email crawling. This topic is evidently of interest to those who want to scrape contact information from the web (such as direct marketers), and we have previously mentioned GSA Email Spider as an off-the-shelf solution for email crawling. In this article I want to demonstrate how easy it is to build a simple email crawler in Python. This crawler is simple, but you can learn a lot from this example (especially if you’re new to scraping in Python). more…
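To give a rough idea of what such a crawler involves, here is a minimal sketch. It is not the code from the article itself: the names `crawl_emails`, `fetch`, `LinkParser`, and `EMAIL_RE`, the `max_pages` limit, and the deliberately loose email pattern are all assumptions made for illustration. It uses only the Python standard library, visits pages breadth-first, and collects anything that looks like an email address.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Loose, illustrative pattern; it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Default fetcher (an assumption): download a page and decode it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_emails(start_url, fetch_page=fetch, max_pages=10):
    """Breadth-first crawl from start_url, returning the set of emails found."""
    seen = {start_url}
    queue = deque([start_url])
    emails = set()
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that cannot be fetched
        emails.update(EMAIL_RE.findall(html))
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

Calling `crawl_emails("http://example.com/")` would start from that page and follow links until `max_pages` pages have been visited. A real crawler should also respect robots.txt, stay within the target domain, and throttle its requests; the sketch leaves those out for brevity.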

Kimono scraper is now enhanced with Pagination, Crawling and Data History

Several days ago I wrote the Kimono scraper review, where I mentioned that the service lacked pagination support and some other important functions. But it is fair to say that the service is developing quite rapidly: they have now added not only the ability to go over several pages and URLs, but also a history of scraped data. Let’s look at these features closely. more…
