web scraping

Choosing affordable residential proxies for web scraping

Proxies are an integrated part of most major web scraping and data mining projects. Without them, data collection becomes sloppy and biased. This is why it’s essential to know how to find the best affordable proxies for any web scraping project.

One of the best proxy types you could use for scraping is residential proxies. In this post, you’ll learn what they are, how they are priced and what to look for before committing your project’s budget.

more…

ScrapingBee, an API for web scraping

scrapingbee_small logo

The web is becoming increasingly difficult to scrape. There are more and more websites using single page application frameworks like Vue.js / Angular.js / React.js and you need to use headless browsers to extract data from those websites.

Using headless Chrome on your local computer is easy. But scaling to dozens of Chrome instances in production is a difficult task. There are many problems, you need powerful servers with plenty of ram, you’ll get into random crashes, zombie processes …

more…

Octoparse: how to extract GPS coordinates from Google Maps

octoparse-logoHave you ever thought you could make money by knowing how many restaurants there are in a square mile? There is no free lunch, however, if you know how to use Google Maps, you can extract and collect restaurants’ GPS’s and store them in your own database. With that information in hand and some math calculations, you are off to creating a big data online service.

Octoparse - Black Friday November 2019 more…

The present trends in web scraping tools

Recently I got a question from one of the blog readers. After I replied to it, I decided to share it with a wider audience.
Question:

Hi,

I found your scraping.pro site and found it very helpful, then realized the web scraper solutions rating was from 2014.  What is the best solution for today?   I have lots of sites I need to scrape, mainly search then drill-down sites.   I would like to be able to schedule the scraping to run on a daily basis.  Is there a direction you could point me?  I’m a seasoned developer by trade but am seeing all these point and click solutions (e.g. import.io) and am wondering if I should stick with Node.JS or .NET or if I should investigate some of these GUI scrapers of today.

more…

Scraping with free or paid proxies – what is the difference?

Anything free always sounds appealing. And we are often ready to go an extra mile to avoid expenses if we can. But is it a good idea to choose the free option when it comes to using proxies for data scraping? Or should you stick to the paid ones for better results?

Let’s weigh all the pros and cons to see why you should consider using residential IP providers like Infatica, Luminati, NetNut, Geosurf and others.

more…

A revolutionary web scraping software to boost your business

octoparse-web-scraping-templatesIf you were an Amazon seller, would you want to know the listing price of a product of all competitors? Since you don’t have direct access to the Amazon database, you are out of luck and have to browse and click through every listing in order to construct a table of sellers and prices. A web scraping tool comes in handy. It automatically downloads your desired information such as product name, seller’s name, price, etc. However, web scraping that requires coding skill can be painful for professionals in IT, SEO, marketing, e-commerce, real estate, hospitality, etc.

It seems beyond one’s job description if he/she needs to learn how to code in order to obtain certain useful data from the web. For example, I have a friend who graduated in Mass Communication and works as a content marketer. She wants to scrape some data from the web, so she decided to learn Python herself. It took her two weeks to come up with a page of messy codes. Not only did she waste time on learning Python, but she also lost the time she could have used for doing her real work. more…

Crawling web pages with Netpeak Spider in conjunction with NetNut and GeoSurf proxies

NetpeakSpider-logo-owlAgreed, it’s hard to overestimate the importance of information – “Master of information, master of situation”. Nowadays, we have everything we need to become a “master of situation”. We have all the needed tools like spiders and parsers that can scrape various data from websites. Today we will consider scraping Amazon with a web spider equipped with proxy services. more…

JavaScript rendering library for scraping javascript sites

logo-js-rendering-libraryCan you imagine how many scraping instruments are at our service? Though it has a long history, scraping has at last become a multi-lingual and simple approach. Unfortunately, there is a list of non-trivial tasks which can’t be resolved in a snap.

One of these tasks is scraping javascript sites, those that output data using JavaScript. Facing this task, classic scrapers (not all of them though) ignore JS-data and continue their own life-cycle. However, when this little defect becomes a big trouble, developers all over the world take measures. And they did it! Today we consider one of the most awesome tools which scrapes JS-generated data – Splash. more…

Back to top