In this post we’d like to share an interview with a young service called ScrapeHero. We’ve interviewed Tony Paul (marketing head) and this is what he had to say.

Q: Why did you start ScrapeHero? Is it a pure business goal?

A: ScrapeHero was started to provide the next generation and evolved incarnation of scraping services.

The landscape of scraping services prior to DaaS covered a huge spectrum: freelancers, scraping consulting companies of 2-10 people, desktop scraping software (download and do it yourself), web-based software (self-service – no real download needed but no human interaction from the provider) and, lately, venture-funded companies providing free browser-based plugin software coupled with an infrastructure for self-service data gathering.

Since scraping anything but the easiest sites is not easy using self-service tools, there was an obvious need for a fully managed service that fulfilled the “X as a Service” (XaaS) paradigm for scraping.

ScrapeHero was born to fill that Data as a Service (DaaS) provider void that so desperately needed to be filled.
We just provide clean, unadulterated data without our customers having to worry about software, servers, bandwidth, scalability, XPaths, CSS selectors, etc.

Q: What is the future of web scraping in your view?

A: The future of web scraping, as we see it, is evolving from “just get me the data” to “help me put this data to use” and “give me insights into the data”. We are seeing a lot of need in this space and we are furiously enhancing our platform to meet exactly these needs from our customers.

Q: What impact do you think the anti-scraping technology will have on the scraping industry?

A: Anti-scraping technologies obviously have a market that feeds on the fear of “it is my data and I don’t want anyone to have it”. This fear is rooted in traditional business protectionism.

However, the landscape is changing as younger people are running companies, as evidenced by Tesla Motors dropping a bombshell by providing their technology openly. Prior to that, many companies shared a lot of data publicly, including salaries and detailed blogs about their operations, and of course released their code as open source. But the Tesla news, coming from a traditional industry, was really big news. You can’t steal what’s free and freely available – pure genius in our view. ScrapeHero also open sources its code in the same spirit.

Open data is obviously the future, but the fear will never go away.

The impact on our industry will be minimal to negligible because every effort has a price – some companies will keep paying exorbitant prices for this protectionism and others will simply not play this cat and mouse game. Apple and jailbreaking is a classic example of blocks and breaks, and for Apple, which has probably the most cash in the world, it hasn’t yielded much success.