Import.io is a big data cloud platform that has the ambitious goal of turning the web into a database. It was founded in March 2012, and a year later it received $1.3M in seed funding from Wellington Partners, Louis Monier and Emmanuel Javal.

So what is unique about this service? Let’s see what import.io is and how it differs from other web scraping software. First I will tell you about the goodies, and then I will mention some shortcomings.

A Cloud-Based Platform

First of all, import.io is a cloud-based platform. This means that you don’t need to run the scraper on your machine, and all your data is kept somewhere in the cloud. This has several advantages:

  • Scraping keeps running even if you turn your computer off
  • You can access your data from any computer connected to the Internet
  • You don’t need to worry about maintaining or scaling the scraping process

Free

Import.io is free, and they intend to keep following the freemium model even if they add some premium features in the future. It seems they don’t merely want to earn money by selling web scraping services, but rather to give people a tool that brings order to web data and makes it available to everyone.

Designed Even for Non-Technical Users

If you want to make web data available to everyone, you need to make your product usable for all kinds of people, regardless of their technical ability. This is how import.io is designed. The whole interface is built on a point-and-click principle. The system tries to “guess” what you need and does it for you. If it misses the point, you can adjust it manually.

Allows Fusing Your Data

Import.io not only scrapes data from the web but also lets you work with that data, connecting one data source with another and thus producing new, valuable, real-time data sets.
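To make the idea of connecting data sources concrete, here is a minimal sketch in plain Python. It is not import.io's own mechanism; the records, the `sku` key and the `fuse` helper are all made up for illustration. It simply joins two scraped lists on a field they share.

```python
# Hypothetical records standing in for two scraped data sources.
products = [
    {"sku": "A1", "name": "Kettle"},
    {"sku": "B2", "name": "Toaster"},
    {"sku": "C3", "name": "Blender"},
]
prices = [
    {"sku": "A1", "price": 24.99},
    {"sku": "B2", "price": 31.50},
]


def fuse(left, right, key):
    """Inner-join two lists of dicts on a shared key (illustrative helper)."""
    # Index the right-hand source by key so each lookup is a dict access.
    index = {row[key]: row for row in right}
    # Keep only rows present in both sources and merge their fields.
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


fused = fuse(products, prices, "sku")
for row in fused:
    print(row)
```

The Blender row is dropped because it has no matching price; whether unmatched rows should be kept is a design choice that depends on the data set you want to end up with.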

Offers an API

If you’re a technical person, you may be interested in integrating the import.io database with other software. It’s possible, since it offers a REST API, allowing you to search and retrieve Objects from the Object Store.

They also offer client libraries for such languages as JavaScript, Mini JS, NodeJS, Java, C#, Python, Objective-C and Ruby.
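If you have never queried a REST API directly, the rough shape looks like the sketch below. Please treat it as an assumption-laden illustration, not import.io's actual API: the base URL, the `q` and `limit` parameters, the bearer-token header and the `results` field are all placeholders I made up, so check the official API documentation for the real endpoints and authentication scheme.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; substitute the one from the service's API docs.
BASE_URL = "https://api.example.com/store/objects"


def build_search_request(query, api_key, limit=10):
    """Build (but do not send) a GET request for a search query."""
    # Parameter names are assumptions, not a documented contract.
    params = urlencode({"q": query, "limit": limit})
    return Request(
        f"{BASE_URL}?{params}",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="GET",
    )


def parse_results(body):
    """Extract result rows from a JSON response body (assumed shape)."""
    return json.loads(body).get("results", [])


request = build_search_request("coffee makers", api_key="YOUR_KEY")
print(request.full_url)

# A made-up response body, used here instead of a live network call.
sample = '{"results": [{"name": "Kettle", "price": 24.99}]}'
print(parse_results(sample))
```

To actually send the request you would pass it to `urllib.request.urlopen` and feed the returned body to `parse_results`. In practice one of the client libraries will likely save you this plumbing.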

Offers Great Support

If you get stuck using import.io, you can contact their support, and they really do answer. I contacted them several times, and they solved my problems promptly.

A fly in the ointment

Well, I have said many positive things about the import.io service. Now it’s time to add a tiny fly to the ointment. Yes, there are shortcomings, of course, and I’ll mention some of them here (it’s up to you to decide how crucial they are).

Not easy to learn

Though this service is designed for people with any amount of technical ability, you still need to learn how to use it. When I tried to master it in a hurry, I got frustrated in many ways, even though I have some experience in web scraping. Later, after some additional study and a few conversations with support, everything was resolved. Of course, it’s not the hardest web scraper to learn, but many things simply might not be evident to you right away.

Still in Beta

Import.io is in beta. This means it still has a lot of things to be finished and polished, and many things may not work as you expect. So use it at your own risk.

Has limitations

If you try to build a scraper for laymen, it’s really hard to design it for scraping everything, though the import.io team does strive for this. Personally, I prefer low-level scraping where you can control what you send to the server and how you process its responses. This gives you the keys to everything.
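By low-level scraping I mean something like the following sketch, written with nothing but the Python standard library. The URL, the header values and the sample HTML are placeholders of my own; the point is only that you choose every detail of the request and of the parsing yourself.

```python
from html.parser import HTMLParser
from urllib.request import Request


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# You decide exactly what goes to the server: URL, method, headers.
# The URL and header values here are placeholders.
request = Request(
    "https://example.com/catalog?page=2",
    headers={"User-Agent": "my-scraper/0.1", "Accept-Language": "en"},
)

# You also decide exactly how the response is processed. A sample page
# stands in for the body that urllib.request.urlopen(request) would return.
sample_html = (
    '<ul><li><a href="/item/1">One</a></li>'
    '<li><a href="/item/2">Two</a></li></ul>'
)
collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```

The price of this control is that cookies, retries, pagination and JavaScript-rendered content all become your own problem.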

Import.io is different. You can’t control every detail of the scraping process there. There are no extra control knobs for fine-tuning or for applying workarounds. That’s why you may face problems when scraping sophisticated websites that use a lot of AJAX or have a complicated layout.

Also, they use a custom web browser for building a scraping model. It’s not perfect yet and may sometimes give you unexpected results.

Doesn’t solve CAPTCHA

As a matter of principle, if a website uses a CAPTCHA to protect its data, they honor it. I agree with that and mention it here just to make you aware.

Conclusion

Import.io is a fast-growing, innovative data scraping platform. To gain beta testers they have made it totally free (at least for now), and you can take advantage of it if you’re ready to stay with them for better or for worse.

If it suits your needs, then you’ve got a great cloud scraper for nothing! If not, feel free to tell them about it.

Also you’re still welcome to add your comments below. Share your experience with using this service; it may be useful to others!