This is an independent review of “Kimono Labs”, a brand-new startup in the web data extraction space. It was born on 16 January 2014 in California. At the moment it looks like a newborn baby: not mature, but pretty.
The Main Idea
The main idea of the Kimono Labs service is very simple: “to turn a web site into an API”. Though the idea itself is not new (we have already written about the Web Scrape Master service, which turns “any” website into JSON), the implementation looks very modern and promising. You can not only generate raw data (in JSON, CSV or RSS format) from a web page, but also instantly create a web application (hosted on kimonolabs.com, of course) to make that data available on the web!
The workflow is quite straightforward. When you want to “kimonify” (this is how they refer to it) a web page, you simply click a special bookmark and then select all the pieces of data you are interested in:
The items you selected become collections
which may be represented in JSON, CSV or RSS format
or they can even become a web application:
As I mentioned before, all that you need to do to start building your Kimono Labs API is to click the “kimonify” bookmark on your browser’s bookmarks bar:
Then you will be redirected to the Kimono website which will show your target web page with a special toolbar added at the top:
As you can see from this picture, you can manage data types (parameters) of your Kimono API using this toolbar. To specify where a parameter is located on the web page you simply need to point there:
Using the × and √ buttons, you can teach the Kimono scraper which series of elements you need and which are unnecessary. Selecting a portion of text may also help the Kimono scraper understand exactly what you need.
After you are done with building your API, you will be given a special URL where you can access it. Here is an example of PHP code that extracts data using this API:
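// Kimono API endpoint; the API key is passed as a query parameter
$request = "http://www.kimonolabs.com/api/a6892yre?apikey=8e1d02070887f01df10ce90b7bffd3f1";
// Fetch the raw JSON response body
$response = file_get_contents($request);
// Decode the JSON into an associative array
$results = json_decode($response, TRUE);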
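For comparison, here is a minimal sketch of the same request in Python. It is not the snippet Kimono generates; the function names `build_api_url` and `fetch_results` are my own, and the API id and key are simply the placeholder values from the PHP example above, so the endpoint may not respond.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the PHP example; everything else here is illustrative.
API_BASE = "http://www.kimonolabs.com/api/"


def build_api_url(api_id, api_key):
    """Build the endpoint URL: API id in the path, key as a query parameter."""
    return API_BASE + api_id + "?" + urlencode({"apikey": api_key})


def fetch_results(api_id, api_key, timeout=10):
    """Fetch the endpoint and decode its JSON body into Python objects."""
    with urlopen(build_api_url(api_id, api_key), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_results("a6892yre", "8e1d02070887f01df10ce90b7bffd3f1")` would then play the role of the three PHP lines: request the URL, read the body and decode the JSON.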
Kimono also provides examples in jQuery, Python, Ruby and Node.js. You can even get HTML code to embed in your web site and get something cool like this:
OK, I think you now have a general impression of what the Kimono Labs web scraper is and how it works. Let’s look at the restrictions now.
Restrictions of Kimono Labs Scraper
In its current implementation, this service can only be used to scrape pages with a good structure and static content (no AJAX). You can neither follow links nor fill out forms with Kimono. Though they have already implemented some POST and GET parameters, that is still not enough for serious web scraping.
When I say that Kimono can scrape pages with a good structure only, I mean that you can only use CSS selectors and regular expressions to extract your data from the page (this is available in Advanced mode of data Mode View). If you can’t do it by these means, it is probably not possible to do it at all.
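To make the "CSS selector plus regular expression" idea concrete, here is a rough sketch using only the Python standard library. It is not how Kimono works internally: the sample HTML, the class names and the `extract` helper are all invented for illustration, and the class lookup only mimics a simple selector such as `.price`.

```python
import re
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element carrying a given class (like `.name`)."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0  # greater than 0 while inside a matching element
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1  # nested element inside a match
        elif self.class_name in classes:
            self.depth = 1
            self.items.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.items[-1] += data


def extract(html, class_name, pattern):
    """Select elements by class, then keep the first regex group found in each."""
    parser = ClassTextCollector(class_name)
    parser.feed(html)
    regex = re.compile(pattern)
    matches = (regex.search(text) for text in parser.items)
    return [m.group(1) for m in matches if m]


# Hypothetical, well-structured page: every record uses the same classes.
page = """
<ul>
  <li><span class="title">Widget</span> <span class="price">Price: $19.99</span></li>
  <li><span class="title">Gadget</span> <span class="price">Price: $5.00</span></li>
</ul>
"""
prices = extract(page, "price", r"\$(\d+\.\d{2})")
```

This only works because every record on the page shares the same markup. On a page without such regularity there is nothing for the selector to latch onto, which is the limitation described above.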
Also, there is no post-scrape data processing in the Kimono scraper. For example, there is no way to clean or filter your data after it is extracted from the page; it simply goes to the output as is.
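In practice this means any cleanup has to happen on your side, after you fetch the output. A minimal sketch of such a step in Python follows. The `results`/`collection1` layout of the decoded response, the field names and the `clean_rows` helper are assumptions made for illustration, not a documented Kimono format.

```python
def clean_rows(rows):
    """Trim whitespace in every field and turn price strings into floats."""
    cleaned = []
    for row in rows:
        item = {key: value.strip() for key, value in row.items()}
        if "price" in item:
            # "$1,299.00" -> 1299.0
            item["price"] = float(item["price"].replace("$", "").replace(",", ""))
        cleaned.append(item)
    return cleaned


# Hypothetical decoded response; the structure below is an assumption.
response = {
    "results": {
        "collection1": [
            {"title": "  Widget ", "price": "$1,299.00"},
            {"title": "Gadget\n", "price": " $5.00 "},
        ]
    }
}
rows = clean_rows(response["results"]["collection1"])
```

The scraped strings arrive exactly as they appear on the page, so trimming and type conversion like this are left to the consumer of the API.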
Kimono Labs has a great interface and usability (seriously, I like it) and is good for periodic scraping of plain, simple and static web pages. It may be a good fit for scraping stocks, weather forecasts, news and other similar types of data available at static addresses. It is designed to periodically scrape the same data and show it somewhere else in another form (by the way, if you need this on your desktop, you can also do it with my free Handy Web Extractor). But if you need to scrape something more complicated, you will probably need another scraper.
Update: they have added some useful features (including pagination and crawling).