This is an independent review of “Kimono Labs”, a brand-new startup in the web data extraction field. It was born on 16 January 2014 in California. At the moment it looks like a newborn baby: not mature, but pretty.

The Kimono service is shutting down on February 29, 2016. You might benefit from a lightweight version of Kimono that runs on a desktop PC and is available for download.
Let’s get a bird’s eye view right now! I will use the testing ground page in my examples.

The Main Idea

The main idea of the Kimono Labs service is very simple: “to turn a web site into API”. Though the idea itself is not new (we have already written about the Web Scrape Master service, which turns “any” website into JSON), the implementation looks very modern and promising. You can not only generate raw data (in JSON, CSV or RSS format) from a web page, but even instantly create a web application (hosted on kimonolabs.com, of course) to make that data available on the web!

The Workflow

The workflow is quite straightforward. When you want to “kimonify” (this is how they refer to it) a web page, you simply click a special bookmark and then select all the pieces of data you are interested in:

Kimono Extractor View

Note, though, that you can’t follow links on the page when you’re selecting the items.

The items you selected become collections

Kimono Data Model View

which may be represented in JSON, CSV or RSS format

Kimono Raw Data View

or they can even become a web application:

Kimono Application
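
To make the data model more concrete, a collection can be pictured as a list of records with named properties. The minimal Python sketch below renders one such collection as JSON and as CSV. The property names, the sample values and the `collection1` key are assumptions made for illustration; this is not Kimono's actual output.

```python
import csv
import io
import json

# Hypothetical collection: the property names ("title", "price") and the
# values are made up for illustration, not taken from a real Kimono API.
collection = [
    {"title": "Widget", "price": "$9.99"},
    {"title": "Gadget", "price": "$24.50"},
]


def to_json(rows):
    """Serialize the collection as a JSON document."""
    # "collection1" is an assumed key name for the first collection.
    return json.dumps({"collection1": rows}, indent=2)


def to_csv(rows):
    """Serialize the same rows as CSV, one column per property."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(collection))
    print(to_csv(collection))
```

The same rows feed both serializers, which is the point of the data model: you select the items once and choose the output format afterwards.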

The Implementation

As I mentioned before, all that you need to do to start building your Kimono Labs API is to click the “kimonify” bookmark on your browser’s bookmarks bar:

Kimono Labs Bookmark

Then you will be redirected to the Kimono website, which shows your target web page with a special toolbar added at the top:

Kimono Toolbar

As you can see from this picture, you can manage data types (parameters) of your Kimono API using this toolbar. To specify where a parameter is located on the web page you simply need to point there:

Kimono Labs Page View

Using the × and √ buttons, you can teach the Kimono scraper which series of elements you need and which are unnecessary. Selecting a portion of text may also help the Kimono scraper understand what exactly you need.

After you are done building your API, you will be given a special URL where you can access it. Kimono provides sample PHP code that extracts data using this API, as well as examples in jQuery, Python, Ruby and Node.js. You can even get HTML code for embedding the result into your web site.
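
For readers who prefer Python, consuming such an API URL could be sketched roughly as follows. This is not the code Kimono generated. The endpoint layout (`/api/<id>?apikey=<key>`), the `results` and `collection1` keys, and the helper names `build_url`, `fetch` and `rows` are all assumptions for illustration, and the sample payload is invented.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the endpoint layout below is illustrative, not confirmed.
API_BASE = "https://www.kimonolabs.com/api/"


def build_url(api_id, api_key):
    """Build the request URL for a hypothetical API id and key."""
    query = urllib.parse.urlencode({"apikey": api_key})
    return f"{API_BASE}{urllib.parse.quote(api_id)}?{query}"


def fetch(api_id, api_key, timeout=10):
    """Download the API response and decode it from JSON."""
    url = build_url(api_id, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def rows(payload, collection="collection1"):
    """Return the records of one collection, or an empty list."""
    # Assumed response shape: {"results": {"collection1": [...]}}
    return payload.get("results", {}).get(collection, [])


# Invented payload so the parsing step can be tried without network access.
SAMPLE_PAYLOAD = {
    "results": {
        "collection1": [
            {"title": "Widget", "price": "$9.99"},
            {"title": "Gadget", "price": "$24.50"},
        ]
    }
}

if __name__ == "__main__":
    for row in rows(SAMPLE_PAYLOAD):
        print(row["title"], row["price"])
```

Keeping the download (`fetch`) separate from the parsing (`rows`) means the parsing can be exercised against a saved response when the service is unreachable.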

OK, I think you now have a general impression of what the Kimono Labs web scraper is and how it works. Let’s look at the restrictions now.

Restrictions of Kimono Labs Scraper

Disclaimer: I understand that this service is quite new and is under active development, so some points mentioned in this section may no longer be accurate by the time you read this article.

In its current implementation, this service can only be used to scrape pages with a good structure and static content (no AJAX). You can neither follow links nor fill out forms with Kimono. Though they have already implemented some POST and GET parameters, it’s still not enough for serious web scraping.

When I say that Kimono can only scrape pages with a good structure, I mean that you can only use CSS selectors and regular expressions to extract your data from the page (this is available in the Advanced mode of the Data Model View). If you can’t do it by these means, it is probably not possible to do it at all.
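
To illustrate what “CSS selectors and regular expressions” means in practice, here is a minimal Python sketch that first picks elements by tag and class (a stand-in for a selector such as `span.price`) and then narrows the text with a regular expression. It uses only the standard library and is not Kimono's implementation. The class `ClassTextExtractor`, the helper names and the sample HTML are hypothetical.

```python
import re
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <tag class="..."> element.

    A tiny stand-in for a CSS selector of the form `tag.css_class`.
    """

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self._depth = 0      # > 0 while inside a matching element
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if self._depth > 0:
            # Track nested elements with the same tag name.
            if tag == self.tag:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if self._depth > 0 and tag == self.tag:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.matches[-1] += data


def select_text(html, tag, css_class):
    """Return the stripped text of all elements matching tag.css_class."""
    parser = ClassTextExtractor(tag, css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.matches]


PRICE_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_prices(html):
    """Select span.price elements, then keep only the numeric part."""
    prices = []
    for text in select_text(html, "span", "price"):
        match = PRICE_RE.search(text)
        if match:
            prices.append(match.group())
    return prices


# Invented sample page with a regular, static structure.
SAMPLE_HTML = """
<ul>
  <li><span class="name">Widget</span> <span class="price">Price: $9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">Price: $24.50</span></li>
</ul>
"""

if __name__ == "__main__":
    print(select_text(SAMPLE_HTML, "span", "name"))
    print(extract_prices(SAMPLE_HTML))
```

This works only because every item on the sample page shares the same markup. On a page where the price sits in differently structured elements from item to item, neither the selector nor the regular expression has anything stable to hold on to, which is the limitation described above.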

Also, there is no post-scrape data processing in the Kimono scraper. For example, there is no way to distill your data after it is extracted from the page. It simply goes to the output as is.
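
In practice this means any clean-up has to happen in your own code after you receive the output. A minimal Python sketch of such a post-processing step, assuming records with hypothetical `title` and `price` properties, might look like this:

```python
from decimal import Decimal, InvalidOperation


def clean_price(raw):
    """Turn a scraped string such as ' $1,299.00 ' into a Decimal, or None."""
    digits = raw.strip().lstrip("$").replace(",", "")
    try:
        return Decimal(digits)
    except InvalidOperation:
        return None


def distill(rows):
    """Drop rows without a usable price and normalize the remaining ones."""
    cleaned = []
    for row in rows:
        price = clean_price(row.get("price", ""))
        if price is None:
            continue
        cleaned.append({"title": row.get("title", "").strip(), "price": price})
    return cleaned


# Invented raw output, the way it might arrive "as is" from a scraper.
RAW_ROWS = [
    {"title": " Widget ", "price": " $1,299.00 "},
    {"title": "Broken", "price": "N/A"},
]

if __name__ == "__main__":
    print(distill(RAW_ROWS))
```

The sketch only handles dollar-prefixed prices with comma separators; other currencies or formats would need their own rules.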

Verdict

Kimono Labs has a great interface and usability (seriously, I like it) and is good for periodic scraping of plain, simple and static web pages. It may be a good fit for scraping stocks, weather forecasts, news and other similar types of data available at static addresses. It is designed to periodically scrape the same data and show it somewhere in another form (by the way, if you need this on your desktop, you can also do it with my free Handy Web Extractor). But if you need to scrape something more complicated, you will probably need another scraper.

Update: they have added some useful features (including pagination and crawling).