FMiner is another data extraction tool, one that has already been on the market for 5 years. Let’s see which features have allowed it to survive the tough competition in the web scraping world.
The software's most distinctive features are that it shows the scraping process visually, as a diagram, and lets you record macros by navigating the web in its internal browser.
Because it is written in Python, FMiner runs on both Windows and Mac OS machines, and because it scrapes through its internal browser, it can also serve as a macro player that simulates user activity on the web.
|Easy to learn|
|Customer support||forum, email; custom project development|
|Price||$168 – $248|
|Trial period/Free version||15 days trial|
|OS (Specifications)||Win, Mac|
|Data Export formats||Excel, CSV, XML, SQLite and other databases (Oracle, MS SQL, MySQL, Postgres, Access, ODBC)|
|Multi-thread||yes; running several web browsers simultaneously|
|API||supports custom Python code|
Let’s see how to scrape with FMiner. The main screen of the application is divided into four sections:
- Macro Designer
- Web Browser
- Action Attributes
- Logs, Data, Selections and Variables
The macro designer section displays a project flow chart in which each action is represented by its own element. You can build the flow chart manually or by recording your actions in the browser on the right. You can also rearrange and remove elements as needed.
Here is a flowchart of the project that scrapes IP and Cookie information from our testing ground page, adds this information into the table named “output”, then clears cookies and scrapes this page again, adding the scraped values as a second row into the same table:
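Read as plain code, the flow described above amounts to roughly the following. This is only a sketch of the logic, not something FMiner generates: `FakeBrowser`, `run_flow` and the returned values are hypothetical stand-ins, and no real page is requested.

```python
class FakeBrowser:
    """Hypothetical stand-in for a browser session; returns canned values."""

    def __init__(self):
        self.cookie = None
        self.visits = 0

    def scrape_page(self):
        # A real scraper would load the page here and read the IP and cookie.
        self.visits += 1
        if self.cookie is None:
            # Pretend the site issues a fresh cookie when none is stored.
            self.cookie = f"visit-{self.visits}"
        return {"ip": "203.0.113.7", "cookie": self.cookie}

    def clear_cookies(self):
        self.cookie = None


def run_flow(browser, passes=2):
    """Scrape, store a row, clear cookies, and scrape again."""
    output = []  # plays the role of the "output" table
    for attempt in range(passes):
        output.append(browser.scrape_page())
        if attempt < passes - 1:
            browser.clear_cookies()
    return output


rows = run_flow(FakeBrowser())
```

Because the cookies are cleared between the two passes, the second row carries a different cookie value than the first, which is what the project is meant to demonstrate.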
Here are the attributes of the IP element of the output table:
As you can see from this picture, the data element is defined by an XPath expression, which you can edit by hand or with auxiliary functions such as Select target (on the page), Relative selection, or Expand/Shrink selection. You can also work with groups of similar page elements (see the last button under the Target select title).
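For readers who have not used XPath before, here is a minimal sketch of selecting an element by expression. It uses Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath, and it says nothing about how FMiner evaluates expressions internally. The markup, the expression and the `select_text` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for the test page.
PAGE = """
<html>
  <body>
    <div id="info">
      <span class="ip">203.0.113.7</span>
      <span class="cookie">session=abc123</span>
    </div>
  </body>
</html>
"""


def select_text(page, xpath):
    """Return the text of the first element matching xpath, or None."""
    root = ET.fromstring(page)
    node = root.find(xpath)
    if node is None:
        return None
    return (node.text or "").strip()


# Select the span with class "ip" inside the div whose id is "info".
ip = select_text(PAGE, ".//div[@id='info']/span[@class='ip']")
missing = select_text(PAGE, ".//span[@class='missing']")
```

Shortening the expression to `.//div[@id='info']` would select the enclosing block instead, which is presumably similar in spirit to what an expand-selection function does.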
In the Extract type section you can define what type of data you expect to extract from the page. FMiner supports the following data types: text, html, dom attribute, page attribute, download, regular expression and static data.
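To make the difference between some of these extract types concrete, the sketch below applies several of them to the same element. It is an illustration in plain Python, not FMiner's implementation; the `extract` function, its type names and the sample link are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical element to extract from.
SNIPPET = '<a href="/item/42" title="Blue widget">Blue widget <b>$19.99</b></a>'


def extract(node, extract_type, arg=None):
    """Return one value from node according to the chosen extract type."""
    if extract_type == "text":
        # All text inside the element, with nested tags stripped.
        return "".join(node.itertext()).strip()
    if extract_type == "html":
        # The element serialized back to markup.
        return ET.tostring(node, encoding="unicode")
    if extract_type == "attribute":
        # arg is the attribute name, e.g. "href".
        return node.get(arg)
    if extract_type == "regex":
        # arg is a pattern applied to the element's text.
        match = re.search(arg, "".join(node.itertext()))
        return match.group(0) if match else None
    if extract_type == "static":
        # arg is returned unchanged, regardless of the page.
        return arg
    raise ValueError(f"unknown extract type: {extract_type}")


link = ET.fromstring(SNIPPET)
text_value = extract(link, "text")
href_value = extract(link, "attribute", "href")
price_value = extract(link, "regex", r"\d+\.\d+")
```

The same element yields the visible label, the link target, or just the price, depending on which type is chosen.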
When you click the Run button (on the toolbar) FMiner starts to execute your flowchart in the browser (on the right) and log its actions in the log window (right under the browser):
After it is finished you can see the resulting table in the data window:
Note that the program doesn’t delete old data from the tables, so you can collect the data extracted during several scraping sessions in the same data table. You can also edit or insert data rows manually, or even import external data into the table from any XLS or CSV file. As soon as you’re satisfied with the result, you can export it to an external file or database (see the Export button on the toolbar).
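The accumulate-then-export idea can be sketched with Python's standard `csv` and `sqlite3` modules. This is not FMiner's export code; the column names, the sample rows and the helper functions are assumptions made for the example.

```python
import csv
import io
import sqlite3

COLUMNS = ["ip", "cookie"]  # assumed columns of the "output" table


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_sqlite(rows, connection):
    """Append the rows to an "output" table, creating it if needed."""
    connection.execute("CREATE TABLE IF NOT EXISTS output (ip TEXT, cookie TEXT)")
    connection.executemany(
        "INSERT INTO output (ip, cookie) VALUES (:ip, :cookie)", rows
    )
    connection.commit()


# Rows from two separate sessions pile up in the same table (made-up values).
output = []
output.extend([{"ip": "203.0.113.7", "cookie": "visit-1"}])  # session 1
output.extend([{"ip": "203.0.113.7", "cookie": "visit-2"}])  # session 2

csv_text = to_csv(output)
db = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
to_sqlite(output, db)
row_count = db.execute("SELECT COUNT(*) FROM output").fetchone()[0]
```

Because nothing is cleared between sessions, the export contains every row gathered so far.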
FMiner has undoubtedly earned its place, and some customers will certainly find it very useful for their web scraping tasks. In such a brief review I didn’t have the opportunity to cover all the features of this scraper, but it seems that after 5 years of development it has been polished enough to solve many sophisticated web scraping tasks. I didn’t cover regex support, post-extraction data adjustment, captcha solving and other goodies hidden in this scraper.
I definitely recommend that you try it, but unfortunately you have only 15 days to make up your mind. I would suggest that the developer consider switching to a freemium model, which gives users more freedom, making them feel more comfortable and, in turn, more loyal.
As always, you’re welcome to ask any questions or share your experience with FMiner in the comments below.