Every web data extraction task consists of at least two steps: content extraction and parsing. We’re happy to introduce scrapetools.com, our free service that helps web-scraper developers in their daily work.
The idea is simple. Each time we create a web scraper, we need to test regular expressions or XPath queries multiple times until we get the desired result. We realized that it would be nice to have an online service that makes this recurring job simpler and more convenient. This is how scrapetools.com was born.
It contains several parts (or stages) that work together:
1. Get the Source HTML
In the first stage, define the source text (HTML, XML or whatever). You may type the text directly in the built-in editor or specify a URL to get it from. After you get the text from the external website, you’ll still be able to edit it.
2. Test Regular Expressions
In the next stage, you may test your regular expressions on the source text that you defined in the previous stage. After you evaluate your regex, you can see the result in three forms: “All Matches”, “All Groups” and “Named Groups”.
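To make the three result forms concrete, here is a minimal sketch using Python's standard `re` module. It is only an illustration of the general idea, not the service's own implementation; the sample HTML, the pattern and the variable names are all assumptions made up for this example.

```python
import re

# Hypothetical source text and pattern, invented for this illustration.
html = '<a href="/item/1">First</a> <a href="/item/2">Second</a>'
pattern = re.compile(r'<a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>')

matches = list(pattern.finditer(html))

# "All Matches": the full text of every match.
all_matches = [m.group(0) for m in matches]

# "All Groups": the captured groups of every match, in order.
all_groups = [m.groups() for m in matches]

# "Named Groups": the captured groups of every match, keyed by name.
named_groups = [m.groupdict() for m in matches]

print(all_matches)
print(all_groups)
print(named_groups)
```

Named groups are usually the most convenient form for scraping code, because each captured value can be referred to by a meaningful key rather than by its position.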
3. Test XPath Queries
In this stage, you may test your XPath queries against the source text that you defined in the first stage. After you execute your query, you can see all the found entries along with their names and paths.
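As a rough illustration of what an XPath test returns, the sketch below runs a query with Python's standard `xml.etree.ElementTree` and lists each found entry with its name and path. It rests on a few assumptions: the sample markup is invented and well-formed, `ElementTree` supports only a limited subset of XPath, and the `path_to` helper is a hypothetical function written for this example. It is not how the service itself works.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed source text invented for this illustration.
source = """<html><body>
<ul>
<li class="item">First</li>
<li class="item">Second</li>
<li>Other</li>
</ul>
</body></html>"""

root = ET.fromstring(source)

# ElementTree elements do not know their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def path_to(element):
    """Return a simple slash-separated tag path from the root to the element."""
    parts = []
    while element is not None:
        parts.append(element.tag)
        element = parent_of.get(element)
    return "/" + "/".join(reversed(parts))

# ElementTree understands only a subset of XPath; this query stays within it.
found = [(el.tag, path_to(el), el.text)
         for el in root.findall(".//li[@class='item']")]

for name, path, text in found:
    print(name, path, text)
```

Real-world HTML is often not well-formed XML, so a production scraper would normally need a more forgiving HTML parser than the strict XML parser used here.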
4. Generate Code
In the last stage, you can see the resulting PHP code that follows all the steps we described above. This may be useful not only for beginners but also for experienced developers who want to save time typing code.
That’s it for now! This service is still under development and improvement, so all your feedback and suggestions are highly appreciated. We really want to know what you think about this service.
Feel free to add any comments here or write to us directly. Thanks!