Scraping for Journalists by Paul Bradshaw is a handy book that helps non-programmers master basic scraping techniques using online scraping tools. The book does not, and cannot, cover all the techniques and problems that arise in practical, scheduled web extraction for business; instead, it guides ordinary readers through getting and refining open data.

  • This edition is a handy guide with clear how-tos and an introduction to web page literacy, including many basic programming terms. Ch. 1 is free to download.
  • It introduces you to a range of scraping techniques – from very simple ones, which are no more complicated than a spreadsheet formula, to more complex challenges.
  • The edition will be adjusted and improved based on readers' comments, making the next iteration more useful and up to date.
  • The techniques discussed can also be found in other online sources, such as Google Docs documentation, walk-throughs and others. This book gathers the latest free techniques for a journalist or non-programmer and describes the available tools (Google Spreadsheet, Google Refine, the Scraper extension for Google Chrome, OutWit Hub and others).
  • Scraping for Journalists is available now in 11 chapters, with about 20 chapters promised.

I recommend this book to those who want to broaden their view of non-professional web scraping and to those who want to improve their skills in daily data extraction for individual purposes.