When working with different scrapers in Python, we often need to run them detached from the main process and monitor their output in real time. Here’s how we do this:

  • subprocess.Popen() launches a script as a separate process and returns without waiting for it to finish.
  • stdout=f redirects the script’s stdout into the open file f.
  • Python’s “-u” option turns buffering off, so freshly written data reaches the file immediately.
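The three points above can be combined into a minimal sketch. The function name launch_scraper and its two path arguments are hypothetical, chosen only for illustration, and sending stderr to the same file is an assumption rather than something the steps above require.

```python
import subprocess
import sys


def launch_scraper(script_path, log_path):
    """Start script_path as a child process and send its output to log_path.

    Hypothetical helper for illustration; both paths are supplied by the caller.
    """
    with open(log_path, "w") as f:
        proc = subprocess.Popen(
            # "-u" switches off buffering of the child's stdout and stderr
            [sys.executable, "-u", script_path],
            stdout=f,                  # redirect the child's stdout into the file
            stderr=subprocess.STDOUT,  # assumption: keep errors in the same file
        )
    # The child holds its own copy of the file handle, so the parent's copy
    # can be closed here. Popen() has already returned; the script keeps running.
    return proc
```

The main process can go on with other work and later call proc.poll() to check whether the scraper is still running, or proc.wait() to block until it finishes.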

We run an OS command that invokes a scraper script in parallel with the main process. subprocess.Popen() suits this well because it starts the script and returns immediately instead of waiting for it to exit. To get freshly scraped data from the file, the script’s output buffering must be off: the second item of the command, “-u”, forces stdout and stderr to be unbuffered. Alternatively, set the environment variable PYTHONUNBUFFERED to a non-empty string. For more options for unbuffered output in Python, check here.
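As a rough sketch of the PYTHONUNBUFFERED alternative together with the real-time monitoring, the code below starts a script with the variable set and then polls the output file for new lines. The names launch_unbuffered and follow, the poll_interval parameter, and the polling approach itself are assumptions made for illustration, not the only way to watch the file.

```python
import os
import subprocess
import sys
import time


def launch_unbuffered(script_path, log_path):
    """Start script_path with PYTHONUNBUFFERED set instead of passing "-u"."""
    # Any non-empty string works; "1" is just a common choice.
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    with open(log_path, "w") as f:
        return subprocess.Popen(
            [sys.executable, script_path],
            stdout=f,
            stderr=subprocess.STDOUT,
            env=env,
        )


def follow(proc, log_path, poll_interval=0.2):
    """Yield complete lines from log_path as the child process writes them."""
    pending = b""
    with open(log_path, "rb") as log:
        while True:
            # Check the exit status before reading, so that output written
            # just before the child exits is still picked up by this read.
            finished = proc.poll() is not None
            chunk = log.read()
            if chunk:
                pending += chunk
                # Keep any trailing partial line until its newline arrives.
                *lines, pending = pending.split(b"\n")
                for line in lines:
                    yield line.decode(errors="replace").rstrip("\r")
            if finished:
                if pending:
                    yield pending.decode(errors="replace")
                return
            if not chunk:
                time.sleep(poll_interval)
```

A caller might loop with for line in follow(proc, log_path): and handle each scraped line as it arrives. Without unbuffered output, the lines would typically show up in large delayed batches rather than one by one.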