Recently I decided to use PythonAnywhere for running Python scripts against JavaScript-heavy websites.

Originally I tried to use the dryscrape library, but I could not get it to work. The helpful support team explained why: “…unfortunately dryscrape depends on WebKit, and WebKit doesn’t work with our virtualisation system.”

A headless browser is by definition a web browser without a graphical user interface (GUI).

So they directed me to the Selenium + Firefox bundle, as described in this post.
In short, I installed pyvirtualdisplay, a Python wrapper for Xvfb (X virtual framebuffer), which runs a display inside a virtual framebuffer. Once Xvfb is started, Firefox renders into that in-memory display instead of a physical screen. This is how a headless browser is simulated.
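
To make the idea concrete, here is a minimal sketch of wrapping Firefox in a virtual display. It assumes pyvirtualdisplay, Selenium, Firefox and Xvfb are all available on the machine; the function name fetch_rendered_html and the 800x600 screen size are my own illustrative choices, not something prescribed by either library.

```python
def fetch_rendered_html(url, size=(800, 600)):
    """Load url in Firefox inside an Xvfb display and return the rendered HTML.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    # Third-party imports are kept local so this file can still be loaded
    # on a machine where pyvirtualdisplay or selenium is not installed.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    # visible=0 asks pyvirtualdisplay for an invisible (Xvfb) display.
    display = Display(visible=0, size=size)
    display.start()
    try:
        # Firefox now renders into the in-memory display.
        browser = webdriver.Firefox()
        try:
            browser.get(url)
            # page_source holds the DOM after JavaScript has run.
            return browser.page_source
        finally:
            browser.quit()
    finally:
        # Always shut down Xvfb, even if the browser failed to start.
        display.stop()
```

Calling fetch_rendered_html("http://example.com") should return the page markup as a string, provided the browser and Xvfb start correctly.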

To install pyvirtualdisplay, run pip in a Bash console, for example: pip install --user pyvirtualdisplay

From Wikipedia: Xvfb or X Virtual FrameBuffer is a display server implementing the X11 display server protocol. In contrast to other display servers, Xvfb performs all graphical operations in memory without showing any screen output. From the point of view of the client, it acts exactly like any other X display server, serving requests and sending events and errors as appropriate. However, no output is shown. This virtual server does not require the computer it is running on to have a screen or any input device. Only a network layer is necessary.
“Headless” servers can use a virtual display like Xvfb to spoof apps like Firefox into running if there’s no real screen for them to actually be displayed on.

Now, with Selenium and Firefox in place and the display running inside the X virtual framebuffer, I composed a simple Python scraper based on Corey Goldberg’s code. Selenium and BeautifulSoup come preinstalled on PythonAnywhere.
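
The parsing half of such a scraper might look roughly like the sketch below. It assumes the rendered markup has already been obtained as a string (for instance from Selenium's page_source) and that BeautifulSoup 4 is installed; the helper name extract_links and the choice to collect links are mine, for illustration only, and not taken from Corey Goldberg’s code.

```python
def extract_links(html):
    """Return (text, href) pairs for every anchor in html that has an href.

    Hypothetical helper for illustration; not part of any library.
    """
    # Local import so the file can be loaded even where bs4 is missing.
    from bs4 import BeautifulSoup

    # "html.parser" is the parser bundled with Python, so no extra
    # dependency such as lxml is needed.
    soup = BeautifulSoup(html, "html.parser")

    # href=True skips anchors that have no href attribute at all.
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]
```

Because the function only takes a string, it can be tried on a saved page before wiring it up to the browser.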