In this article I will show you how easy it is to scrape a website using Selenium WebDriver. I will guide you through a sample project written in C# that uses WebDriver together with the Chrome browser to log in to a test page and scrape text from the private area of the website.

Downloading the WebDriver

First of all, we need to get the latest version of the Selenium Client & WebDriver Language Bindings and the Chrome Driver. Of course, you can download WebDriver bindings for other languages (Java, C#, Python, Ruby), but in this sample project I will use only the C# binding. In the same way, you can use any browser driver, but here I will use Chrome.

After downloading the libraries and the browser driver we need to include them in our Visual Studio solution:

[Screenshot: VS Solution]

Creating the scraping program

In order to use the WebDriver in our program we need to add its namespaces:
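Assuming the standard Selenium .NET bindings, the two namespaces this kind of program typically needs are the following (a sketch, not necessarily the exact set used in the original project):

```csharp
using OpenQA.Selenium;        // core WebDriver types: IWebDriver, IWebElement, By
using OpenQA.Selenium.Chrome; // ChromeDriver, the Chrome-specific driver class
```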

Then, in the main function, we need to initialize the Chrome Driver:
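A minimal sketch of this step, assuming the default ChromeDriver constructor is enough for your setup. The class name Program and the placement of Quit() are my own choices for illustration:

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class Program
{
    static void Main()
    {
        // Starts the chromedriver process and opens a new Chrome window.
        IWebDriver driver = new ChromeDriver();

        // ... the scraping steps go here ...

        // Closes the browser and stops the chromedriver process.
        driver.Quit();
    }
}
```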

This piece of code looks for the chromedriver.exe file. If the file is located in a directory other than the one our program runs from, we need to specify its location explicitly in the ChromeDriver constructor.
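For example, the ChromeDriver constructor has an overload that takes the directory containing the driver executable. The directory below is a made-up placeholder, so substitute the folder where you actually saved chromedriver.exe:

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class Program
{
    static void Main()
    {
        // Hypothetical folder that contains chromedriver.exe
        // (the folder, not the full path to the file).
        string driverDirectory = @"C:\tools\chromedriver";

        IWebDriver driver = new ChromeDriver(driverDirectory);

        // ... the scraping steps go here ...

        driver.Quit();
    }
}
```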

When an instance of ChromeDriver is created, a new Chrome browser will be started. Now we can control this browser via the driver variable. Let’s navigate to the target URL first:
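A sketch of the navigation step. The URL is a placeholder I made up for illustration, and the helper class NavigationStep is hypothetical; only Navigate().GoToUrl() comes from the WebDriver API:

```csharp
using OpenQA.Selenium;

static class NavigationStep
{
    // Placeholder address; replace it with the real login page.
    const string LoginUrl = "https://example.com/login";

    public static void OpenLoginPage(IWebDriver driver)
    {
        // Loads the page in the browser window controlled by the driver.
        driver.Navigate().GoToUrl(LoginUrl);
    }
}
```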

Then we can find the page elements we need in order to log in to the private area of the website:
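One way to sketch this lookup is shown below. The locators ("username", "password", the submit selector) and the LoginForm class are assumptions of mine; inspect the real page to find the ids or selectors that apply to your target site:

```csharp
using OpenQA.Selenium;

// Hypothetical holder for the three elements of the login form.
sealed class LoginForm
{
    public IWebElement UserNameField { get; }
    public IWebElement PasswordField { get; }
    public IWebElement LoginButton { get; }

    public LoginForm(IWebDriver driver)
    {
        // FindElement throws NoSuchElementException if nothing matches.
        UserNameField = driver.FindElement(By.Id("username"));
        PasswordField = driver.FindElement(By.Id("password"));
        LoginButton = driver.FindElement(By.CssSelector("input[type='submit']"));
    }
}
```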

Here we search for the user name field, the password field and the login button, and store them in the corresponding variables. Once we have found them, we can type in the user name and the password and press the login button:
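Typing and clicking could look roughly like this. The helper LoginStep and its parameter names are hypothetical; SendKeys() and Click() are the WebDriver calls that simulate the keyboard and the mouse:

```csharp
using OpenQA.Selenium;

static class LoginStep
{
    public static void SubmitCredentials(
        IWebElement userNameField,
        IWebElement passwordField,
        IWebElement loginButton,
        string userName,
        string password)
    {
        // Type into the fields as a user would.
        userNameField.SendKeys(userName);
        passwordField.SendKeys(password);

        // Clicking the button submits the form and triggers the page load.
        loginButton.Click();
    }
}
```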

At this point the new page will be loaded into the browser, and once it has finished loading we can scrape the text we need and save it to a file:
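A sketch of the scraping step, assuming the text lives in a single element. The id "content" and the helper ScrapeStep are placeholders of mine, not taken from the original project:

```csharp
using System.IO;
using OpenQA.Selenium;

static class ScrapeStep
{
    public static void SaveText(IWebDriver driver, string outputPath)
    {
        // Hypothetical locator for the element that holds the private text.
        IWebElement content = driver.FindElement(By.Id("content"));

        // Text returns the visible text of the element and its children.
        File.WriteAllText(outputPath, content.Text);
    }
}
```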

That’s it! At the end, I’d like to give you a bonus – saving a screenshot of the current page into a file:
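One way to do this is to cast the driver to ITakesScreenshot and write the raw image bytes to disk. I use File.WriteAllBytes here rather than the Screenshot.SaveAsFile method because the SaveAsFile signature differs between Selenium versions; the helper ScreenshotStep is hypothetical:

```csharp
using System.IO;
using OpenQA.Selenium;

static class ScreenshotStep
{
    public static void Save(IWebDriver driver, string outputPath)
    {
        // ChromeDriver implements ITakesScreenshot, so the cast succeeds for it.
        Screenshot screenshot = ((ITakesScreenshot)driver).GetScreenshot();

        // The screenshot data is PNG-encoded, so use a .png file name.
        File.WriteAllBytes(outputPath, screenshot.AsByteArray);
    }
}
```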

The complete program listing
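Putting the steps together, a complete program might look like the sketch below. It is a reconstruction under assumptions, not the original listing: the URL, the element locators, the credentials and the output file names are all placeholders. The implicit wait and the try/finally block are my additions, so that element lookups retry for a few seconds and the browser is closed even if a step fails.

```csharp
using System;
using System.IO;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class Program
{
    static void Main()
    {
        // Starts chromedriver and opens a new Chrome window.
        IWebDriver driver = new ChromeDriver();
        try
        {
            // Addition for robustness: retry element lookups for up to 10 seconds.
            driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(10);

            // Placeholder URL of the login page.
            driver.Navigate().GoToUrl("https://example.com/login");

            // Hypothetical locators for the login form.
            IWebElement userNameField = driver.FindElement(By.Id("username"));
            IWebElement passwordField = driver.FindElement(By.Id("password"));
            IWebElement loginButton = driver.FindElement(By.CssSelector("input[type='submit']"));

            // Placeholder credentials.
            userNameField.SendKeys("admin");
            passwordField.SendKeys("12345");
            loginButton.Click();

            // Hypothetical locator for the text in the private area.
            IWebElement content = driver.FindElement(By.Id("content"));
            File.WriteAllText("result.txt", content.Text);

            // Bonus: save a PNG screenshot of the current page.
            Screenshot screenshot = ((ITakesScreenshot)driver).GetScreenshot();
            File.WriteAllBytes("screenshot.png", screenshot.AsByteArray);
        }
        finally
        {
            // Always close the browser and stop the chromedriver process.
            driver.Quit();
        }
    }
}
```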

You can also download a ready-made project here.

Conclusion

I hope you are impressed with how easy it is to scrape web pages using WebDriver. You can press keys and click buttons just as you would when working with the browser yourself. You don’t even need to understand which HTTP requests are sent or which cookies are stored; the browser handles all of this for you. This makes WebDriver a wonderful tool in the hands of a web scraping specialist.