Selenium NUnit crawler speed-up

I improved the speed of my Selenium link-crawling algorithm by extracting the href URLs of all hyperlinks in a single Eval call, instead of locating each hyperlink individually and querying its href attribute:

string sLinks = selenium.Eval(@"
var s = '', i = 0;
for(i = 0; i < window.document.getElementsByTagName('a').length; i++) {
    s = s + ' ' + window.document.getElementsByTagName('a')[i].href;
}
s;");

// Split on spaces, dropping the empty entry produced by the leading separator.
string[] rgsLinks = sLinks.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

The string array now contains all URLs found in the current page.

Because each Selenium API call travels from the client to the Selenium server and on to the browser, which evaluates it and returns the result back through the server to the client, this single-round-trip approach is far faster than querying each a element's href attribute individually.
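For contrast, the per-link version looks roughly like this: a sketch assuming the Selenium RC C# client, where GetXpathCount and the "locator@attribute" syntax for GetAttribute are standard RC calls, though exact signatures may vary by client version.

```csharp
// Slow variant: one client -> server -> browser round trip per call,
// first to count the links, then one GetAttribute call per link.
int cLinks = (int)selenium.GetXpathCount("//a");
var lstLinks = new List<string>();
for (int i = 1; i <= cLinks; i++)
{
    // RC attribute locators use the "locator@attribute" form.
    lstLinks.Add(selenium.GetAttribute("xpath=(//a)[" + i + "]@href"));
}
```

With N links on the page this costs N+1 round trips, whereas the Eval version above always costs exactly one.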

2 thoughts on “Selenium NUnit crawler speed-up”

  1. Newbie here. Just out of curiosity, is there an advantage to using your version over using window.document.links to gather all the hrefs? Or are they about the same?
