Selenium NUnit crawler speed-up

I improved the speed of my Selenium link crawling algorithm by directly extracting the href URLs of all hyperlinks, instead of retrieving the hyperlinks by ID and querying their href attributes:

string sLinks = selenium.Eval(@"
    var s = '', i = 0;
    var links = window.document.getElementsByTagName('a');
    for(i = 0; i < links.length; i++) {
        s = s + ' ' + links[i].href;
    }
    s;");

string[] rgsLinks = sLinks.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

The resulting string array contains every URL found on the current page.
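One detail worth noting: since the Eval script appends a space before every href, the raw result starts with a separator, and a plain Split(' ') would yield an empty first element. A small standalone sketch (with hypothetical sample URLs standing in for the DOM):

```csharp
using System;

class SplitDemo
{
    static void Main()
    {
        // Mimic the Eval script: every href is appended after a space,
        // so the accumulated string begins with one.
        string[] hrefs = { "http://a.example/", "http://b.example/x" };
        string sLinks = "";
        foreach (string href in hrefs)
            sLinks = sLinks + " " + href;

        // A plain Split(' ') would return an empty first entry;
        // StringSplitOptions.RemoveEmptyEntries drops it.
        string[] rgsLinks = sLinks.Split(new[] { ' ' },
            StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine(rgsLinks.Length); // prints 2
    }
}
```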

Every Selenium API call makes a full round trip: the client passes it to the Selenium server, the server forwards it to the browser, the browser evaluates it, and the result travels back the same way. Collecting all hrefs in a single Eval call therefore costs one round trip instead of one per link, which makes this approach far faster than querying each a.href attribute individually.
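For contrast, the per-element version would look roughly like this. GetXpathCount and GetAttribute are standard Selenium RC client calls, but the XPath locators here are illustrative assumptions, not taken from the original crawler:

```csharp
// One round trip just to count the anchors...
decimal linkCount = selenium.GetXpathCount("//a");

var links = new System.Collections.Generic.List<string>();
for (int i = 1; i <= linkCount; i++)
{
    // ...plus one full client -> server -> browser round trip per anchor
    links.Add(selenium.GetAttribute("xpath=(//a)[" + i + "]@href"));
}
```

With n links on a page, this incurs n + 1 round trips where the Eval version needs one.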

2 Responses to Selenium NUnit crawler speed-up

  1. Bob says:

    Newbie here. Just out of curiosity, is there an advantage to using your version over using window.document.links to gather all the hrefs? Or are they about the same?

  2. devio says:

    blame my inexperience in JavaScript and DOM…
