Working around the focus-click problem in Selenium

Sometimes Selenium can lose browser focus when running tests, which can cause timeout errors when testing click events in a web application. Here I discuss a workaround I used to get the acceptance test suite on a project to work reliably.

The problem: wonky tests

I had an interesting problem at work a while ago: the Selenium tests in a project would time out, and hence fail, but not consistently. The failures turned up in seemingly random locations, and some runs didn’t fail at all. It was sort of like a Heisenbug that would appear in different tests during each test run; it was all very frustrating. Finding a workaround, however, made the problem an interesting one.

The test failures were all timeouts, they occurred seemingly at random, and they also seemed to depend on the environment running the tests. That made it a bit hit-and-miss getting the problem to turn up so that one could debug it. For instance, the tests might run fine on my laptop, but then barf on Jenkins. The subsequent Jenkins run might then pass, so it looks like everything’s ok and it was just some transient issue on the server. But then the problem turns up again. Randomly. Aargh.

Selenium tests can be painfully slow at times [1], so it was fairly arduous trying to work out from the traceback on Jenkins what the issue was. After many tries (e.g. increasing the timeout value, changing the implicit wait value, etc.) I happened to notice that the problems were happening around click events, for instance when I wanted to simulate the user clicking on a button. A bit of searching on Google and StackOverflow led me to a couple of answers which provided sufficient information to solve the problem.

The workaround: click on non-active ancestors

It turns out that Selenium sometimes loses browser focus, and hence a click event just re-focusses the browser rather than clicking on the element one wanted to interact with, for instance a button. An interesting point here is that the element is still visible to Selenium (i.e. calls such as element.is_displayed() evaluate to True), which means that a simple check for visibility isn’t sufficient to solve the issue.

The tips that led me to the workaround were:

The important point turns up in the “webdriver click not firing” post:

A workaround to this is to send a .Click() to another element on the page, so that the browser gets the focus, before attempting to click the link, e.g. it’s parent

Assuming that we know which element we want to click (let’s call the variable element just to be really original), we can translate the above advice into something like the following XPath call to get the parent element:

element.find_element_by_xpath("..").click()

In other words, in a similar manner to traversing a directory structure on a file system, one can use .. to mean the “parent” of the current element.
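A note for readers on a recent Selenium release: as far as I know, the find_element_by_* helpers were removed from the Python bindings in Selenium 4.3, and the same lookup is spelled find_element(By.XPATH, "..") there. The sketch below assumes that API. The helper name click_via_parent is made up for illustration, and the locator strategy is passed as the plain string "xpath" (the value behind By.XPATH) so that the snippet needs no import.

```python
# Minimal sketch, assuming the Selenium 4 style find_element(by, value) API.
# "xpath" is the string value behind selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"


def click_via_parent(element):
    """Click the parent first so the browser regains focus, then click the target.

    `element` is expected to behave like a Selenium WebElement.
    """
    parent = element.find_element(XPATH, "..")  # ".." selects the parent node
    parent.click()   # gives the browser its focus back if it was lost
    element.click()  # the click we actually care about
```

With a real driver this would be called as click_via_parent(button) wherever the test previously called button.click().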

And that’s it, right? Unfortunately not, because there’s a problem: sometimes the parent element is also a “clickable” element and hence clicking on it to regain focus can cause unintended behaviour if the browser actually is in focus. The solution is to click on a non-active parent element within the DOM; this way the click event doesn’t do anything if the browser is in focus, but has the intended effect of putting the browser back into focus if Selenium has managed to lose it.

In the application where this issue appeared, we use Bootstrap for styling and hence there were lots of <div> elements to choose from. The <div> element is handy in this case because it’s normally a non-active element (as long as nobody has attached a click handler to it), so clicking on it from Selenium won’t do anything other than regain browser focus.
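If there’s no obvious wrapper element to aim for, one option is to walk up the DOM until a non-interactive element turns up. This is a rough sketch rather than what the project used: nearest_inactive_ancestor and INTERACTIVE_TAGS are made-up names, the tag list is only an assumption about what counts as “clickable” by default, and the code can’t see JavaScript click handlers attached to an otherwise harmless element. It relies only on find_element and the tag_name property of a WebElement.

```python
XPATH = "xpath"  # string value of selenium's By.XPATH

# Assumed list of tags that react to clicks by default; adjust per application.
INTERACTIVE_TAGS = {"a", "button", "input", "label", "select", "summary", "textarea"}


def nearest_inactive_ancestor(element):
    """Return the closest ancestor whose tag is not in INTERACTIVE_TAGS."""
    node = element.find_element(XPATH, "..")
    while node.tag_name.lower() in INTERACTIVE_TAGS:
        node = node.find_element(XPATH, "..")  # one step further up the tree
    return node
```

The loop stops at <body> or <html> at the latest, since neither is in the list. Each step is a separate round trip to the browser, so for deeply nested elements a single XPath query is likely to be quicker.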

The question then becomes: how to find the appropriate element? The solution I ended up using was:

element.find_element_by_xpath("./ancestor::div[@class='row']").click()

which uses the very handy ancestor axis to find all ancestors of the current element (its parent, its parent’s parent, and so on). The ancestor:: syntax is part of the “Unabbreviated Syntax” in XPath, which helps make the XPath expression more self-explanatory.

Let’s pull the XPath search pattern apart.

  • . (“dot”): The current node, also known as the context element. It’s necessary to specify a starting point for the XPath search and we want to start from the current node in this case.

  • / (forward slash): This is called the path operator and is in some sense very similar to the Unix directory path separator. This operator helps to set up the path through the DOM that we want to search along. The left-hand side of the path operator has to be a node, hence the necessity of using the dot beforehand.

  • ancestor::div: Select all <div>s that are ancestors of the current node.
  • [: Start the filter expression.
  • @class='row': Filter for the <div> with the attribute class='row'. This could also be written as attribute::class='row' [2].
  • ]: End the filter expression.

Putting this into an English sentence, this says:

Starting from the current node, get all ancestor nodes that are <div>s and filter these on nodes that are Bootstrap “rows”.
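One caveat with this filter: @class='row' compares the whole attribute value, so something like <div class="row mt-3"> would not match. A common XPath 1.0 idiom for matching one class name among several pads the normalised attribute with spaces and then uses contains(). The helper below is a sketch with a made-up name (ancestor_with_class); it only builds the expression string and doesn’t talk to Selenium at all.

```python
def ancestor_with_class(tag, css_class):
    """Build an XPath selecting ancestors of type `tag` that carry `css_class`.

    Unlike @class='row', this also matches class="row mt-3".
    """
    return (
        f"./ancestor::{tag}"
        f"[contains(concat(' ', normalize-space(@class), ' '), ' {css_class} ')]"
    )
```

The resulting string can be handed to the same XPath lookup as before, in place of the literal "./ancestor::div[@class='row']".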

By using find_element_by_xpath rather than find_elements_by_xpath (note the ‘s’ in ‘elements’) we get a single matching ancestor element rather than a list of them. Because it’s a <div>, we know that it’s not active. If the browser is in focus, clicking it doesn’t do anything we don’t want it to do; if the browser isn’t in focus, the click focusses the browser for us. Cool! Problem solved, er, worked around!
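Two details might be worth making explicit in a helper. First, appending [1] to the ancestor step asks XPath for the nearest matching ancestor, because positions on the reverse ancestor axis count outwards from the context node. Second, find_elements returns an empty list when nothing matches, whereas find_element raises an exception, so the plural form allows a fallback for elements that don’t live inside a row. The sketch below assumes the Selenium 4 find_elements(by, value) signature; refocus_and_click is a hypothetical name.

```python
XPATH = "xpath"  # string value of selenium's By.XPATH

# [1] on the reverse "ancestor" axis selects the nearest matching ancestor.
NEAREST_ROW = "./ancestor::div[@class='row'][1]"


def refocus_and_click(element, focus_xpath=NEAREST_ROW):
    """Click a non-active ancestor (if one exists), then click `element`.

    Returns True if a focus click was sent, False if no ancestor matched.
    """
    ancestors = element.find_elements(XPATH, focus_xpath)  # [] when nothing matches
    if ancestors:
        ancestors[0].click()  # harmless when focused, re-focusses otherwise
    element.click()
    return bool(ancestors)
```

Returning a flag rather than raising keeps the helper usable for elements outside the grid, at the cost of silently skipping the focus click in that case.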

Wrapping up

I find it interesting how often the solution to a problem takes one on a path of discovery and in the end the solution isn’t as interesting as what one learned along the way. In this case, digging around in the XPath documentation showed just how amazingly powerful this tool can be.

Although what I did here didn’t actually solve the underlying problem (for instance, it’d be good if Selenium were able to keep browser focus), it nevertheless allowed my tests to become reliable again, which is a huge help when developing an application.

Is there anything that I’ve missed? Was this post helpful? How could I make it better? Let me know in the comments section or simply drop me a line via email or ping me on Mastodon.

  1. Although they’re great for running acceptance and functional tests, especially to mimic the workflows users are likely to have when using the application. 

  2. I guess the mnemonic one could use here to remember that the @ symbol means “attribute” would be: @-tribute.