Monitoring synthetic web transactions to reproduce and track realistic workflows can be an integral part of evaluating website changes, benchmarking competitors and ensuring site performance. However, writing scripts to replicate these transactions can be tricky and terribly frustrating. If you use a recorder extension, you may need to patch up recorded scripts with additional commands. Once you have the Web Transaction test running, it may even break again if a web page you're monitoring changes. We've recently written an introduction to monitoring synthetic web transactions with some basic tips for recording and revising Selenium scripts. Keep reading for more advanced tips on writing robust scripts and avoiding most transaction script-related grief, including pointers related to:
- Dynamic IDs
- Page components like iframes, popups and modal windows
- Switching windows
- Avoiding bot detection
- Using the sleep command
Before Starting
Before we dive into any advanced tips, it's important to remember the fundamental difference in the timing of transactions between recording scripts with a recorder extension like the ThousandEyes Recorder and running scripts on an Enterprise or Cloud Agent. When you record scripts with a recorder extension, your interactions with the page are decidedly human—you navigate your mouse pointer to various buttons, letting time elapse between clicks. Sometimes the recorder won't catch every action required to reproduce the transaction. In contrast, Selenium scripts run from the agent are fired off almost instantaneously, sometimes before the target is even present or without the required mouse movement if the script is incomplete. Keeping in mind these differences in recording and testing environments will help give you an intuition for revising your transaction scripts so that they reflect your intended interactions.
Also make sure that you record transactions in incognito mode. This will most closely match how agents run transactions, since agents clear their caches before each test run. To enable the recorder extension in incognito windows, navigate to chrome://extensions in your Chrome browser and check 'Allow in incognito' for the ThousandEyes Recorder.
Dynamic IDs
One of the most common obstacles to a working Transaction test is identifying elements with dynamic IDs from frameworks like ExtJS. Element IDs can change in a number of different ways, and short of contacting the web development team, you'll need to use trial and error to fully understand how the IDs are changing. Wherever possible, use XPath syntax that identifies targets by matching on unique static attributes. Using relative references rather than absolute paths reduces the likelihood of your transactions breaking when the page changes.
Below, we'll explore a few of the types of dynamic IDs our Customer Success team has most frequently encountered.
IDs that change with mouseovers
Some elements' IDs, classes or other unique attributes may change when a mouse pointer hovers over them. If you use the ThousandEyes Recorder to click on such an element, it may identify the element using, for instance, an ID that changes with mouseovers. When you run the test in an agent, it will fail because there is no pointer hovering above the element to change the ID to match the one in your script. In the example from Salesforce below, hovering over the magnifying glass icon causes its class to change from lookupIcon to lookupIconOn.
As a result, the following recorded script will fail in a transaction test because there is no pointer present to change the element's identifier to match the one recorded.
There are a few possible solutions to this problem. The first would be to simply revise the script so that it reflects the element's ID when a mouse is not hovering over it, as would be the case when the agent is executing the script. (We've also added a wait condition that ensures the icon is present before clicking.)
You could also try scripting in a mouseOver command before clicking so that the ID changes accordingly. Alternatively, you can change the way you identify the element so that the identifier is something that doesn't change. For example, if the ID of the element changes with mouseovers, you may be able to use its class as a static identifier instead.
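As a rough sketch of the first two options, the Python snippet below models a script as a list of command/target/value rows. The Step tuple, the XPath targets and the use of waitForElementPresent as the wait condition are our assumptions for illustration, not the recorder's actual export format; only the class names lookupIcon and lookupIconOn come from the Salesforce example.

```python
from collections import namedtuple

# Hypothetical in-memory model of one script row: command, target, value.
Step = namedtuple("Step", ["command", "target", "value"])

# Class names from the Salesforce example; the XPath shape is an assumption.
RESTING = "//*[@class='lookupIcon']"
HOVERED = "//*[@class='lookupIconOn']"


def click_resting_state(target=RESTING):
    """Option 1: target the element as it appears with no pointer over it."""
    return [
        Step("waitForElementPresent", target, ""),
        Step("click", target, ""),
    ]


def hover_then_click(resting=RESTING, hovered=HOVERED):
    """Option 2: hover first so that the hovered identifier exists."""
    return [
        Step("waitForElementPresent", resting, ""),
        Step("mouseOver", resting, ""),
        Step("click", hovered, ""),
    ]


for step in click_resting_state() + hover_then_click():
    print(step.command, step.target)
```

The first variant is usually the simpler one to maintain, since it does not depend on the hover taking effect before the click is issued.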
IDs that change with multiple refreshes, logins, etc.
IDs can also change in more unpredictable ways—with page refreshes and user logins, for instance. Unfortunately there's no all-encompassing solution here; the best fix will depend on the web page and how the ID changes. It will be important to understand what and how identifiers change, and which ones don't, as you troubleshoot your transaction tests.
On Yahoo's homepage, the element that reads "San Francisco, CA" has a dynamic ID that changes with every refresh. For this element, a recorder might give a target of "//*[@id="yui_3_18_0_4_1456425760331_1139"]", but since the ID changes with each refresh—another refresh gives a different ID of "yui_3_18_0_4_1456426367980_1206"—a transaction test scripted with the original target would immediately break.
The best solution is to find an alternative static identifier, like the element's class or its text, that doesn't change across events like page refreshes. In this case, the element's class remains static; our revised script is below.
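To see why a class-based target survives a refresh, here is a small sketch using Python's standard-library XPath subset. The two IDs are the ones quoted above; the class name "weather-location" and the surrounding markup are hypothetical stand-ins, since the real page is far more complex.

```python
import xml.etree.ElementTree as ET

# Two simplified snapshots of the same page taken on different refreshes.
# The class name and markup are hypothetical; only the IDs come from the text.
REFRESH_1 = """<html><body><div>
  <span id="yui_3_18_0_4_1456425760331_1139" class="weather-location">San Francisco, CA</span>
</div></body></html>"""
REFRESH_2 = REFRESH_1.replace(
    "yui_3_18_0_4_1456425760331_1139", "yui_3_18_0_4_1456426367980_1206"
)

# ElementTree needs a leading "." on these paths; a transaction script
# would use the same expressions starting with "//".
BY_ID = ".//*[@id='yui_3_18_0_4_1456425760331_1139']"  # recorded target
BY_CLASS = ".//*[@class='weather-location']"           # static target


def matches(page_source, xpath):
    """Return the text of every element the XPath selects in the page."""
    root = ET.fromstring(page_source)
    return [element.text for element in root.findall(xpath)]


for name, page in (("refresh 1", REFRESH_1), ("refresh 2", REFRESH_2)):
    print(name, "| by id:", matches(page, BY_ID), "| by class:", matches(page, BY_CLASS))
```

The ID-based target stops matching after the refresh, while the class-based target keeps selecting the same element.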
If you can't find a static identifier, the next-best solution is to essentially script directions for navigating through the hierarchy of the web page to arrive at the desired element. Starting at the root of the DOM, you'd use XPath to navigate through the relevant "branches," using their places in the HTML hierarchy to get to the target. This method will continue to work only as long as the hierarchy of the web page stays the same, so it's not a good choice for pages that change frequently.
Though this method is inferior to using a static identifier as shown above, see below for an example of navigating through the HTML hierarchy to identify the same element on Yahoo's homepage.
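The fragility of hierarchy-based targets can be illustrated with a short sketch. The page structure below is a hypothetical, heavily simplified stand-in (Yahoo's real markup differs), and the path is written in the XPath subset that Python's standard library supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical page structure, used only to illustrate positional paths.
PAGE = """<html><body>
  <div><p>Header</p></div>
  <div>
    <ul>
      <li><span id="yui_3_18_0_4_1456425760331_1139">San Francisco, CA</span></li>
      <li><span>Other</span></li>
    </ul>
  </div>
</body></html>"""

# Second <div> under <body>, then the first <li>, then its <span>.
# In a transaction script the equivalent target would start at /html/body/...
POSITIONAL = "./body/div[2]/ul/li[1]/span"


def locate(page_source, xpath):
    """Return the text of the first match, or None if nothing matches."""
    root = ET.fromstring(page_source)  # root is the <html> element
    found = root.find(xpath)
    return None if found is None else found.text


print(locate(PAGE, POSITIONAL))

# Wrapping the list in one extra element is enough to break the path.
CHANGED = PAGE.replace("<ul>", "<section><ul>").replace("</ul>", "</ul></section>")
print(locate(CHANGED, POSITIONAL))
```

The path ignores the changing ID entirely, but a single added wrapper element makes it stop matching, which is why we treat this approach as a fallback.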
Navigating Unexpected Page Components
Sometimes you'll encounter objects on a web page that you don't expect. See below for several examples of objects you may have to navigate past in your transaction scripts.
iframes
An iframe is an inline frame that's used to embed another document within the current HTML document, and we've encountered them in Salesforce and VMware applications. If you want to script an interaction with an element within an iframe, you'll first have to "enter" that particular frame using the selectFrame command and then perform the action. Otherwise, the agent won't be able to locate the desired element. To move back to the main frame, script a selectFrame command with target, "relative=top".
Our example comes from Hernando County's Public Inquiry System. As we can see in Chrome DevTools in Figure 3, the 'Continue' button is contained within an iframe with ID, "AppFrameNoFramework".
The script must first select the relevant iframe before clicking the button. Note that we include wait conditions to ensure that the element is present before any action is taken on it.
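The order of steps can be sketched as follows, again modeling the script as command/target/value rows in Python. The frame ID and the "relative=top" target come from the text above; the Step tuple, the "id=" locator prefix and the button locator are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical in-memory model of one script row: command, target, value.
Step = namedtuple("Step", ["command", "target", "value"])


def in_frame(frame_target, steps):
    """Wrap steps so they run inside a frame, then return to the top frame."""
    return (
        [Step("selectFrame", frame_target, "")]
        + list(steps)
        + [Step("selectFrame", "relative=top", "")]
    )


FRAME = "id=AppFrameNoFramework"               # ID from the example; prefix assumed
CONTINUE_BUTTON = "//input[@value='Continue']"  # hypothetical button locator

script = [Step("waitForElementPresent", FRAME, "")] + in_frame(
    FRAME,
    [
        Step("waitForElementPresent", CONTINUE_BUTTON, ""),
        Step("click", CONTINUE_BUTTON, ""),
    ],
)

for step in script:
    print(step.command, step.target)
```

Wrapping the inner steps in a helper makes it harder to forget the final switch back to the top frame, which later steps on the main page depend on.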
Popups
For many websites, including retail and flight booking sites like CheapOair, you'll encounter a popup, perhaps a signup form, on your first visit. Because the agents clear their caches before each test run, they'll encounter the popup every time. Enterprise agents have the option to run tests without clearing their caches so they'll encounter the popup only on their first visit, but otherwise you'll need to close the popup to continue your transaction successfully. You'll need to find the ID for the close icon and script in a click command.
Below, see the script that closes this popup.
Modal windows and other obstacles
A modal window is a child window that, when present, doesn't allow interactions with the parent window. If the modal requires an interaction to close it, like clicking an OK button, simply script the required action to continue with your transaction, just as we showed for popups.
However, some modals (like loading windows) and other obstacles (like spinning wheels) don't require interaction but won't allow you to interact with the parent window until they disappear on their own. The best solution to avoid timing out your transaction test is to script a condition with an appropriate timeout value to wait for your next target to be present before acting on it. However, this solution may not work if your target is present before you can act on it, for instance if a button appears but still isn't clickable. In this case, you can script in a sleep command, starting with a generous pause (on the order of 3 seconds) and adjusting as necessary based on later test runs. Make sure to set a value that is appropriate for all of the agents you're testing from.
In the example from Chase's login site below, our desired target, the 'Pay bills' tab, is present but not yet clickable as other elements on the page finish loading. We need to allow for a pause in the script to wait for the spinning wheel to disappear.
Below, find the script that clicks the login button and goes through the necessary wait and sleep conditions to ensure that the target to be clicked, the 'Pay bills' tab, is present and clickable.
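The logic behind a wait condition followed by a fixed pause can be sketched in plain Python. This is not how the agent implements its wait commands; wait_until and FakePage are hypothetical names, and the page is simulated so the example runs without a browser.

```python
import time


def wait_until(condition, timeout_s=10.0, poll_s=0.25):
    """Poll condition() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


class FakePage:
    """Stand-in for a page whose target appears after a few polls."""

    def __init__(self, polls_until_present):
        self.remaining = polls_until_present

    def target_present(self):
        self.remaining -= 1
        return self.remaining <= 0


page = FakePage(polls_until_present=3)
if wait_until(page.target_present, timeout_s=2.0, poll_s=0.01):
    # Present is not the same as clickable, so add a fixed pause.
    time.sleep(0.05)  # shortened here; a real script would start near 3 seconds
    print("target present; clicking")
else:
    print("timed out waiting for target")
```

The condition-based wait returns as soon as the target shows up, so only the fixed pause adds unconditional time to the transaction. The same helper covers the inverse case of waiting for an obstacle to disappear by passing a condition that returns True once the obstacle is gone.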
As an alternative solution, you could also set a condition to wait for the modal window (or some other obstacle) to not be present, though this fix may be less reliable.
Switching Windows
Some web pages are quite complex and may require using a more obscure Selenium command. For example, clicking a link or button sometimes opens an entirely different browser window. To continue the workflow and interact with elements in the new window, you'll need to script a command to switch to that window. Sometimes, this switch isn't captured by a typical recorder extension.
To find the ID of the new child window, navigate to the Console in the desired window; the command "window.name" will return the ID of the window you're in. Use this ID as the target of a selectWindow command to navigate to this window. When you're ready to go back to the original parent window, simply script a selectWindow command with target, "null".
The example below from Apigee opens a new window to authorize permissions to use a Twitter account.
The script below selects the child window for authorizing Twitter, types the username and password in the appropriate fields, clicks the "Authorize app" button, and finally selects the original parent window.
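The sequence described above can be sketched as command/target/value rows in Python. The selectWindow command and the "null" target come from the text; the Step tuple, the window name, the field locators and the credential placeholders are all hypothetical, so substitute the value returned by window.name and the locators from the actual page.

```python
from collections import namedtuple

# Hypothetical in-memory model of one script row: command, target, value.
Step = namedtuple("Step", ["command", "target", "value"])


def in_child_window(window_name, steps):
    """Switch to a child window, run steps, then return to the parent."""
    return (
        [Step("selectWindow", window_name, "")]
        + list(steps)
        + [Step("selectWindow", "null", "")]
    )


# Every name and locator below is a placeholder, not taken from the real page.
script = in_child_window(
    "twitter_auth_window",
    [
        Step("waitForElementPresent", "id=username-field", ""),
        Step("type", "id=username-field", "EXAMPLE_USERNAME"),
        Step("type", "id=password-field", "EXAMPLE_PASSWORD"),
        Step("click", "id=authorize-button", ""),
    ],
)

for step in script:
    print(step.command, step.target)
```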
Avoiding Bot Detection
Many sophisticated login pages contain authentication mechanisms that try to heuristically detect whether the entity logging in is a human or a bot.
There are a variety of heuristics—one of the more common techniques is to gray out the login button, requiring a certain mouse movement, like a mouseover, to make the login button clickable. In this case, you would need to script a mouse movement to hover over the login button before clicking. Sometimes a different combination of commands is needed: hover and click, focus and click, and two successive clicks with a two- or three-second pause in between are some common sequences that may work.
On Microsoft's login page, you'll notice that the login button lights up if you hover over it. This suggests that some sort of mouse action is required before the button is actually clickable.
We found that scripting two successive clicks doesn't successfully click the button, likely because too little time elapses between each click. However, scripting two clicks interspersed with a three-second pause does work.
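Because finding the right combination is a matter of trial and error, it can help to write out the candidate sequences side by side. The sketch below does so in Python; the Step tuple and the button locator are hypothetical, the focus command is assumed to be available, and the sleep row follows the convention described in this post of putting the duration in milliseconds in the value field.

```python
from collections import namedtuple

# Hypothetical in-memory model of one script row: command, target, value.
Step = namedtuple("Step", ["command", "target", "value"])


def candidate_sequences(target, pause_ms=3000):
    """Return the common command sequences to try against a guarded button."""
    return {
        "hover_then_click": [
            Step("mouseOver", target, ""),
            Step("click", target, ""),
        ],
        "focus_then_click": [
            Step("focus", target, ""),
            Step("click", target, ""),
        ],
        "click_pause_click": [
            Step("click", target, ""),
            Step("sleep", "", str(pause_ms)),
            Step("click", target, ""),
        ],
    }


# "id=login-button" is a placeholder locator, not Microsoft's actual markup.
for name, steps in candidate_sequences("id=login-button").items():
    print(name, [step.command for step in steps])
```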
Another heuristic used by sites is to measure the amount of time elapsed between the page load of a login screen and the point in time when the username and password fields have been filled and the login button has been clicked. A bot would generally click much faster than a human, so login attempts with too-quick form fill outs and clicks are blocked. It's a less popular heuristic because it can potentially block legitimate users if they click too quickly. To avoid being detected with this heuristic, script in a command like a mouseover before clicking, which will almost always have a sufficient delay. If that doesn't work, use a sleep command to wait for a fixed amount of time before clicking.
In most cases, it will be difficult for you to know what heuristics are being used, so it's a good idea to get familiar with the most common heuristics and solve issues using trial and error.
Using the Sleep Command
Often there are better solutions than using sleep commands, which may unnecessarily increase total transaction time. However, there are a few exceptions to the rule. We've already mentioned pausing to wait for modals to disappear and to evade bot detection; below, we'll dive into a few more special cases where pauses are necessary.
A waitForPageToLoad command will wait only up until the page load event is triggered. However, if components are loading asynchronously, there may still be some objects remaining that haven't yet loaded. Adding a sleep command after the waitForPageToLoad command will allot time toward loading any remaining objects that you may want to script interactions with.
In addition, sites can behave in a variety of other unexpected and idiosyncratic ways that require the use of a sleep command. Page components may load or behave erratically, certain objects may be present but obscured for a period of time—there are a number of anomalous situations in which you may have to resort to using a sleep command. Fortunately, waiting for a fixed period of time is now very easy to script: the command is "sleep", and the value is the length of wait time in milliseconds.
The script below will wait for 3000 ms, or 3 seconds.
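As a minimal sketch, the Python snippet below places a 3000 ms sleep after a waitForPageToLoad step and adds up the fixed wait time a script contains. The Step tuple, the helper names and the link locator are hypothetical; the empty target on waitForPageToLoad is also an assumption.

```python
from collections import namedtuple

# Hypothetical in-memory model of one script row: command, target, value.
Step = namedtuple("Step", ["command", "target", "value"])


def sleep_step(milliseconds):
    """Build a sleep row; the value is the wait time in milliseconds."""
    if not isinstance(milliseconds, int) or milliseconds <= 0:
        raise ValueError("sleep duration must be a positive integer (ms)")
    return Step("sleep", "", str(milliseconds))


def total_sleep_ms(script):
    """Sum the fixed wait time that sleep rows add to a transaction."""
    return sum(int(step.value) for step in script if step.command == "sleep")


script = [
    Step("waitForPageToLoad", "", ""),
    sleep_step(3000),  # give asynchronously loaded components time to appear
    Step("click", "id=report-link", ""),  # placeholder locator
]

print(total_sleep_ms(script), "ms of fixed waiting")
```

Tracking the total makes the cost of each added pause visible, since every sleep lengthens the transaction on every run regardless of how quickly the page actually loads.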
Selenium's Limitations
Finally, let's talk about what Selenium scripts can't do. Sometimes web pages are just too complex or unpredictable. Selenium scripts are great at handling static situations, but once you encounter erratic behavior on a web page, it becomes impossible to write a transaction script that won't break at least some of the time. For example, different popups and ads may appear unpredictably only some of the time, perhaps based on geographic location, time or any number of other factors. In cases like this where conditionals are needed, you'll need to bring in JavaScript.
In addition, many authentication technologies are impossible to circumvent with transaction scripts, including two-factor, single sign-on (SSO) and CAPTCHA authentication. One important exception is authentication via validation of IP addresses, an authentication mechanism used by Salesforce. To get around this authentication step, you'll need to whitelist the IP addresses of your Cloud and Enterprise Agents in your Salesforce account. For more detailed instructions, see the Knowledge Base article, Web Transaction Tests for Salesforce.com.
And as a last reminder, you can consult our related Knowledge Base article to see the list of all supported Selenium commands.
Wrapping Up
In this post, we've summarized the most advanced and most commonly used tips from our Customer Success team. You're now well equipped to record and write scripts for some of the most complex web pages and transactions. Don't be afraid to go through many iterations of the trial-and-error process to solve problems in your scripts, and if you get stuck, contact Customer Success by clicking the help menu in the upper right corner. Special thanks to Jay Kothari for imparting all of his transaction script-related wisdom for this post.