We could say automation is the whole raison d’être for software development. As developers, we seek to employ automation in order to solve problems with more efficiency than before. And we solve problems not only for our clients or employers but also for ourselves. We write scripts and software utilities to automate the packaging and deployment of our applications. We employ plugins and other tools that can automatically check our code for common mistakes and even fix some of them.
Another instance of automation is browser automation. And that’s what this post is all about. If the term doesn’t ring a bell, never fear. The post will do justice to its title and answer the question it poses. And after defining the term, we’ll proceed to show scenarios where browser automation is the right tool for the job. Then, to wrap up the article, we’re going to give you tips so you can get started with browser automation ASAP. That’s what, why, and how in just a single post.
Let’s get started.
Browser Automation: Definition
We’ll start by defining browser automation. We could try something like “‘Browser automation’ means to automate the usage of a web browser” and leave it at that. But that would make for a definition that’s both technically correct and useless unless we define automation. That word is one that we often take for granted, so I think it might be useful to actually take a step back and define it.
Defining Automation
Here’s what Merriam-Webster has to say about automation:
1: the technique of making an apparatus, a process, or a system operate automatically
2: the state of being operated automatically
3: automatically controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human labor
Interesting. Now take a look at Wikipedia’s definition:
Automation is the technology by which a process or procedure is performed with minimum human assistance
Bots Don’t Get Bored
What do those two definitions have in common? At least for me, the point that’s really obvious is that automation seeks to remove human intervention from the equation. And why would we want to do that? While we humans are great at a lot of things, we’re also terrible at a lot of things, especially tasks of a repetitive nature. When performing repetitive, boring tasks, we tend to get…well, bored. Our minds drift out of focus as we enter autopilot mode, and soon we’re making mistakes.
But since we’re a pretty smart species, we came up with a device that’s way better—and faster—than we are at performing repetitive tasks. And of course, you know I’m talking about the computer. With all of that in mind, here comes my upgraded definition for browser automation:
Browser automation is the process of automatically performing operations on a web browser, in order to achieve speed and efficiency levels that wouldn’t be possible with human intervention.
It’s far from being a perfect definition, but it’s already something we can work with.
Browser Automation: Scenarios for Usage
Why would someone want to automate the operation of a web browser? As it turns out, there are plenty of use cases for browser automation, and that’s what this section will cover.
Automatic Verification of Broken Links
It’s frustrating to click on a link only to see the infamous “404 Not Found” message. If you have a site, then you should definitely fix the broken links in it or, alternatively, delete them. But before you go about doing that, you first need to find them. This might not prove too much of a problem if your site has just a handful of pages. But think about a complex database-backed portal, with hundreds or even thousands of pages, mostly dynamically generated!
Now, what before was a minor nuisance becomes a herculean task. And that’s where browser automation fits in. You can employ tools that will automatically go through your site, verifying every link and reporting the ones that are broken.
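To make the idea concrete, here’s a minimal sketch in Python. Note that it’s a simplified stand-in: instead of driving a real browser, it parses a page’s HTML and checks each link with plain HTTP requests from the standard library. The names `LinkCollector`, `http_status`, and `find_broken_links` are ours, made up for illustration, and the sketch assumes that any response with a status of 400 or above (or no response at all) counts as broken.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, and so on


def find_broken_links(page_url, html, get_status=http_status):
    """Return (url, status) pairs for links that look broken."""
    collector = LinkCollector()
    collector.feed(html)
    # Resolve relative links against the page URL and drop duplicates.
    urls = dict.fromkeys(urljoin(page_url, href) for href in collector.hrefs)
    broken = []
    for url in urls:
        if not url.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, and similar schemes
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

The `get_status` parameter is there so you can swap in a different checker, for example one that asks a real browser to open each link. Some servers reject `HEAD` requests, so a sturdier version might fall back to `GET` before declaring a link broken.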
Performance Testing
Performance is a huge concern when talking about software development. In this era of high-speed connections, most users will get frustrated if the site they’re trying to access is even slightly slower than they’d expected. Besides, Google uses page speed as a ranking signal, so slower sites can end up lower on its search result pages.
Browser automation can also help with that. It’s possible to employ browser automation tools to do load and performance testing on your websites. This way, you can not only verify your web app’s performance on the average case but also predict its behavior under the stress of traffic that’s higher than usual.
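As a rough illustration, the sketch below simulates several users hitting a page at the same time and summarizes how long each visit took. It’s a toy under stated assumptions, not a replacement for a dedicated load testing tool: `timed` and `load_test` are names we made up, and `visit` is any function you supply that performs one page load, whether by driving a browser or by issuing an HTTP request.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed(visit):
    """Run visit() once and return how long it took, in seconds."""
    start = time.perf_counter()
    visit()
    return time.perf_counter() - start


def load_test(visit, users=10, visits_per_user=5):
    """Simulate concurrent users and summarize the response times."""
    total = users * visits_per_user
    # One worker thread per simulated user; the visits are shared among them.
    with ThreadPoolExecutor(max_workers=users) as pool:
        durations = sorted(pool.map(lambda _: timed(visit), range(total)))
    return {
        "requests": total,
        "mean": statistics.mean(durations),
        # Simple approximation of the 95th percentile.
        "p95": durations[min(total - 1, int(total * 0.95))],
        "max": durations[-1],
    }
```

Running it once with a handful of users and again with many more gives you a first feel for how response times degrade under heavier traffic.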
Web Data Extraction
When the World Wide Web was invented in 1989, its purpose was to allow researchers to easily share their work. In other words, humans put stuff on the web for other humans to consume. In the decades that followed, we watched a rise in the non-human use of the web.
Browser automation definitely plays a part in this. Web data extraction, also known as web scraping, is another use case for browser automation tools. From data mining to content scraping to product price monitoring, the sky is the limit for the uses of web data extraction.
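Here’s a small sketch of the price monitoring case. It assumes a hypothetical page where each price sits directly inside an element with the CSS class `price` and is written in a format like `$1,299.00`; real sites will differ, so treat the class name and the parsing rule as placeholders. The `PriceExtractor` and `extract_prices` names are ours.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collects the text of every element whose class list includes 'price'."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._inside_price = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._inside_price = True

    def handle_data(self, data):
        if self._inside_price and data.strip():
            self.prices.append(data.strip())

    def handle_endtag(self, tag):
        # Assumption: the price text is not wrapped in further nested tags.
        self._inside_price = False


def extract_prices(html):
    """Return the prices found in html as floats."""
    parser = PriceExtractor()
    parser.feed(html)
    return [float(text.lstrip("$").replace(",", "")) for text in parser.prices]
```

The function only needs an HTML string. For pages that build their content with JavaScript, that string could come from a browser automation tool (Selenium, for instance, exposes the rendered markup as `driver.page_source`).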
Automated Testing
Last but not least, we have what’s probably the poster child of browser automation use cases: automated testing. Yes, we just talked about performance testing and broken link verification, and those things are also automated tests. But here we’re talking about general, end-to-end functional tests. For instance, you might want to check that, when a user enters an invalid username or password at a login screen, an error message is displayed.
Such tests really shine when you can effectively use them as regression tests. If a fixed problem returns in the future, you have a safety net that will warn you. And that safety net is way faster and more efficient than human testers, at a fraction of the cost.
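The login check described above could be sketched with Selenium’s Python bindings roughly as follows. Everything specific to the page is an assumption: the element IDs `username` and `password`, the `button[type='submit']` selector, and the `.error-message` class are hypothetical and would have to match your own markup. The function and constant names are ours as well.

```python
# Selenium's locator strategies are plain strings: By.ID == "id" and
# By.CSS_SELECTOR == "css selector". Using the strings directly keeps this
# sketch importable even on a machine where Selenium isn't installed.
BY_ID = "id"
BY_CSS = "css selector"


def invalid_login_shows_error(driver, login_url):
    """Submit bad credentials and report whether an error message shows up."""
    driver.get(login_url)
    # Hypothetical locators: adjust them to the page under test.
    driver.find_element(BY_ID, "username").send_keys("no-such-user")
    driver.find_element(BY_ID, "password").send_keys("wrong-password")
    driver.find_element(BY_CSS, "button[type='submit']").click()
    message = driver.find_element(BY_CSS, ".error-message")
    return message.is_displayed() and message.text.strip() != ""


def run_in_chrome(login_url):
    """Run the check in a local Chrome (needs `pip install selenium`)."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.implicitly_wait(5)  # give the error message time to appear
    try:
        return invalid_login_shows_error(driver, login_url)
    finally:
        driver.quit()  # always close the browser, even if the check fails
```

If the error element never appears, `find_element` raises an exception instead of returning, which a test runner would report as a failure. Because the check only needs an object that behaves like a driver, you can run the same function against any browser Selenium supports.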
Parallel Testing
When you consider the number of existing browsers and operating systems, it becomes clear that testing across all of the possible combinations is an incredibly challenging task. Luckily, browser automation tools like Selenium enable users to perform what we call parallel testing or grid testing.
As the name suggests, this ability consists of running a single test case, simultaneously, across a large number of devices and operating systems. By using grid testing, you can ensure your app behaves as expected across at least the major browsers and platforms, reducing the likelihood of delivering a poor experience to your users.
Parallel testing is a wonderful capability that would be amazingly impractical and expensive, if not downright impossible, to pull off without the help of browser automation.
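Here’s a hedged sketch of what that can look like in Python. The `run_everywhere` and `grid_driver` names are ours, and the grid address `http://localhost:4444` is an assumption about where a Selenium Grid might be listening; the sketch also treats any exception raised by a check as a failure, which is a simplification.

```python
from concurrent.futures import ThreadPoolExecutor


def run_everywhere(check, browsers, make_driver):
    """Run check(driver) once per browser, all at the same time."""

    def run_one(browser):
        driver = make_driver(browser)
        try:
            return browser, bool(check(driver))
        except Exception:
            return browser, False  # simplification: any error counts as a failure
        finally:
            driver.quit()  # always release the browser session

    with ThreadPoolExecutor(max_workers=max(1, len(browsers))) as pool:
        return dict(pool.map(run_one, browsers))


def grid_driver(browser, grid_url="http://localhost:4444"):
    """Open a session on a Selenium Grid (needs `pip install selenium`)."""
    from selenium import webdriver

    options = {
        "chrome": webdriver.ChromeOptions,
        "firefox": webdriver.FirefoxOptions,
        "edge": webdriver.EdgeOptions,
    }[browser]()
    return webdriver.Remote(command_executor=grid_url, options=options)
```

With a grid running, a call such as `run_everywhere(check, ["chrome", "firefox", "edge"], grid_driver)` would return a dictionary mapping each browser name to `True` or `False`, where `check` is any function that takes a driver and says whether the page behaved as expected.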
How to Get Started With Browser Automation
Learning browser automation can seem like a daunting task. It’s an enormous topic, and there’s a lot to know. But it’s no different from any other area in tech. Approach it the way you would approach learning a new programming language or framework: by doing it.
First, think of at least one use case for browser automation in your current organization. We’ve just shown you some, and I’m sure you can think of many more. Some people call this “scratching your own itch,” and it’s an effective way of motivating yourself to learn something.
As soon as you have a small, discrete problem you think you can solve with browser automation, start looking around for tutorials on how to get started with some of the available tools. When you get stuck, look for help in the documentation of the tool you’re trying to use. You can also search for help on Stack Overflow under the “browser automation” tag. And of course, there’s always Google.
Put a minimum viable example of browser automation in place. As soon as you get something that works, no matter how simple it is, that’s a milestone. You can use it as a foundation upon which to build more sophisticated and complex approaches.
Browser Automation and Selenium
As we’ve mentioned before, Selenium is one of the most popular tools when it comes to browser automation. It’s not a silver bullet, though. Alongside its many pros (for example, it’s widely used and has a large community), Selenium also has important cons (for example, it can lead to brittle tests).
Make no mistake, though: Selenium is a valuable tool that any professional who performs software testing should have in their toolbelt. So, we invite you to check out some of our previous posts in which we cover specific aspects of using Selenium:
- How To Wait For a Page To Load In Selenium
- How to Find An Element By Text In Selenium
- Selenium SendKeys: A Detailed Usage Guide With Examples
- How to Take a Screenshot In Selenium
Where to Go From Here?
Today’s post was meant to give you a quick primer on browser automation. We started by defining the term, then proceeded to show some common use cases for the technique. Finally, we gave you tips on how to get started.
As I like to say when writing introductory articles like this one, this was just the tip of the iceberg. There’s much more to browser automation than what could be covered by a single blog post. Where do you go from here then?
There’s no silver bullet: the answer is to keep studying and practicing. Continue to evolve your first minimal test suite and learn from it. You should also keep an eye out for what’s happening in the field. There are interesting developments, such as the use of machine learning to help developers with the creation, running, and maintenance of test cases.
Additionally, stay tuned to this blog for more automation-related content. Thanks for reading and see you next time!
This post was written by Carlos Schults. Carlos is a .NET software developer with experience in both desktop and web development, and he’s now trying his hand at mobile. He has a passion for writing clean and concise code, and he’s interested in practices that help you improve app health, such as code review, automated testing, and continuous build.