Selenium API and Architecture – All Details with Diagrams and MindMaps

Selenium API is a critical part of Selenium WebDriver test automation. Selenium test automation rests on four basic concepts: Selenium Navigation, Selenium Find Elements, Selenium Actions, and Selenium Wait.

Selenium RC, WebDriver’s predecessor, did not group its commands this way, but the WebDriver API can be organized into these categories. In this post, I will explain the Selenium API and its categories. Let’s get started!

Selenium 4 Architecture 

Selenium 4 comes as a suite made up of three important parts:

  • Selenium IDE: A “record and playback” tool for debugging your tests and creating small test automation suites.
  • Selenium WebDriver: The automation API of the Selenium project. We automate web applications through the WebDriver object.
  • Selenium Grid: The tool to use if you want to run your tests in parallel across several browser types.

The Selenium project’s Maven repository, which you can use in your Java projects, is here: https://mvnrepository.com/artifact/org.seleniumhq.selenium

Let’s learn the new features of these important sub-projects. 

Selenium IDE

Selenium IDE has been re-implemented under Dave Haeffner, the lead developer of the project. The new version is called Selenium IDE TNG, which stands for “The Next Generation”.

The old Selenium IDE worked only with Firefox, but the new one has both Chrome and Firefox extensions. Hurray! :) The new Selenium IDE has also been developed as an Electron app, and it supports the use of the debugging protocol.

We can add extra power to Selenium IDE by extending it with plugins and features such as Code Export, Control Flows, Backup Element Selectors, etc. You can also export the generated code and use it in your projects.

Another feature of Selenium IDE is its support for conditional statements such as if, else, etc. You can also write loops with commands such as while, times, and forEach. For more information, I suggest you check the official Selenium IDE page.

Selenium Grid

Selenium Grid helps us scale our test automation runs across multiple browsers and operating systems. We declare the browser types and operating systems we need, and the Grid distributes our tests according to those requirements.

Selenium 4 also comes with a brand-new Selenium Grid that supports newer technologies such as containerization (Docker) and container orchestration (Kubernetes). In brief, Docker is a container engine that lets us run applications in containers, and Kubernetes orchestrates those containers. The new Selenium Grid architecture is built around these four processes:

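  • Router: The entry point of the Grid. It receives incoming requests and forwards them to the right component.
  • Distributor: Knows all the nodes and assigns each new session to a suitable node.
  • Session Map: Maps each session ID to the node that runs that session.
  • Node: The place where the browser sessions for our tests actually run.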

Previously, as you may know, we had a hub and nodes. ;) The new Selenium Grid operates like this:

First, a new session request arrives at the Router, which passes it on to the Distributor. The Distributor knows all the nodes, so it selects a suitable one, and the session is created on that node. The session ID and the address of the node are recorded in the Session Map, and the session details are returned to the client. For every later command in that session, the Router looks up the node in the Session Map and forwards the request to it. In earlier Grid versions, the hub combined the Router, Distributor, and Session Map roles, but these are now separate processes. We can also debug failures more easily thanks to the trace IDs attached to requests.
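The flow above can be sketched as a toy in-memory model. This is only an illustration of how the roles fit together, not how Selenium Grid is implemented: every class and method name below (GridSketch, Router, route, and so on) and the node addresses are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy in-memory model of the Grid roles described above.
// All names are hypothetical; these are not Selenium's real classes.
class GridSketch {

    /** Node: hosts browser sessions of one browser type. */
    static class Node {
        final String uri;
        final String browser;
        int freeSlots;

        Node(String uri, String browser, int freeSlots) {
            this.uri = uri;
            this.browser = browser;
            this.freeSlots = freeSlots;
        }
    }

    /** Session Map: remembers which node owns which session ID. */
    static class SessionMap {
        private final Map<String, String> nodeUriBySessionId = new HashMap<>();

        void add(String sessionId, String nodeUri) {
            nodeUriBySessionId.put(sessionId, nodeUri);
        }

        String nodeFor(String sessionId) {
            return nodeUriBySessionId.get(sessionId);
        }
    }

    /** Distributor: knows all nodes and places each new session on one of them. */
    static class Distributor {
        private final List<Node> nodes = new ArrayList<>();
        private final SessionMap sessionMap;

        Distributor(SessionMap sessionMap) {
            this.sessionMap = sessionMap;
        }

        void register(Node node) {
            nodes.add(node);
        }

        String newSession(String browser) {
            for (Node node : nodes) {
                if (node.browser.equals(browser) && node.freeSlots > 0) {
                    node.freeSlots--;
                    String sessionId = UUID.randomUUID().toString();
                    sessionMap.add(sessionId, node.uri);
                    return sessionId;
                }
            }
            throw new IllegalStateException("No free node for browser: " + browser);
        }
    }

    /** Router: the single entry point that clients talk to. */
    static class Router {
        private final Distributor distributor;
        private final SessionMap sessionMap;

        Router(Distributor distributor, SessionMap sessionMap) {
            this.distributor = distributor;
            this.sessionMap = sessionMap;
        }

        /** New session requests go to the Distributor. */
        String newSession(String browser) {
            return distributor.newSession(browser);
        }

        /** Commands for an existing session are resolved through the Session Map. */
        String route(String sessionId, String path) {
            String nodeUri = sessionMap.nodeFor(sessionId);
            if (nodeUri == null) {
                throw new IllegalArgumentException("Unknown session: " + sessionId);
            }
            return nodeUri + "/session/" + sessionId + path;
        }
    }

    public static void main(String[] args) {
        SessionMap sessionMap = new SessionMap();
        Distributor distributor = new Distributor(sessionMap);
        distributor.register(new Node("http://node-1:5555", "chrome", 1));
        distributor.register(new Node("http://node-2:5555", "firefox", 1));
        Router router = new Router(distributor, sessionMap);

        String sessionId = router.newSession("firefox");
        // Prints the node URL that a "navigate" command would be forwarded to.
        System.out.println(router.route(sessionId, "/url"));
    }
}
```

Keeping the Session Map separate from the Distributor is the point of the sketch: once a session exists, routing a command only needs a lookup, so the Distributor is not involved in every request.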

Selenium WebDriver

The main milestone of the Selenium 4 project is that WebDriver is now W3C compliant. At a high level, the JSON Wire protocol and W3C protocol architectures look as follows:

JSON Wire architecture:

WebDriver W3C architecture:


Copyright Note: Images can be used by referencing the swtestacademy website. -Onur Baskirt

As we see in the architectural views of Selenium 2 and Selenium 4, the JSON Wire protocol has been removed in the W3C architecture. Commands still travel as HTTP requests and responses with JSON bodies, but the client libraries and the browser drivers now speak the same standardized W3C WebDriver protocol, so requests no longer need to be encoded and decoded between two protocols on the way. As a result, tests can run faster, more consistently, and with better compatibility across browser types.
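One visible difference between the two protocols is the body of the "new session" request, which is sent as an HTTP POST to /session in both cases. The sketch below builds both bodies as plain strings so they can be compared side by side. The class and method names are hypothetical, and real clients send more fields than shown here.

```java
// Minimal sketch: the "new session" request body in each protocol.
// NewSessionPayloads and its methods are hypothetical helper names.
class NewSessionPayloads {

    /** Legacy JSON Wire Protocol: capabilities go under "desiredCapabilities". */
    static String jsonWire(String browserName) {
        return "{\"desiredCapabilities\":{\"browserName\":\"" + browserName + "\"}}";
    }

    /** W3C WebDriver: capabilities go under "capabilities" -> "alwaysMatch". */
    static String w3c(String browserName) {
        return "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"" + browserName + "\"}}}";
    }

    public static void main(String[] args) {
        // Note: browserName is not JSON-escaped here; fine for simple names only.
        System.out.println("JSON Wire: " + jsonWire("chrome"));
        System.out.println("W3C:       " + w3c("chrome"));
    }
}
```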

For other Selenium 4 features, you can check our specific articles.

https://www.swtestacademy.com/selenium-4-sample-codes-for-new-features/

https://www.swtestacademy.com/selenium-4-grid-standalone-tutorial/

https://www.swtestacademy.com/selenium-relative-locators/

https://www.swtestacademy.com/selenium-find-element/

Selenium 2 WebDriver API

Reference link: https://www.mindmeister.com/280141421/selenium-2-webdriver-commands

The link above shows the Selenium WebDriver API and its commands at a glance. It is a perfect mind map for testers who use WebDriver in web automation projects. In the picture below, you can see that the Selenium WebDriver commands are divided into five main categories.

[Image: Selenium WebDriver API mind map]

Selenium Navigate to URL

Selenium Navigation means opening a browser, moving from one page to another, going back and forward, etc.
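A minimal sketch of the navigation commands with the Selenium 4 Java bindings could look like the following. It assumes the selenium-java dependency is on the classpath and Chrome is installed locally; the URLs and the class name are only examples.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

// Sketch of the navigation commands; URLs are illustrative only.
class NavigationExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // opens a browser
        try {
            driver.get("https://www.selenium.dev/");                               // open a page
            driver.navigate().to("https://www.selenium.dev/documentation/");      // move to another page
            driver.navigate().back();                                              // browser "back"
            driver.navigate().forward();                                           // browser "forward"
            driver.navigate().refresh();                                           // reload the current page
            System.out.println("Now at: " + driver.getCurrentUrl());
        } finally {
            driver.quit(); // always close the browser, even if a step fails
        }
    }
}
```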


Selenium Find Element

Selenium Find Element, or in other terms Interrogation, means getting information about the website and its elements. For example: reading the page title, reading the URL, getting an element’s text, getting the options of a dropdown, finding an element’s location, getting an element’s size, etc.
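The interrogation commands listed above might be used as in the sketch below. It assumes the selenium-java dependency and a local Chrome. The page and the locators (the h1 heading and the "my-select" dropdown) are assumptions based on the Selenium project's demo web form, so swap in your own page and locators.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.Select;

// Sketch of the interrogation commands; page and locators are assumptions.
class InterrogationExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://www.selenium.dev/selenium/web/web-form.html");

            // Page-level information
            System.out.println("Title: " + driver.getTitle());
            System.out.println("URL:   " + driver.getCurrentUrl());

            // Element-level information
            WebElement heading = driver.findElement(By.tagName("h1"));
            System.out.println("Text:     " + heading.getText());
            System.out.println("Location: " + heading.getLocation());
            System.out.println("Size:     " + heading.getSize());

            // Options of a <select> dropdown
            Select dropdown = new Select(driver.findElement(By.name("my-select")));
            for (WebElement option : dropdown.getOptions()) {
                System.out.println("Option: " + option.getText());
            }
        } finally {
            driver.quit();
        }
    }
}
```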


Selenium Actions

Selenium Actions, or in other terms Manipulation, means clicking links and buttons, filling in forms, clearing text fields, pressing keys, dragging and dropping, etc.
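As a rough sketch, the manipulation commands could be combined like this. It assumes the selenium-java dependency and a local Chrome. The page and the locators ("my-text" and the first button) are assumptions based on the Selenium project's demo web form.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.interactions.Actions;

// Sketch of the manipulation commands; page and locators are assumptions.
class ManipulationExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://www.selenium.dev/selenium/web/web-form.html");

            WebElement textBox = driver.findElement(By.name("my-text"));
            textBox.sendKeys("Selenium");            // fill a form field
            textBox.clear();                         // clear the text again
            textBox.sendKeys("WebDriver", Keys.TAB); // type, then press a key

            // The Actions class chains lower-level interactions. Drag and drop
            // follows the same pattern: dragAndDrop(source, target).perform().
            WebElement submit = driver.findElement(By.cssSelector("button"));
            new Actions(driver).moveToElement(submit).click().perform();
        } finally {
            driver.quit();
        }
    }
}
```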


Selenium Wait

Selenium WebDriver Wait is important for the synchronization of our test automation projects, and synchronization is one of the most important parts of writing test automation code. We have to manage the automation speed, wait for web application events, and so on, so it is essential to use timeouts. In this way, we get much more solid and reliable web test automation.
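The idea behind an explicit wait is a polling loop with a timeout: check a condition, and if it is not satisfied yet, sleep briefly and try again until the deadline passes. Selenium ships this as the WebDriverWait class. The sketch below is a plain Java illustration of that idea, not Selenium's implementation, and the names PollingWait and until are hypothetical.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Plain-Java illustration of a polling wait with a timeout.
// This is not Selenium's WebDriverWait; names are hypothetical.
class PollingWait {

    /**
     * Calls the condition repeatedly until it returns a value that is
     * neither null nor Boolean.FALSE, or until the timeout expires.
     */
    static <T> T until(Supplier<T> condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T value = condition.get();
            if (value != null && !Boolean.FALSE.equals(value)) {
                return value; // condition satisfied
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Timed out after " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis()); // pause before the next poll
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated "element" that only becomes available on the third poll.
        int[] polls = {0};
        String result = until(
                () -> ++polls[0] >= 3 ? "element is visible" : null,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println(result + " after " + polls[0] + " polls");
    }
}
```

In a real test, the condition would be something like "this element is visible", and the timeout caps how long a slow page can hold up the run.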


Selenium Frames | Selenium Windows | Selenium Alerts

The domain part covers switching between frames and windows, managing alerts and cookies, and selecting drivers for browsers.


Note: Detailed mind-map is shown at https://www.mindmeister.com/280141421/selenium-2-webdriver-commands

Many thanks to Alan Richardson for this beautiful mind-map.

Thanks for reading.
Onur Baskirt
