Introduction
Testing and quality assurance are critical in all phases of the software development lifecycle. Because manual tests are slow and require a lot of human effort, software development teams prefer to automate as many tests as they can so that the tests can be executed for each build. This is an important investment, as the developers have to write tests that verify every feature they implement.
The main benefit of this automation is the confidence that new versions of the product can be delivered without additional manual tests, and thus more frequently. But what happens when a bug appears in production? Stakeholders’ confidence can erode:
- Business experts feel like the bug wouldn’t have happened if they had done the manual tests thoroughly.
- Developers feel like the time invested in test automation is not worth it, as the automated tests fail to capture the actual business expectations.
This is where behavior-driven development (BDD) comes into play as a collaboration tool that aligns the automated tests with domain experts’ expectations. It clarifies what is tested in a language that all the stakeholders understand. With BDD, domain experts can specify and track what is tested, and developers can implement automated tests that protect observable user behavior instead of validating implementation details.
During my career as a developer, I contributed to projects with different levels of test adoption. I once joined a team where manual testing was the standard and, as a consequence, no developer wanted their code to be modified by anyone else. Introducing test-driven development (TDD) helped a lot, but with the existing code as the single source of documentation, it was not easy to test the right behavior.
I also worked as a test engineer, and our team’s mission was to cover as much code as possible with unit tests. Our productivity was measured by test coverage reports (how many new lines of code were covered by unit tests), so we tested implementation details (mocking dependencies as much as possible) without caring about business value.
In a more recent project, we introduced BDD as a way to write test scenarios in a language that business units can understand, using Cucumber. Developers adopted it quickly, as reading the tests made it easier for us to understand how different variants of business cases were covered (or missed). Yet some stakeholders ended up asking how good our tests were, and why they were still spotting new bugs during their manual tests.
The projects where I felt confident in continuous software delivery had one thing in common: well-written test scenarios contributed by all parties and automated by the developers.
Let us dive into BDD to understand how it works, the common tools that help implement it, example use cases, and how to overcome some challenges.
What Is Behavior Driven Development (BDD)?
BDD emerged in the early 2000s as a refinement of TDD (test-driven development) and ATDD (acceptance test-driven development). It was popularized by developers like Dan North and Liz Keogh.
It encourages a conversation between team members (domain experts, product owners, developers, testers, etc.) to agree on the expected system behavior and to describe it in a ubiquitous language. It starts from business intent (What do we want to build? Which features do we want?) and formalizes the system’s response to different situations.
Just using a tool like Cucumber is not enough to claim you are applying BDD if the system features and use case scenarios are not specified in collaboration with all the stakeholders involved. BDD is not automation-first, but it makes it easier for developers to understand what to automate.
BDD is neither a testing framework nor a replacement for TDD or agile methods. It is a collaboration and specification discipline used to avoid misunderstandings and diverging expectations between technical and business teams.
When Should You Use BDD, and Why
BDD adds real value when a team with both technical and non-technical members builds a system that implements complex business rules. Depending on their position, team members usually see the system from different points of view:
- Business experts and product owners focus on what features are needed and why
- Developers focus on how the features should be implemented
- Testers focus on how the system can be broken
Without a written agreement on what is tested and how, mistrust can quickly take over, and opposition can replace teamwork.
Developers can suspect business experts and testers of running new test scenarios or variants only to demonstrate weaknesses in their implementation.
Business experts can doubt the developers’ ability to understand business rules.
The following is a typical conversation:
Business expert: Why can’t an administrator add a stock item in the system?
Developer: It was never mentioned that administrators could add stock items!
Business expert: I thought it was obvious!
With BDD, the team collaborates to define test scenarios and variants in a document that can then be automated by the developers or testers to ensure expected system behavior is properly maintained across releases.
For long-lasting applications that evolve frequently, the written test scenarios remain as living documentation that survives team turnover and technology shifts.
When it comes to automation, BDD scenarios can be implemented either as unit tests, if the business is interested in the correctness of low-level components (for example, when you build a library of classes that will be used by other developers), or as integration tests, when the implementation of a feature involves multiple components (a web application, for example).
Working on a legacy application, we once upgraded an obsolete dependency, tested major features and deployed it. Our end users then quickly noticed that file download features were broken: no one had thought of testing them! We had a lot of unit tests, so we relied on them, but we were missing tests that covered complete user stories.
However, not every test should be a BDD scenario. In the case of a web application for example, developers might add unit tests for specific classes or functions depending on how they designed the solution. BDD scenarios should not be impacted by any design or implementation choice. They should only protect user journeys and business rules.
How BDD Works: Principles and Lifecycle
Doing BDD implies a team preference for collaboration over handoffs. The specification is not the sole responsibility of the business experts, but the result of conversations between all team members. The product is described outside-in, thinking behavior-first with a shared, domain-specific language. This collaboration shortens the feedback loop on expectations and reduces potential divergence in the way different stakeholders understand the solution.
The lifecycle of a BDD process typically repeats the following steps in a loop:
- Discovery: align on intent and examples, describe features and business rules
- Formulation: express behavior as shared specifications using a domain-specific language
- Automation: protect behavior over time by implementing automated tests. This step turns the BDD outcome into guardrails for feature implementation and future maintenance (bug fixing, refactoring, library upgrades, etc.)
The loop should repeat continuously to accompany the product lifecycle. BDD suits agile processes.
The tools (Cucumber, testing frameworks, etc.) are secondary to the central goal of BDD: shared understanding.
How System Behavior Is Described in BDD
BDD expresses requirements through concrete examples rather than abstract business rules. It describes how the users of the software will achieve their goals in different scenarios.
When software is specified only in general terms, it can hide assumptions. Abstract ideas look clear until you try to apply them. Examples force you to apply the abstract rules, therefore revealing ambiguity and surfacing assumptions.
Good examples are also test scenarios. When written correctly, they can guide manual tests or be automated (with tools like Cucumber) to protect the system behavior over time. A common way to outline your scenarios is the Given-When-Then structure: you start by setting up the context (Given), then describe the steps the user follows to achieve their goal (When), and finally assert the expected results (Then).
BDD is a communication tool first. It is meant to make sure that all the stakeholders are on the same page in every phase of the software lifecycle. Overloading scenarios with multiple behaviors or technical details will make you drift away from that goal. Instead, scenarios must be self-contained, include only the necessary context, and focus on observable outcomes.
How to Implement BDD Testing in Real Teams
Implementing BDD can be hard if all the team members do not agree to share the same language. Developers can find it too abstract if it doesn’t rely on concrete examples. Business experts can feel uncomfortable if you use technical words. Using vocabulary and grammar that everyone can speak is key. The following steps can help you put it in place.
Step 1: Identify and Agree on Desired Behavior
Frame work in terms of observable user outcomes. Avoid jumping straight to technical solutions.
Step 2: Write Clear User Stories and Acceptance Criteria
User stories provide the context and intent of a feature. They answer the “what” and “why” questions. They are accompanied by acceptance criteria that provide testable behavior, telling how a good implementation of the user story can be verified.
Avoid writing technical tasks as stories, as this might exclude business people. Also avoid vague or unstable acceptance criteria, as they might rely on business experts’ assumptions that will probably not match the developers’ understanding.
Example user story: As an administrator, I want to create a customer’s account so they can log in and use our platform.
Acceptance criteria: The customer’s data (First name, last name, email) is saved. The customer receives a registration link by email.
Step 3: Review and Refine Scenarios Collaboratively
Collaboration is very important here. No matter who wrote the user story and acceptance criteria, a cross-functional team should meet in a conversation to clarify and refine test scenarios. Different variants (both successful and unsuccessful) should be stated here.
As an example, the above user story can be refined into the following scenarios. We use the Given-When-Then structure here:
Feature: Account creation

  Scenario: Successful creation of a customer’s account
    Given I am an administrator of the platform
    When I submit a new customer’s account with first name, last name and email
    Then The new account should be saved
    And The customer should receive a registration link by email

  Scenario: Failure to create a customer with duplicate email
    Given I am an administrator of the platform
    And A customer’s account was already created
    When I submit a new customer’s account with first name, last name and same email as existing customer
    Then The request should be rejected
    And The new account should not be saved
    And No email should be sent
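As a minimal sketch of how these two scenarios could be automated, the following Python example runs them against an in-memory service. The names AccountService, FakeMailer and DuplicateEmailError are hypothetical and exist only for this illustration; in a real project, the Given/When/Then steps would typically be bound to the scenario text by a tool such as Cucumber or Behave.

```python
from dataclasses import dataclass, field


class DuplicateEmailError(Exception):
    """Raised when an account already exists for the given email."""


@dataclass
class FakeMailer:
    """Hypothetical stand-in for an email gateway; records what was sent."""
    sent: list = field(default_factory=list)

    def send_registration_link(self, email: str) -> None:
        self.sent.append(email)


@dataclass
class AccountService:
    """Hypothetical in-memory implementation, for illustration only."""
    mailer: FakeMailer
    accounts: dict = field(default_factory=dict)

    def create_account(self, first_name: str, last_name: str, email: str) -> None:
        if email in self.accounts:
            raise DuplicateEmailError(email)
        self.accounts[email] = (first_name, last_name)
        self.mailer.send_registration_link(email)


def scenario_successful_creation() -> AccountService:
    # Given I am an administrator of the platform
    service = AccountService(mailer=FakeMailer())
    # When I submit a new customer's account
    service.create_account("Ada", "Lovelace", "ada@example.com")
    # Then the new account should be saved
    assert service.accounts["ada@example.com"] == ("Ada", "Lovelace")
    # And the customer should receive a registration link by email
    assert service.mailer.sent == ["ada@example.com"]
    return service


def scenario_duplicate_email() -> AccountService:
    # Given I am an administrator of the platform
    service = AccountService(mailer=FakeMailer())
    # And a customer's account was already created
    service.create_account("Ada", "Lovelace", "ada@example.com")
    emails_before = list(service.mailer.sent)
    # When I submit a new account with the same email
    rejected = False
    try:
        service.create_account("Augusta", "King", "ada@example.com")
    except DuplicateEmailError:
        rejected = True
    # Then the request should be rejected
    assert rejected
    # And the new account should not be saved
    assert service.accounts["ada@example.com"] == ("Ada", "Lovelace")
    # And no email should be sent
    assert service.mailer.sent == emails_before
    return service
```

Each function sets up its own context, so the comments map one-to-one to the scenario steps and the assertions only look at observable outcomes (saved accounts, sent emails).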
Step 4: Automate Tests for High-Value Behaviors
Once the team agrees on the scenarios, developers can proceed with automation. The automated tests should accurately cover the scenarios that protect critical user journeys and business rules. Depending on the team’s workload and the deadlines, the time to invest in test automation can be limited. That is why it is important to prioritize (in collaboration with the business team) high-risk flows over low-value scenarios.
Avoid UI-coupled and brittle tests. You are not testing how features are implemented. The BDD automated tests should survive code refactorings and UI redesigns to make sure user journeys are not broken. In other words, your tests should not fail if the implementation changes. They should only fail if the wrong behaviour is implemented.
Each test scenario should be executed in isolation, without any dependency on another scenario. This allows you to choose which tests run in which configuration (light tests on every commit; long-running, resource-consuming tests in nightly builds) and to speed up your builds with parallel test execution. It also simplifies diagnosing and fixing issues.
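One way to get this isolation, sketched here with Python’s standard unittest module, is to rebuild the context before every scenario. StockRepository and the scenario names are hypothetical; the point is only that each test creates the state it needs instead of relying on what another test left behind.

```python
import unittest


class StockRepository:
    """Hypothetical in-memory store, used only for this illustration."""

    def __init__(self) -> None:
        self.items = {}

    def add(self, name: str, quantity: int) -> None:
        self.items[name] = self.items.get(name, 0) + quantity


class StockScenarios(unittest.TestCase):
    def setUp(self) -> None:
        # Runs before every scenario: each one gets its own empty repository.
        self.repository = StockRepository()

    def test_adding_a_new_item(self) -> None:
        self.repository.add("keyboard", 5)
        self.assertEqual(self.repository.items, {"keyboard": 5})

    def test_adding_to_an_existing_item(self) -> None:
        # The Given step builds its own context instead of relying on
        # state left behind by test_adding_a_new_item.
        self.repository.add("keyboard", 5)
        self.repository.add("keyboard", 2)
        self.assertEqual(self.repository.items, {"keyboard": 7})


def run_in_order(names: list) -> bool:
    """Run the named scenarios in the given order; True if all of them pass."""
    suite = unittest.TestSuite(StockScenarios(name) for name in names)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Because no scenario depends on another, run_in_order succeeds whichever order the names are passed in, which is the property that makes subset selection and parallel execution safe.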
Process Recommendation: Integrate BDD into Agile and CI/CD
BDD test scenarios can be used as the definition of done (DoD) for agile teams. That way, they are maintained as a transparent, shared agreement on the completion criteria. Backlog items should all be assigned relevant BDD scenarios as acceptance criteria.
During backlog refinement, BDD scenarios improve the team’s alignment on what needs to be done for each backlog item and therefore make it easier to agree on priorities.
To confidently ship features and bug fixes as soon as they are implemented, behavior-focused tests should be integrated into CI/CD pipelines, removing the need for manual tests before each release.
BDD Tools and Frameworks
Using the right tool can smooth the automation of BDD scenarios as unit or integration tests. Many options are available from which the developers or testers can choose based on the language ecosystem, team’s experience, and CI/CD compatibility.
As a reminder, tools are used to support BDD, not to define it. In other words, just using a BDD framework doesn’t mean that you are doing BDD.
The following are some popular open source frameworks used to automate BDD scenarios:
Cucumber
With Cucumber, you can write your BDD scenarios in plain English or any other supported language using the Gherkin syntax. Libraries are available to implement each step of the scenarios for the most popular platforms, like the JVM (Java virtual machine), Node.js, Ruby, .NET and many more.
Behave
Behave is a Python framework that can be used to automate BDD scenarios. It also supports the Gherkin syntax.
JBehave
JBehave is a Java testing framework for BDD that can automate scenarios written in plain English with a custom syntax (similar to Gherkin).
RSpec
RSpec is a Ruby testing framework that can be used to write tests in a BDD style. It can also be combined with Cucumber to separate the definition of scenarios from their implementation.
Real-world BDD Testing Examples and Lessons
BDD for Common User-Facing Features
To illustrate how even simple features benefit from clear behavioral definitions, let us consider a very common use case in social media platforms: Liking a post.
It can seem straightforward with a simple user story:
As a user, I want to like a post so that I can show appreciation
But different people in the team might have conflicting assumptions:
- The developers might assume that clicking the “Like” button will increase the likes counter for the post
- The testers might assume that the “Like” button should be clickable only once.
- The product owner might assume that clicking the “Like” button a second time should revert the first action
Such diverging assumptions can lead to multiple iterations for a simple feature.
When applying BDD, the team can start by agreeing on clear acceptance criteria. For example:
- User can like a post by clicking the “Like” button
- The like should be persisted
- A user can only like a post once
- Clicking again removes the like (toggle behavior)
Scenario definitions can then be written to clarify the behaviour. The following example is written with the Gherkin syntax:
Feature: Like a post
  As a user
  I want to like a post
  So that I can show appreciation

  Scenario: User likes a post successfully
    Given a post is displayed
    And the user has not liked the post
    When the user clicks the "Like" button
    Then the post should be marked as liked
    And the like should be persisted

  Scenario: User removes their like from a post
    Given a post is displayed
    And the user has already liked the post
    When the user clicks the "Like" button again
    Then the post should be marked as not liked
    And the like removal should be persisted
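To make the agreed toggle behavior concrete, here is a small Python sketch of a model that satisfies the acceptance criteria above. LikeService and its methods are hypothetical names chosen for this illustration, and persistence is reduced to an in-memory set.

```python
class LikeService:
    """Hypothetical in-memory model of the like feature."""

    def __init__(self) -> None:
        # Each (user_id, post_id) pair is stored at most once,
        # so a user can only like a given post once.
        self._likes = set()

    def toggle_like(self, user_id: str, post_id: str) -> bool:
        """Like the post, or remove the like if it already exists.

        Returns True if the post is liked by the user after the call.
        """
        key = (user_id, post_id)
        if key in self._likes:
            self._likes.remove(key)
            return False
        self._likes.add(key)
        return True

    def is_liked(self, user_id: str, post_id: str) -> bool:
        return (user_id, post_id) in self._likes

    def like_count(self, post_id: str) -> int:
        return sum(1 for _, liked_post in self._likes if liked_post == post_id)


def run_like_scenarios() -> LikeService:
    service = LikeService()
    # Scenario: User likes a post successfully
    service.toggle_like("alice", "post-1")
    assert service.is_liked("alice", "post-1")
    assert service.like_count("post-1") == 1
    # Scenario: User removes their like from a post
    service.toggle_like("alice", "post-1")
    assert not service.is_liked("alice", "post-1")
    assert service.like_count("post-1") == 0
    return service
```

Storing likes as a set of pairs is one possible design: it makes "only once" a property of the data structure rather than something each caller has to check.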
The team can add more acceptance criteria and scenarios if needed to mitigate ambiguity and prevent costly reworks.
Full-Stack BDD Scenarios
Automating BDD scenarios can be limiting when developers are restricted to a single layer of the stack (frontend or backend). Suppose the application is implemented with a web frontend (HTML and JavaScript) that connects to a backend (Java, Spring Boot) through a REST API. The frontend developer or tester can assert that the frontend sends the expected data to the backend and displays the response properly. The backend developer, on the other hand, can only assert that the service processes the request as expected (including side effects like API calls and data persistence) and returns the right output.
In the case of an access restriction scenario for example, the backend test will just assert that the API returns a 401 HTTP code while the frontend test will assert that an appropriate message is displayed when that output is returned.
It might seem that both tests are enough to protect the user behaviour, but the frontend test relies heavily on mocking and thus validates the implementation of the system more than the expected behaviour. Without broader validation, integration risks might surface late, either because the contract between front end and back end was misunderstood or because any party can introduce a breaking change that can cause an incompatibility between the layers of the stack.
The good news is that with modern tooling, developers can automate the BDD scenarios end to end with reasonable effort, covering the full stack with the same test automation. I enjoyed working with the following tools:
- Cucumber for behavior specifications
- Playwright for browser-level validation
- Spring MockMvc for backend integration
- Testcontainers to validate persistence against a real database
- WireMock to simulate/validate external API calls
It goes without saying that full-stack scenarios complement unit tests rather than replace them. Always look for opportunities to cover complex business logic with unit tests.
Common Challenges and Misconceptions in BDD
Applying BDD can become difficult if the team doesn’t align properly on the process and the expected outcomes. The following are some known challenges and misconceptions, and how you can mitigate them.
Over-mocking and brittle tests
Over-mocking in your test automation code can be a sign that you are testing implementation details instead of business value. You might be asserting how components are integrated instead of what the system is expected to do. Such tests not only fail to validate the behavior, but they also make it hard to refactor your code.
Prefer realistic end-to-end tests when validating the behavior. Such tests can better automate the BDD scenarios and become your safety net for internal system changes and refactorings.
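The difference can be sketched in a few lines of Python using the standard unittest.mock module. Everything here is hypothetical (InMemoryStock, add_item_v1, add_item_v2): the two add_item functions have the same observable behavior, and the second stands for a harmless refactoring of the first.

```python
from unittest.mock import Mock


class InMemoryStock:
    """Hypothetical repository, used only for this illustration."""

    def __init__(self) -> None:
        self.names = []

    def save(self, name: str) -> None:
        self.names.append(name)

    def save_many(self, names: list) -> None:
        self.names.extend(names)


def add_item_v1(stock, name: str) -> None:
    stock.save(name)


def add_item_v2(stock, name: str) -> None:
    # Refactored: same observable behavior, different internal call.
    stock.save_many([name])


def mock_based_test(add_item) -> None:
    # Asserts HOW the collaborator is called (an implementation detail).
    stock = Mock()
    add_item(stock, "keyboard")
    stock.save.assert_called_once_with("keyboard")


def behavior_based_test(add_item) -> None:
    # Asserts WHAT can be observed: the item is in stock afterwards.
    stock = InMemoryStock()
    add_item(stock, "keyboard")
    assert "keyboard" in stock.names


def passes(test, add_item) -> bool:
    """Return True if the test passes for the given implementation."""
    try:
        test(add_item)
        return True
    except AssertionError:
        return False
```

With these definitions, behavior_based_test passes for both versions, while mock_based_test passes for add_item_v1 and fails for add_item_v2 even though nothing changed from the user’s point of view.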
Coverage-driven testing
Coverage rate can be a good indicator of how many lines of code of an application are executed by tests, but it doesn’t tell how well the application is actually tested. Having test coverage as a target can lead developers to practices like over-mocking and brittle tests.
When applying BDD, the developers should cover as many scenarios as possible, starting with business-critical flows. They should focus on protecting more business value instead of covering more lines of code.
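The gap between coverage and actual validation can be illustrated with a deliberately buggy function. The business rule (orders of 50 or more ship for free) and all names below are invented for this sketch.

```python
def shipping_fee(order_total: float) -> float:
    """Hypothetical rule: orders of 50 or more ship for free.

    The comparison is deliberately wrong (> instead of >=).
    """
    if order_total > 50:
        return 0.0
    return 4.99


def coverage_oriented_test() -> None:
    # Executes both branches, so every line of shipping_fee is covered,
    # but nothing is asserted about the results.
    shipping_fee(10)
    shipping_fee(100)


def behavior_oriented_test() -> None:
    # Scenario: an order of exactly 50 ships for free.
    assert shipping_fee(50) == 0.0


def passes(test) -> bool:
    """Return True if the test completes without a failed assertion."""
    try:
        test()
        return True
    except AssertionError:
        return False
```

The coverage-oriented test passes and reports every line of shipping_fee as covered, yet only the scenario derived from the business rule exposes the boundary bug.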
Unrealistic stakeholder expectations or poor communication
Tests (whether automated or manual) can reduce risk but cannot guarantee zero bugs. This has to be made clear to everyone; otherwise, some stakeholders may start distancing themselves from the process as new bugs are discovered.
BDD helps with proper bug fixing, as the team can define scenarios to reproduce issues before the developers try to fix them. With the automated scenarios in place, the team can continuously deploy new versions of the application without fear of regression.
With BDD, bugs are not the sole responsibility of developers. When applied well, it improves feedback loops and shared trust.
Making BDD Work Beyond Theory
BDD is not about writing more tests; it is about validating the right behavior. It cannot be limited to the usage of a tool like Cucumber, but requires a continuous conversation between stakeholders to maintain a description of the system’s behavior without assumptions. That behavior becomes the contract that survives refactoring, system changes and staff turnover.
You know BDD works when teams stop asking “Did we test the code?” and start asking “Did we validate the user’s behavior?”