Property-Based Testing in C#: Breaking Your Assumptions with FsCheck and xUnit

1 The Paradigm Shift: From Verifying Examples to Enforcing Laws

Most teams rely heavily on example-based tests. They write a few “happy path” tests, add some edge cases they can think of, and assume the important scenarios are covered. But software rarely breaks on the inputs we expect. It usually breaks when multiple variables line up in a way no one anticipated.

Property-Based Testing (PBT) shifts the focus from verifying specific examples to defining rules—or “laws”—that must hold for every valid input. Instead of asking “Does this pass for these three test cases?” we ask “What should always be true?”

FsCheck brings this style of testing directly into C# and xUnit. It gives developers the power of mathematical properties without forcing them to change their workflows or tools.

1.1 The Limitation of “Happy Path” and “Known Edge Case” Testing

1.1.1 Why human imagination is the bottleneck in unit testing.

Traditional unit tests rely on developers to come up with inputs. Even experienced teams tend to choose predictable values: small integers, obvious strings, one or two null cases, and a couple of “strange” inputs just to be safe. These cases come from habit, not from exploring the full range of possible values.

But real systems operate on far more combinations than developers can imagine. A method that takes two int values technically accepts about 18 quintillion (2^64) combinations. A date/time parser may behave differently depending on culture, format, and time zone. And any function that deals with user input can encounter thousands of variations that never appear in example tests.

No matter how thorough the team is, example-based testing only covers places the human mind knows to look. That limitation becomes the biggest bottleneck.

1.1.2 The concept of “The Black Swan”: Handling inputs you never anticipated.

A “black swan” failure is the kind of bug that only shows up in production—usually because the system processed an input pattern nobody saw coming. These failures are expensive. They trigger incident calls, emergency patches, and awkward explanations to customers.

PBT helps catch these cases early. Instead of trying to imagine every corner case, we describe the rule the system must follow. FsCheck then generates a wide variety of inputs, including values developers would never manually write:

  • Unusual Unicode strings
  • Zero and negative values
  • Massive values
  • Misordered dates
  • Strange combinations of state

By offloading input creativity to FsCheck, developers focus on defining correctness instead of generating test data.

1.2 Defining Property-Based Testing (PBT)

1.2.1 Mapping inputs (x) to output properties (f(x)) rather than specific results.

Property-based testing treats your code like a function: for any valid input x, certain things must always hold about f(x). A property is not a single expected value—it’s a general rule.

Common examples include:

  • Sorting a list should always produce an ordered list.
  • Reversing a sequence twice should return the original.
  • Adding items to a cart should never decrease the total.

For example:

[Property]
public bool ReverseTwiceReturnsOriginal(string s)
    => string.Concat(s.Reverse().Reverse()) == s;

FsCheck feeds this property hundreds of strings—short, long, empty, invalid, Unicode-heavy—to verify the rule holds under a wide range of inputs. This is far more coverage than you get from a handful of manually chosen examples.

1.2.2 The core components: Generators, Properties, and Shrinkers.

Every PBT framework—including FsCheck—centers around three key components:

  1. Generators: produce random but structured inputs. FsCheck knows how to build primitive types, collections, and custom domain objects.

  2. Properties: rules your code must satisfy. FsCheck evaluates each property many times with different data.

  3. Shrinkers: when a failure occurs, FsCheck doesn’t just report the failing input; it reduces it to the smallest possible input that still causes the failure.

    If a function fails on a huge 10,000-character string, FsCheck may shrink it down to:

    • "*" or
    • "a" or even
    • ""

    Having the minimal failing case makes debugging dramatically easier.
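To see shrinking in action, a deliberately falsifiable property is enough. This is a sketch (the class and property names are invented for illustration); it assumes the FsCheck.Xunit package is installed:

```csharp
using System.Linq;
using FsCheck.Xunit;

public class ShrinkingDemo
{
    // Deliberately falsifiable: any generated array containing a zero or a
    // negative number fails, and FsCheck then shrinks the counterexample
    // toward the smallest failing array.
    [Property]
    public bool AllElementsArePositive(int[] xs)
        => xs.All(x => x > 0);
}
```

FsCheck typically reports a very small failing array here, such as a single-element array holding zero, rather than whatever large input it first stumbled on.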

1.3 The Business Case for PBT in Enterprise Applications

1.3.1 ROI analysis: High initial setup cost vs. long-term bug prevention.

PBT requires developers to think differently: identify invariants, express rules carefully, and trust generated inputs. This takes some practice. But once a team gets comfortable, the benefits are significant:

  • Fewer production issues caused by unexpected input combinations.
  • More stable tests—properties don’t break often during refactoring.
  • Broader test coverage without writing dozens of individual test cases.
  • More confidence when modifying core business logic.

Complex, rule-heavy domains—billing, policy evaluation, discount engines, financial calculations—gain the most. These systems evolve frequently, and example tests quickly become stale. Properties capture the rules, not the symptoms, so they remain useful as the code evolves.

1.3.2 How PBT acts as executable documentation for Domain Invariants.

Enterprise domains rely heavily on invariants—business rules that must always hold. Examples include:

  • Discounts must not exceed the subtotal.
  • Credit scores must stay within defined limits.
  • Time periods must never overlap.
  • Operations like merging or calculating totals must behave predictably.

Embedding these rules as properties turns them into executable documentation. The rules live next to the code, remain accurate, and self-validate through automated testing. When implementation changes, the properties ensure you don’t accidentally break business expectations.
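As a sketch of what that looks like, here is the first rule above expressed as a property. PricingPolicy and ApplyDiscount are hypothetical names standing in for your own domain code; NonNegativeInt is FsCheck's built-in wrapper for non-negative test values:

```csharp
using System;
using FsCheck;
using FsCheck.Xunit;

// Hypothetical domain code: the discount is capped at the subtotal,
// so the payable amount can never go negative.
public static class PricingPolicy
{
    public static decimal ApplyDiscount(decimal subtotal, decimal discount)
        => subtotal - Math.Min(discount, subtotal);
}

public class PricingInvariants
{
    // Executable form of "Discounts must not exceed the subtotal."
    [Property]
    public bool PayableAmountIsNeverNegative(NonNegativeInt subtotal, NonNegativeInt discount)
        => PricingPolicy.ApplyDiscount(subtotal.Get, discount.Get) >= 0m;
}
```

If a refactoring later removes the cap, this property fails immediately, with a shrunk counterexample showing the smallest subtotal/discount pair that breaks the rule.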

1.4 The Tooling Landscape in .NET

1.4.1 FsCheck: The mature industry standard (Port of Haskell’s QuickCheck).

FsCheck is the most widely used PBT framework in .NET. It’s inspired by Haskell’s QuickCheck but designed to fit naturally into C# and F#. It offers:

  • Clean integration with xUnit, NUnit, and MSTest
  • A rich library of generators
  • Advanced shrinking logic
  • First-class support for domain-specific Arbitraries
  • Model-Based Testing for stateful systems

It’s written in F#, but works smoothly in C# projects.

1.4.2 Hedgehog: The modern alternative (Introduction to range-based shrinking).

Hedgehog takes a more modern approach by combining generation and shrinking in a single mechanism called range-based shrinking. This gives:

  • More predictable shrinking
  • Better results when generating recursive structures
  • More control over size and complexity

The tradeoff is smaller ecosystem support and less mature test framework integration compared to FsCheck.

1.4.3 Why we are choosing FsCheck v3 with xUnit for this guide.

FsCheck v3 remains the most practical option for C#. It’s mature, stable, well-supported, and works naturally with xUnit’s attribute-based style. For most teams, it offers the right balance of power, predictability, and ease of use.

For that reason, all examples in this guide use FsCheck v3 paired with xUnit.


2 Foundation and Configuration: Integrating FsCheck with xUnit

FsCheck fits naturally into xUnit because it uses the same attribute-driven style developers are already familiar with. Once set up, a property test runs just like any other test—except it runs many times with automatically generated inputs. xUnit’s runner then displays the generated values, shrinking steps, and replay seeds when something fails. This section walks through what you need to install, how properties are discovered, and how to understand the output.

2.1 Project Setup and NuGet Dependencies

2.1.1 FsCheck.Xunit: The bridge between the runner and the library.

To use FsCheck inside an xUnit project, add the integration package:

dotnet add package FsCheck.Xunit

This package gives you:

  • The [Property] attribute, which marks a test as a property-based test.
  • Automatic discovery by xUnit, just like [Fact].
  • Output integration so seeds, generated inputs, and shrink results show up in your test logs.

FsCheck itself comes bundled as a dependency, so there’s nothing extra to configure.

2.1.2 Configuring the Test Runner for determinism (Seeds).

Because property-based tests generate random inputs, reproducing a rare failure can be difficult—unless you use seeds. When a property fails, FsCheck prints a replay seed (the exact output format varies by version):

REPLAY: 12, 987654321

These two numbers represent the exact state of the random generator for that test run. To reproduce the failure, copy the seed into your property:

[Property(MaxTest = 200, Replay = "12,987654321")]
public void MyProperty(int x) { ... }

Now the property will run deterministically and fail the same way every time. This is especially important in CI/CD, where you want failures to be repeatable rather than mysterious one-offs.

2.2 Anatomy of a Property Test

2.2.1 Replacing [Fact] with [Property].

A property test looks almost identical to a normal xUnit test. Here’s a simple example:

public class SampleProperties
{
    [Property]
    public bool AdditionIsCommutative(int a, int b)
        => a + b == b + a;
}

The only real difference is the attribute. Instead of running once with fixed inputs, this test runs many times with random values for a and b.

2.2.2 Understanding the default configuration (100 runs, default timeouts).

Out of the box, FsCheck:

  • Runs each property 100 times with different inputs.
  • Shrinks failing inputs automatically.
  • Uses sensible defaults for size and complexity.
  • Applies timeouts to avoid runaway tests.

If you need more or fewer runs, you can tune the configuration:

[Property(MaxTest = 500, EndSize = 200)]
public bool CustomConfig(string s)
    => s == s;

Most of the time, the defaults work well, and you only adjust these settings when a property becomes expensive.

2.3 The “Hello World” of Invariants (The String Reversal Example)

2.3.1 Proving Reverse(Reverse(s)) == s.

A great first property to learn with is the double-reverse rule:

[Property]
public bool DoubleReverseReturnsOriginal(string s)
{
    var twice = new string(s.Reverse().Reverse().ToArray());
    return twice == s;
}

FsCheck throws a wide range of strings at this—from empty and short strings to long Unicode-heavy inputs. If every one of them respects the invariant, the property passes.

2.3.2 Interpreting the Test Output: Pass, Fail, and Arguments.

When the property holds for every generated value, FsCheck prints:

Ok, passed 100 tests.

If a failure occurs, the output becomes more detailed. You’ll see:

  • The failing argument (after shrinking).
  • How many tests ran before the failure.
  • How many shrinking steps were applied.
  • A replay seed.

A typical failure might look like:

Falsifiable, after 4 tests (3 shrinks) (s = "a\0")

This tells you everything you need to reproduce the issue locally and debug it quickly.

2.4 Understanding “Shrinking” (The Killer Feature)

2.4.1 How FsCheck reduces a 10,000-character failure string to the minimal 2-character reproduction case.

Shrinking is one of the most useful features of FsCheck. When something fails, FsCheck doesn’t stop there—it keeps simplifying the input until it finds the smallest version that still breaks your code.

For example, imagine your function only fails when given a huge input like:

"aaaaaaaa...aaaaβ"

FsCheck will keep simplifying the string:

  1. Shortening it.
  2. Replacing characters with simpler ones.
  3. Removing patterns.
  4. Minimizing structure.

Eventually, the failing case might shrink down to:

"β"

or even:

"\u0000"

This minimal counterexample is usually far easier to diagnose than the original monstrously long input.

2.4.2 Visualizing the shrinking tree.

You can think of shrinking as exploring a tree of simpler inputs. Each “branch” is a simpler version of the previous input. A toy example:

"abcXYZ123"
 ├─ "XYZ123"
 ├─ "123"
 ├─ "3"
 └─ ""

FsCheck walks this tree and stops as soon as it finds the smallest input that still reproduces the failure. You don’t see the full tree, but understanding that this process is happening helps you reason about how FsCheck finds such precise failing cases.


3 Mastering Generators: Modeling Complex Data Structures

Generators are the core of property-based testing. They decide what kind of inputs your properties evaluate against. The default generators for numbers, strings, and simple collections work well for small utilities, but most real systems operate on richer, structured types. If you want meaningful tests, your generators need to reflect the shape and rules of your domain. FsCheck’s Gen and Arb APIs give you full control over how domain objects are created, ensuring your properties run against realistic data instead of loosely shaped primitives.

3.1 Beyond Primitives: Why int and string are not enough

3.1.1 The problem with “Primitives Obsession” in PBT.

If you only generate primitive values—int, string, bool—your tests rarely match the complexity of the domain you’re trying to verify. Real applications use types like:

  • EmailAddress
  • Money
  • UserId
  • CartLineItem
  • ValidDateRange

These types encode rules: an email must include @, money must be non-negative, quantities must be positive, and date ranges must have a valid start–end relationship. Testing domain logic with raw primitives ignores these constraints and leads to unrealistic inputs that don’t exercise the real shape of the system.

FsCheck encourages you to generate full domain objects, not primitive parts of them. This gives you better coverage and removes the noise created by invalid or meaningless test values.
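As a minimal sketch, a self-validating wrapper type might look like this. The @-based validity rule is deliberately simplistic, and the later generator examples assume a similar EmailAddress type exists:

```csharp
using System;

// A small self-validating domain type, standing in for a real one.
public sealed record EmailAddress
{
    public string Value { get; }

    public EmailAddress(string value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }

    // Deliberately simple rule for illustration: an '@' somewhere
    // strictly inside the string.
    public bool IsValid()
        => Value.Contains('@') && !Value.StartsWith('@') && !Value.EndsWith('@');
}
```

Generating EmailAddress values directly, rather than raw strings, means every property that takes one starts from input that already respects the domain's shape.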

3.2 Custom Generators with Gen

3.2.1 Gen.Choose, Gen.Elements, and Gen.Constant.

The Gen module gives you direct control over how values are built. Some of the most useful building blocks are:

  • Gen.Choose(min, max) — generates integers within a specific range.
  • Gen.Elements(...) — randomly selects a value from a predefined set.
  • Gen.Constant(x) — always yields the same value.

Here’s a simple example that creates something resembling an email:

public static Gen<string> EmailGen =>
    // RandomString is an assumed helper that builds an n-character
    // alphanumeric string; in real code, derive it from Gen itself
    // (e.g. Gen.Elements over a character set) so replay stays deterministic.
    from user in Gen.Choose(3, 10).Select(n => RandomString(n))
    from domain in Gen.Elements("example.com", "test.net")
    select $"{user}@{domain}";

This is already a step up from random strings. You produce values that look like user-generated email addresses while still keeping things unpredictable.

3.2.2 Composition: Using LINQ (Select/SelectMany) to chain generators.

Generators are designed to compose. LINQ query expressions let you build complex generators in a way that feels familiar to C# developers. For example, suppose you want to create realistic OrderLine values:

public static Gen<OrderLine> OrderLineGen =>
    from id in Gen.Choose(1, 9999)
    from qty in Gen.Choose(1, 100)
    from price in Gen.Choose(1, 500)
    select new OrderLine(id, qty, price);

This reads like describing how to construct the domain object, which means the generator mirrors the actual business structure of your data.

3.3 Arbitraries (Arb): Registering Generators globally

3.3.1 Creating specific Arbs for Domain Types (e.g., Arb<EmailAddress>).

An Arb<T> wraps a generator and a shrinker for a given type. This lets FsCheck know how to both construct and shrink custom domain values.

For example:

public static class EmailArb
{
    public static Arbitrary<EmailAddress> Email() =>
        Arb.From(EmailGen.Select(x => new EmailAddress(x)));
}

Once this arbitrary is registered, FsCheck automatically uses it whenever a property requires an EmailAddress. This ensures consistent, domain-valid input throughout your test suite.

3.3.2 Registering classes using [Properties(Arbitrary = typeof(MyArbs))].

To make Arbitraries available to a test class, annotate it like this:

[Properties(Arbitrary = new[] { typeof(EmailArb) })] // the class from 3.3.1
public class EmailTests
{
    [Property]
    public bool EmailIsValid(EmailAddress email)
        => email.IsValid();
}

This integrates smoothly with xUnit and keeps your property tests expressive without drowning them in boilerplate generator code.

3.4 Conditional Generation vs. Filtering

3.4.1 The performance trap of Gen.Where (Discarding inputs).

Gen.Where filters values after they are generated:

var validAges = Arb.Generate<int>().Where(age => age >= 18);

This seems convenient, but it usually harms performance. If only a small percentage of values pass the filter, FsCheck wastes most of what it generates. In extreme cases, it may struggle to produce enough valid values and slow down the test significantly.

3.4.2 Constructive generation: Building valid data by design rather than filtering invalid data.

A better pattern is to build validity into the generator itself. Instead of creating all integers and filtering out the invalid ones, generate only valid values:

var validAges = Gen.Choose(18, 120);

Constructive generation produces cleaner tests, avoids unnecessary waste, and keeps your properties running fast—even when they rely on complex domain types.
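The same idea scales to composite types. A sketch, assuming a CartItem(int Quantity, decimal Price) record and FsCheck's LINQ extensions over Gen (depending on your FsCheck version, these may live in the FsCheck.Fluent namespace):

```csharp
using FsCheck;

public record CartItem(int Quantity, decimal Price);

public static class CartGenerators
{
    // Every generated item is valid by construction: a positive quantity
    // and a positive price expressed in whole cents, so nothing is
    // generated only to be thrown away by a filter.
    public static Gen<CartItem> CartItemGen =>
        from qty in Gen.Choose(1, 20)
        from cents in Gen.Choose(1, 50_000)
        select new CartItem(qty, cents / 100m);
}
```

Because the quantity and price ranges encode the domain rules directly, the generator never wastes effort producing invalid carts.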

3.5 Generating Recursive Data Types

3.5.1 Strategies for generating Trees, Graphs, and composite JSON objects without stack overflows.

Generating recursive structures is trickier. If you aren’t careful, you’ll end up with infinite recursion or massive nested objects. FsCheck prevents this by using size parameters, which you can control with the Sized combinator.

For example, here’s how you can safely generate a tree:

public static Gen<Tree<T>> TreeGen<T>(Gen<T> valueGen)
{
    return Gen.Sized(size =>
        size == 0
            ? Gen.Constant(new Leaf<T>())
            : from v in valueGen
              from children in Gen.ListOf(TreeGen(valueGen).Resize(size / 2))
              select new Node<T>(v, children));
}

Patterns to remember:

  • Always use Sized to control recursion depth.
  • Reduce the size parameter on recursive calls.
  • Prefer multiple small children over one extremely large structure.

These techniques produce realistic nested shapes—trees, JSON-like objects, or hierarchical domain models—without blowing up the stack.


4 Real-World Implementation: Testing Domain Logic and Business Invariants

Property-based testing becomes most valuable when teams apply it to real business logic—especially in places where calculations, rules, and conditional flows interact in subtle ways. Enterprise systems often accumulate assumptions that no one notices until they fail in production. This section takes the concepts from earlier sections and applies them to a realistic example: an e-commerce discount engine. It shows how PBT exposes hidden assumptions, weak spots, and edge cases that are difficult to catch with traditional tests.

4.1 Case Study: An E-Commerce Discount Engine

4.1.1 Setup: A complex class calculating cart totals with tiered discounts and tax.

Imagine a discount engine that applies tiered discounts, optional promo codes, and tax. Even a simplified version contains logic that can easily break under unusual input. For this example, assume these rules:

  • Subtotal ≥ 100 → 10% discount
  • Subtotal ≥ 250 → 15% discount
  • A valid discount code gives an additional 5% off
  • Tax is applied after all discounts
  • All totals use decimal for accuracy

Here’s a clean, minimal implementation:

public class DiscountEngine
{
    public decimal CalculateTotal(IReadOnlyList<CartItem> items, string discountCode)
    {
        decimal subtotal = items.Sum(i => i.Quantity * i.Price);

        decimal tierDiscount =
            subtotal >= 250 ? 0.15m :
            subtotal >= 100 ? 0.10m : 0m;

        decimal codeDiscount = discountCode == "SAVE5" ? 0.05m : 0m;

        decimal discounted = subtotal * (1 - tierDiscount - codeDiscount);

        if (discounted < 0) discounted = 0;

        decimal tax = discounted * 0.07m;

        return discounted + tax;
    }
}

public record CartItem(int Quantity, decimal Price);

Even though this code is simple, it hides assumptions: item order shouldn’t matter, totals should never be negative, and applying the same discount twice shouldn’t produce a different result. These are exactly the kinds of rules PBT helps validate.

4.2 Identifying the Properties (The “Hard Part”)

Finding the right properties is always the most challenging part of PBT. A good property expresses a rule the business expects to hold for all valid inputs. It shouldn’t describe how the method works internally—it should describe what must always remain true no matter how the implementation changes. For this discount engine, several strong properties emerge.

4.2.1 Idempotence: Applying a discount code twice shouldn’t change the price (f(f(x)) = f(x)).

If you apply the same discount logic twice, the result should not change. Nothing about the discount rules suggests they should compound. This makes idempotence a natural property:

[Property]
public bool TotalIsIdempotent(List<CartItem> items, string code)
{
    var engine = new DiscountEngine();

    var first = engine.CalculateTotal(items, code);
    var second = engine.CalculateTotal(items, code);

    return first == second;
}

Because both calls run against the same engine instance, this catches hidden state: caches, accumulators, or code paths that apply bonuses on the second invocation. If the implementation accidentally compounds discounts across calls, shrinking will quickly reveal a tiny cart that reproduces the error.

4.2.2 Commutativity: The order of items in the cart shouldn’t change the total (a + b = b + a).

The total cost of a cart should depend on the items—not the order they’re listed in. If reordering items changes the result, the logic is too sensitive to input ordering. A simple shuffle-based property exposes this:

[Property]
public bool OrderDoesNotMatter(List<CartItem> items, string code)
{
    var engine = new DiscountEngine();
    var shuffled = items.OrderBy(_ => Guid.NewGuid()).ToList();

    var originalTotal = engine.CalculateTotal(items, code);
    var shuffledTotal = engine.CalculateTotal(shuffled, code);

    return originalTotal == shuffledTotal;
}

FsCheck will explore carts of all shapes—tiny, large, extreme prices, duplicate items, and combinations you wouldn’t think to test manually. If order matters for only certain permutations, the shrinker will reveal the minimal failing case.

4.2.3 Invariants: “Total Price can never be negative” and “Tax is always ≥ 0”.

Invariants reflect rules that must never be violated. In e-commerce, neither total nor tax should ever be negative:

[Property]
public bool TotalAndTaxAreNonNegative(List<CartItem> items, string code)
{
    var engine = new DiscountEngine();
    var total = engine.CalculateTotal(items, code);

    return total >= 0m;
}

If the system ever emits negative totals—due to extreme discounts, overflow, or bad generator data—this property will catch it quickly. Invariants act like safety rails around the domain.

4.2.4 Round-Tripping: Serialization testing (Object → JSON → Object).

Carts are often serialized when moving between services or layers—web APIs, message queues, caching layers, and database writes. If serialization breaks data shape or precision, subtle bugs occur. A round-trip property ensures consistency:

[Property]
public bool CartJsonRoundTrips(List<CartItem> items)
{
    var json = JsonSerializer.Serialize(items);
    var deserialized = JsonSerializer.Deserialize<List<CartItem>>(json);
    return items.SequenceEqual(deserialized);
}

This catches missing fields, incompatible JSON converters, and numeric precision issues that are easy to overlook in example tests.

4.3 Handling “Hard-to-Test” Edge Cases

Some areas of business logic are notorious for tricky edge cases. Time, dates, floating-point math, and ranges behave differently across cultures and time zones. Property tests help flush out behaviors that only appear under rare combinations of input.

4.3.1 Testing DateRanges and Overlaps (The Time Zone nightmare).

Date ranges hide complexity because time zones, DST transitions, and boundary rules all affect comparison logic. Consider a simple overlap rule:

public static bool Overlaps(DateRange a, DateRange b) =>
    a.Start < b.End && b.Start < a.End;

We can express a symmetry property:

[Property]
public bool OverlapIsSymmetric(DateRange a, DateRange b)
    => Overlaps(a, b) == Overlaps(b, a);

To make generators reasonable:

public static Gen<DateRange> DateRangeGen =>
    // Build a valid range constructively (see 3.4.2) instead of filtering
    // with `where end > start`, which would discard roughly half the values.
    from start in Arb.Generate<DateTime>()
    from minutes in Gen.Choose(1, 60 * 24 * 30) // 1 minute to ~30 days
    select new DateRange(start, start.AddMinutes(minutes));

This simple rule—symmetry—often uncovers off-by-one errors around inclusive/exclusive end dates and DST boundaries.

4.3.2 Floating Point arithmetic issues and using decimal for money.

Floating-point arithmetic is a common source of subtle bugs in financial systems. A property that compares decimal to double quickly demonstrates why the domain uses decimal:

[Property]
public bool DoubleMatchesDecimal(decimal price, int qty)
{
    // Deliberately falsifiable: the double path can drift from the exact
    // decimal path, and FsCheck will shrink to a minimal (price, qty) witness.
    decimal viaDouble = (decimal)((double)price * qty);
    decimal viaDecimal = price * qty;
    return viaDouble == viaDecimal;
}

Running this property yields a Falsifiable report with a concrete, shrunk counterexample: tangible evidence for code reviews that floating-point math must not slip back into money calculations. Even minor differences can cascade into large discrepancies in real systems.


5 Advanced Patterns: State Machine Testing (Model-Based Testing)

Property-based testing works well for pure functions, but many real systems hold internal state—caches, queues, aggregates, session objects, background processors, and anything with mutable data. Once state enters the picture, stateless properties are no longer enough. Testing a single call doesn’t tell you whether the system behaves correctly after a long sequence of operations. Model-Based Testing (MBT) fills this gap by comparing the system to a simple model that expresses the expected behavior over time.

5.1 The Limitations of Stateless Properties

Stateless properties assume that each test is independent. This works for pure functions like sorting or math utilities, but breaks down once operations depend on earlier ones. For example, calling Dequeue on a queue only makes sense if something was previously enqueued. Calling Remove on an empty cache creates edge cases that single-call tests won’t catch.

5.1.1 Why functional properties fail to catch temporal bugs (state corruption).

Temporal bugs only appear after sequences of interactions. These are the kinds of issues that often show up in production even when all unit tests pass:

  • A cache evicts the wrong entry after a specific pattern of puts and gets.
  • A resource pool leaks items after alternating acquire/release cycles.
  • A custom queue corrupts its internal index after a series of mixed enqueues and dequeues.

These problems aren’t about a single operation—they’re about the correctness of state transitions. MBT attacks this by expressing your system as a sequence of commands and checking whether the real system’s state matches a simplified, trustworthy model.
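A toy sketch makes the gap concrete. The Counter class below is invented for illustration; its bug only fires on the first Increment after a Reset, so any single-call property passes:

```csharp
public class Counter
{
    private int _value;
    private bool _justReset;

    public int Value => _value;

    public void Increment()
    {
        // Bug: the first increment after a reset adds 2 instead of 1.
        _value += _justReset ? 2 : 1;
        _justReset = false;
    }

    public void Reset()
    {
        _value = 0;
        _justReset = true;
    }
}
```

A property asserting "a fresh counter incremented once reads 1" passes forever. Only a generated sequence containing Reset followed by Increment exposes the broken transition, and that is exactly the kind of sequence model-based testing explores.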

5.2 Introduction to Model-Based Testing (MBT)

Model-based testing treats your system as something driven by commands:

  1. A model that defines the ideal behavior.
  2. The system under test (SUT) that has real complexity, optimizations, or side effects.

FsCheck generates random sequences of commands, runs each sequence against both the model and the SUT, and checks after every step that the two remain in equivalent states.

5.2.1 The Concept: Comparing a simple Model (e.g., a List) against a complex System Under Test (e.g., a database-backed Queue).

Suppose you built a custom queue that persists data to disk or a database. The implementation might be complicated, but the behavior should still match a simple list-based queue:

public class ModelQueue<T>
{
    private readonly List<T> _items = new();

    public void Enqueue(T item) => _items.Add(item);
    public T Dequeue() { var v = _items[0]; _items.RemoveAt(0); return v; }
    public int Count => _items.Count;
}

The SUT might involve locking, durability, or caching, but logically it should always behave like this minimal model. If the SUT ever diverges, the model catches it.
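The command examples in the next section run against a queue abstraction. A minimal sketch of that contract (IQueue<T> is an assumed name), with a trivial list-backed implementation standing in for the real, more complex SUT:

```csharp
using System.Collections.Generic;

// The contract the commands exercise.
public interface IQueue<T>
{
    void Enqueue(T item);
    T Dequeue();
    int Count { get; }
}

// Trivial in-memory implementation, standing in for a persistent or
// database-backed queue in the examples that follow.
public sealed class InMemoryQueue<T> : IQueue<T>
{
    private readonly Queue<T> _q = new();

    public void Enqueue(T item) => _q.Enqueue(item);
    public T Dequeue() => _q.Dequeue();
    public int Count => _q.Count;
}
```

Any implementation of this interface, however complex internally, should be indistinguishable from the model queue above.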

5.3 Implementing ICommand in FsCheck

Model-based testing in FsCheck represents actions as commands. (The exact types differ across FsCheck versions; the snippets in this section use a simplified command interface to illustrate the pattern rather than exact library signatures.) Each command describes:

  • When it can run (preconditions)
  • How it changes the model
  • How it changes the SUT
  • What must be true afterward

5.3.1 Defining Commands: Add, Remove, Update.

Here’s a simple AddCommand that enqueues a value:

public class AddCommand : ICommand<ModelQueue<int>, IQueue<int>>
{
    public int Value { get; }

    public AddCommand(int value) => Value = value;

    public bool Pre(ModelQueue<int> m) => true;

    public void RunActual(IQueue<int> sut) => sut.Enqueue(Value);
    public void RunModel(ModelQueue<int> m) => m.Enqueue(Value);

    public Property Post(ModelQueue<int> m, IQueue<int> sut)
        => (m.Count == sut.Count).ToProperty();
}

We can create a generator for commands just like we do for domain objects:

public static Gen<ICommand<ModelQueue<int>, IQueue<int>>> CommandGen =>
    Arb.Generate<int>().Select(v => (ICommand<ModelQueue<int>, IQueue<int>>)new AddCommand(v));

This allows FsCheck to produce long, varied sequences of actions that test the queue thoroughly.

5.3.2 Defining the NextState logic.

Some commands only make sense in certain states. For example, you can only dequeue from a queue that has items. Preconditions express these constraints:

public class DequeueCommand : ICommand<ModelQueue<int>, IQueue<int>>
{
    public bool Pre(ModelQueue<int> m) => m.Count > 0;

    public void RunActual(IQueue<int> sut) => sut.Dequeue();
    public void RunModel(ModelQueue<int> m) => m.Dequeue();

    public Property Post(ModelQueue<int> m, IQueue<int> sut)
        => (m.Count == sut.Count).ToProperty();
}

FsCheck uses Pre to avoid generating invalid operations. This keeps sequences realistic—mirroring how actual users or systems interact with the structure.

5.3.3 Executing random sequences of commands to find race conditions or state corruption.

With commands in place, FsCheck can run long sequences and check whether the model and the SUT diverge (CommandSequence here is schematic; wire the commands into your FsCheck version's model-based testing entry point):

[Property]
public Property QueueBehavesLikeModel()
{
    return new CommandSequence<ModelQueue<int>, IQueue<int>>(
        new ModelQueue<int>(),
        new PersistentQueue<int>())
        .ToProperty();
}

If there’s a bug deep in the sequence—perhaps after 50 operations—FsCheck shrinks the list of commands down to the shortest set that still breaks the property. This is a huge advantage compared to debugging large, hand-written integration tests.

5.4 Case Study: Verifying a Custom Thread-Safe Cache

Thread-safe components are notoriously difficult to test. Their correctness depends on how operations interleave, and deterministic unit tests rarely trigger the problematic patterns. Model-based testing gives you a structured way to explore these interactions.

5.4.1 Detecting concurrency issues by running command sequences in parallel (FsCheck experimental features).

Assume a simple thread-safe cache interface:

public interface IThreadSafeCache
{
    void Put(string key, string value);
    string? Get(string key);
    void Remove(string key);
}

The model is a basic dictionary:

public class CacheModel
{
    private readonly Dictionary<string, string> _store = new();
    public void Put(string k, string v) => _store[k] = v;
    public string? Get(string k) => _store.TryGetValue(k, out var v) ? v : null;
    public void Remove(string k) => _store.Remove(k);
}

Commands mirror puts, gets, and deletes—just like in the queue example. The interesting part is running them concurrently to expose race conditions:

[Property(MaxTest = 200)]
public Property CacheIsThreadSafe()
{
    // Sketch of FsCheck's experimental parallel-command API; exact
    // type and member names vary between FsCheck versions.
    return Command.Parallel(
        CommandGeneratorForCache(),
        initial: new CacheModel(),
        sut: new ConcurrentCache());
}

Parallel sequences often reveal problems like:

  • Lost updates
  • Stale reads
  • Incorrect removal ordering
  • Corrupted internal state

When a race condition appears, FsCheck shrinks the parallel sequence to the smallest combination of operations that still triggers the inconsistency—something extremely difficult to do manually.


6 Bridging the Gap: Combining Existing Data with Randomness

Property-based testing doesn’t replace existing test strategies—it strengthens them. Most teams already rely on integration tests, curated datasets, and fixtures that reflect real production scenarios. These tests are valuable, but they often cover only the cases someone thought about. PBT fills the gaps by exploring variations those datasets never include. This section looks at how to combine randomness with deterministic test suites, reproduce failures reliably in CI/CD, and keep performance manageable.

6.1 Seeds and Determinism in CI/CD

Randomness is powerful, but it also introduces uncertainty. When a test fails once and then never again, it becomes hard to trust the suite. Deterministic replay removes the guesswork by giving you everything needed to reproduce the exact failure locally.

6.1.1 The “Flaky Test” dilemma: Handling random failures in the pipeline.

A flaky property test usually happens when the underlying bug appears only for a narrow set of inputs. FsCheck might generate that input once, uncover the problem, shrink it, and fail the test. But on the next run, the generator might not produce anything close to that value again. From CI’s perspective, this looks like a random, unreproducible error.

Seeds solve this. FsCheck prints the exact seed that produced the failure, along with the test parameters. That seed fully describes the path the generator followed. Once you have it, the “random” failure becomes deterministically repeatable. This turns a flaky test into a reproducible one.

6.1.2 Logging the Replay seed in xUnit output.

When a property fails, FsCheck emits a line like:

REPLAY: 56, 123456789

You can plug this directly back into a property:

[Property(Replay = "56,123456789")]
public bool Example(int x) => x + 1 > x;

Most CI systems automatically capture test output, so the replay seed ends up in logs or build artifacts. Some teams even parse these logs and surface the seed clearly to make debugging easier. The seed is essentially a snapshot of the random generator’s internal state.
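As a sketch of that log-parsing idea, a small helper can lift the seed out of captured output. The helper below is hypothetical and matches the illustrative "REPLAY:" line shown above; adjust the pattern to whatever your FsCheck version actually prints.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaySeedExtractor
{
    // Matches the illustrative "REPLAY: seed, gamma" line shown above.
    // Adjust the pattern to the exact text your FsCheck version emits.
    private static readonly Regex ReplayPattern =
        new(@"REPLAY:\s*(\d+)\s*,\s*(\d+)", RegexOptions.Compiled);

    // Returns a "seed,gamma" string ready to paste into
    // [Property(Replay = "...")], or null if no seed was found.
    public static string? TryExtract(string logText)
    {
        var m = ReplayPattern.Match(logText);
        return m.Success ? $"{m.Groups[1].Value},{m.Groups[2].Value}" : null;
    }
}
```

Feeding it the captured line from above yields "56,123456789", which drops straight into the Replay attribute.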

6.1.3 Strategies for reproducing a CI failure locally.

To replay a failure reliably, your local environment needs to resemble CI as closely as possible:

  • Use the same .NET runtime version.
  • Match locale or time zone settings if the domain involves dates or formatting.
  • Run tests from the command line to ensure consistent configuration.

Once aligned, drop the seed into the failing property:

[Property(Replay = "90,99887766", MaxTest = 1)]
public void ReplayFailure(MyDomainType x) => MyFunction(x);

Setting MaxTest = 1 ensures you’re reproducing exactly the scenario that failed in CI, without running the entire property again.

6.2 Hybrid Testing Approaches

Property-based testing works best when combined with example-driven tests. They complement each other: PBT provides broad exploration, while example tests handle known business scenarios and strict integration boundaries.

6.2.1 Using PBT to generate test data for classic integration tests.

Integration tests often rely on hand-crafted objects, which takes effort and risks missing edge cases. Instead, FsCheck can generate realistic domain models as input to integration tests:

public class IntegrationTests
{
    [Property]
    public void SaveAndLoadOrder_UsesRealisticData(Order order)
    {
        var repo = new SqlOrderRepository(ConnectionString);
        repo.Save(order);
        var loaded = repo.Load(order.Id);

        Assert.Equal(order, loaded);
    }
}

Here, PBT supplies rich input combinations, and the integration test ensures the persistence layer behaves correctly. This approach often exposes mapping issues, serialization mismatches, and database constraint violations that wouldn’t appear with a handful of manually written test cases.
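For this to work, FsCheck must know how to construct a valid Order. Reflection-based generation covers simple records, but constrained domain types usually need a registered generator. A sketch in FsCheck 2.x style, where the Order shape is hypothetical:

```csharp
using System;
using FsCheck;

// Hypothetical shape; substitute your real domain type.
public record Order(Guid Id, decimal Total);

public static class OrderArbitrary
{
    // Construct valid Orders directly (non-negative totals with two
    // decimal places) rather than filtering random decimals.
    public static Arbitrary<Order> Orders() =>
        Arb.From(
            from id in Arb.Generate<Guid>()
            from cents in Gen.Choose(0, 1_000_000)
            select new Order(id, cents / 100m));
}

// Opt in for a test class via FsCheck.Xunit:
// [Properties(Arbitrary = new[] { typeof(OrderArbitrary) })]
```

With the arbitrary registered, the SaveAndLoadOrder property above receives only orders the domain considers valid.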

6.2.2 “Fuzzing” existing API endpoints using PBT generators.

API endpoints are ideal candidates for randomized testing because they rely on deserialization, validation, and middleware. With generated domain objects, you can effectively “fuzz” the API:

[Property(MaxTest = 50)]
public async Task ApiHandlesRandomCarts(Cart cart)
{
    var json = JsonSerializer.Serialize(cart);
    var response = await _client.PostAsync("/cart/calculate",
        new StringContent(json, Encoding.UTF8, "application/json"));

    Assert.True(response.IsSuccessStatusCode);
}

This finds issues like:

  • Payloads the model binder can’t parse
  • Missing validation rules
  • Serialization mismatches
  • Controller logic that assumes certain fields always exist

It’s a simple but powerful way to harden API boundaries without building a separate fuzzer.

6.3 Performance Considerations

Randomized tests introduce overhead. The more complex the generators and the more work each property performs, the slower the test suite becomes. Good tuning keeps PBT practical at scale.

6.3.1 Managing test execution time (tuning MaxNbOfTest).

FsCheck’s default of 100 tests per property is a solid baseline, but it’s not appropriate for every test. (The configuration property is named MaxNbOfTest; the FsCheck.Xunit attribute exposes it as MaxTest.) If a property performs heavy computations or interacts with I/O, reducing the number of runs keeps suites fast:

[Property(MaxTest = 30)]
public bool ExpensiveProperty(MyAggregate agg) => Validate(agg);

You can also apply higher test counts only to the properties that guard critical pieces of logic—like price calculations or discount invariants—and leave others at lower counts.

6.3.2 Profiling generator performance.

Generators can be a hidden source of slowdown, especially when they depend heavily on filtering or build deep recursive structures. Profiling them directly is straightforward:

var sw = Stopwatch.StartNew();
// Sample(size, count): generate 1,000 values at generation size 50.
var sample = Arb.Generate<MyType>().Sample(50, 1000);
sw.Stop();
Console.WriteLine($"Generated 1000 samples in {sw.ElapsedMilliseconds}ms");

If generation is slow, the usual causes are:

  • Overusing Where filters that discard many values
  • Building unnecessarily large object graphs
  • Excessive recursion without size control

Rewriting generators to construct valid data directly almost always improves performance.
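As an illustration of that rewrite, here is a filter-heavy date-range generator next to a direct-construction version. Both are hypothetical sketches relying on FsCheck 2.x's C# LINQ support for Gen:

```csharp
using System;
using FsCheck;

public static class DateRangeGenerators
{
    // Slow: picks two unrelated dates and discards every pair
    // in the wrong order, wasting roughly half the draws.
    public static Gen<(DateTime Start, DateTime End)> Filtered() =>
        from a in Arb.Generate<DateTime>()
        from b in Arb.Generate<DateTime>()
        where a <= b
        select (a, b);

    // Fast: builds a valid range directly from a bounded start date
    // plus a non-negative offset, so nothing is ever discarded.
    public static Gen<(DateTime Start, DateTime End)> Direct() =>
        from startOffset in Gen.Choose(0, 60_000)
        from span in Gen.Choose(0, 3_650)
        let start = new DateTime(2000, 1, 1).AddDays(startOffset)
        select (start, start.AddDays(span));
}
```

The direct version also shrinks better: the offset shrinks toward zero, which naturally produces minimal ranges.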


7 Architectural Implications: Designing for Testability

Property-based testing gradually shapes the way teams design APIs and domain models. Once you start expressing behavior through invariants and properties, it becomes clear which parts of the architecture are easy to test and which parts resist predictable behavior. Systems that adopt PBT tend to evolve toward clearer boundaries, better-defined types, and simpler state management. This section looks at what changes naturally when PBT becomes part of a team’s development workflow.

7.1 How PBT influences API Design

PBT pushes APIs toward designs that leave less room for ambiguity. When random inputs flow through an API, any hidden assumptions show up quickly—null checks, incomplete validation, optional fields that “should never be null,” or objects in partially valid states. Properties fail when APIs allow too many undefined states, so teams naturally move toward more precise domain modeling.

7.1.1 Pushing towards “Make Illegal States Unrepresentable.”

One of the strongest architectural effects of PBT is the push toward types that encode the domain rules directly. The more your types prevent invalid data, the easier it becomes to write meaningful properties. For example, instead of treating email addresses as arbitrary strings, you can enforce validation at construction time:

public record EmailAddress
{
    public string Value { get; }

    private EmailAddress(string value) => Value = value;

    public static EmailAddress Create(string input)
    {
        if (string.IsNullOrEmpty(input) || !input.Contains("@"))
            throw new ArgumentException("Invalid email");
        return new EmailAddress(input);
    }
}

With this approach, FsCheck doesn’t need to spend time generating invalid values or filtering them out. Properties become sharper because they only test behavior the domain actually allows. Failures then reflect real problems, not noise caused by unrealistic inputs.
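A generator can then mirror the type by producing only valid values. A sketch, where the local parts and domains are made up for illustration:

```csharp
using FsCheck;

public static class EmailArbitrary
{
    // Builds structurally valid addresses directly, so properties spend
    // every run on inputs the domain actually allows.
    public static Arbitrary<EmailAddress> Emails() =>
        Arb.From(
            from local in Gen.Elements("alice", "bob", "carol")
            from n in Gen.Choose(0, 9_999)
            from domain in Gen.Elements("example.com", "test.org")
            select EmailAddress.Create($"{local}{n}@{domain}"));
}
```

Because every generated value goes through EmailAddress.Create, the generator and the type can never drift apart: if the validation rules tighten, the generator fails loudly instead of silently producing stale data.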

7.1.2 The correlation between pure functions and easy property testing.

Pure functions—those without side effects—are the easiest parts of a system to verify with properties. When behavior depends only on the inputs, you get repeatable, deterministic outcomes. Debugging shrinks becomes straightforward because nothing external influences the result.

As teams adopt PBT, they often start isolating the pure logic from the I/O layers:

public decimal CalculateDiscount(decimal subtotal, DiscountPolicy policy)
{
    return subtotal * (1 - policy.Rate);
}

This separation aligns naturally with Clean Architecture or domain-driven design. Business rules live in pure functions, while state access happens at the edges. The result is both better architecture and significantly easier testability.
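A property over such a pure function is short to state and cheap to run. In this sketch, DiscountPolicy is assumed to be a simple record whose Rate lies in [0, 1]:

```csharp
using System;
using FsCheck;
using FsCheck.Xunit;

public record DiscountPolicy(decimal Rate); // assumed shape

public class DiscountProperties
{
    private static decimal CalculateDiscount(decimal subtotal, DiscountPolicy policy)
        => subtotal * (1 - policy.Rate);

    [Property]
    public bool DiscountNeverIncreasesPrice(PositiveInt cents, int ratePercent)
    {
        // Map arbitrary ints into the domain: a positive subtotal
        // and a rate clamped into [0, 1].
        var subtotal = cents.Get / 100m;
        var policy = new DiscountPolicy(Math.Abs(ratePercent % 101) / 100m);

        var result = CalculateDiscount(subtotal, policy);
        return result >= 0 && result <= subtotal;
    }
}
```

No mocks, no setup, no teardown: purity is what makes the property this small.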

7.2 Dependency Injection and Mocking in PBT

Traditional mocks are tuned for example-based tests where specific call sequences matter. In property-based testing, you don’t always know what values FsCheck will generate or how many times a method will run. Strict mocks often fail for reasons unrelated to the actual behavior you care about.

7.2.1 Why Mocking frameworks (Moq/NSubstitute) struggle with PBT.

Mocks tend to encode expectations about the exact number of calls or specific inputs. Under PBT’s randomized execution, these assumptions become brittle. A single unnecessary interaction verification can cause otherwise correct tests to fail:

mock.Verify(x => x.Log(It.IsAny<string>()), Times.Once);

If random input takes your code down a different path, this verification breaks—even if the system behaves correctly. This forces developers to loosen or remove expectations, which often reveals that mocks were testing implementation details rather than actual behavior.

PBT works best when collaborators behave predictably and don’t constrain the shape of the randomized inputs.

7.2.2 Using Fakes and Stubs with generated data.

Fakes are a better fit for PBT because they store data in memory and follow simple, consistent rules. They scale naturally across arbitrary generated inputs:

public class InMemoryUserRepo : IUserRepo
{
    private readonly Dictionary<Guid, User> _users = new();

    public void Save(User u) => _users[u.Id] = u;
    public User Load(Guid id) => _users[id];
}

This kind of fake adapts to any user object FsCheck generates and doesn’t enforce brittle interaction expectations. Properties stay focused on business behavior, not mock setup.
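With such a fake in place, a round-trip property stays small. A sketch, assuming User is a record with value equality and a Guid Id:

```csharp
using FsCheck.Xunit;
using Xunit;

public class UserRepoProperties
{
    // Round-trip law: any user that is saved can be loaded back unchanged.
    [Property]
    public void SaveThenLoadReturnsSameUser(User user)
    {
        var repo = new InMemoryUserRepo();
        repo.Save(user);

        Assert.Equal(user, repo.Load(user.Id));
    }
}
```

The fake never cares which users FsCheck generates or how many times Save runs, so the property verifies behavior rather than interactions.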

7.3 Refactoring Legacy Code using PBT

Legacy systems often mix domain rules with infrastructure concerns, and the behavior of the system may not be fully documented. Refactoring these systems safely requires confidence that the new version behaves exactly like the old one. Property-based testing provides a concise and thorough way to guarantee this.

7.3.1 The “Golden Master” technique: Using PBT to ensure refactored code matches legacy behavior exactly (Old(x) == New(x)).

In a Golden Master setup, both the old and new implementations run against the same randomized inputs. If outputs ever differ, the property breaks:

[Property]
public bool NewEngineMatchesOldEngine(InputModel input)
{
    var oldResult = OldEngine.Calculate(input);
    var newResult = NewEngine.Calculate(input);

    return oldResult == newResult;
}

Because FsCheck generates a wide range of inputs—including edge cases developers rarely think about—this technique provides strong guarantees during refactoring. Any behavioral drift becomes immediately visible, and shrinking highlights the minimal input that exposes the difference. It’s a practical safety net for large-scale rewrites.


8 Conclusion and Strategic Adoption

Property-based testing strengthens a system’s reliability by exploring a wider range of inputs, validating core invariants, and revealing edge cases that example-based tests rarely uncover. Teams that adopt PBT often notice fewer production issues and greater confidence when modifying critical domain logic. Once the mindset shifts from checking specific examples to enforcing general laws, testing becomes both more systematic and more meaningful.

8.1 Summary of Key Concepts

This article walked through the transition from example-driven tests to property-driven thinking. We examined what properties represent, how generators and shrinkers work, and how FsCheck integrates naturally into .NET and xUnit. We also applied PBT to realistic scenarios—discount engines, serialization rules, date ranges, queues, and thread-safe caches—showing how properties reveal assumptions that are easy to miss in traditional test suites. Together, these concepts give .NET teams a practical way to express expected behavior and verify it across a broad input space.

8.2 A Roadmap for Team Adoption

Like any new testing approach, PBT is easier to adopt gradually. Teams that start small and expand over time build stronger generators, clearer invariants, and more maintainable properties. A phased adoption strategy provides the smoothest path.

8.2.1 Phase 1: Use PBT for Utility classes and Mappers.

Utility functions, data mappers, and pure helpers are perfect entry points. They often follow predictable mathematical rules—idempotence, reversibility, ordering guarantees, or round-trip conversions. These properties are simple to articulate and immediately highlight the strengths of PBT without requiring complex generators.
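Two properties of that shape, sketched with .NET built-ins standing in for project-specific utilities:

```csharp
using System;
using System.Linq;
using FsCheck.Xunit;

public class UtilityProperties
{
    // Idempotence: applying a normalizer twice equals applying it once.
    [Property]
    public bool TrimIsIdempotent(string? s)
    {
        var once = (s ?? "").Trim();
        return once.Trim() == once;
    }

    // Round-trip: encoding followed by decoding restores the original.
    [Property]
    public bool Base64RoundTrips(byte[] bytes)
    {
        var input = bytes ?? Array.Empty<byte>();
        var encoded = Convert.ToBase64String(input);
        return Convert.FromBase64String(encoded).SequenceEqual(input);
    }
}
```

Both laws need no custom generators at all, which is exactly why utilities make a low-friction first phase.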

8.2.2 Phase 2: Domain logic and Business Rules.

Once the team feels comfortable, apply PBT to richer domain logic—discount engines, pricing rules, eligibility checks, aggregation rules, and any logic that has well-defined invariants. This is where PBT delivers significant value. FsCheck explores conditions that hand-written tests rarely consider, uncovering assumptions hidden deep in the business rules.

8.2.3 Phase 3: Stateful/Model-Based testing for core infrastructure.

The final step is applying model-based testing to components with state: caches, queues, background workers, database-backed aggregates, and concurrency-sensitive code. These systems produce complex behavior over time, and MBT is uniquely positioned to validate that behavior. By comparing a real implementation to a simple model, FsCheck can detect state corruption, incorrect transitions, and rare race conditions that example-based tests never trigger.

8.3 When Not to use Property-Based Testing

PBT is powerful, but it isn’t a universal solution. Some domains simply don’t lend themselves to property-driven specifications, either because meaningful invariants don’t exist or because outputs can’t be described precisely enough.

8.3.1 UI Testing, simple CRUD with no logic, and areas where “Oracles” are impossible to define.

User interfaces often rely on visual or subjective criteria that don’t translate well into universal properties. Similarly, simple CRUD operations with no domain logic provide few useful invariants—writing properties for them adds little value. In other cases, you may not have a clear oracle: if you can’t express what must always be true, property-based testing can’t provide meaningful guidance. In those areas, example-based tests and integration tests remain the appropriate tools.

8.4 Final Thoughts on Software Quality Assurance.

Property-based testing doesn’t replace traditional tests—it enhances them. Example tests are still the best way to demonstrate specific expected scenarios, while PBT ensures those scenarios generalize across unpredictable variations. Over time, properties act as executable documentation for core domain rules, giving teams a safety net as systems evolve. The habit of thinking in invariants leads to cleaner code, clearer APIs, and architectural designs that express the domain more faithfully. When combined with the tooling in FsCheck and the xUnit ecosystem, PBT becomes a practical, dependable approach to improving software quality and long-term maintainability.
