Rust Benchmarking with Criterion.rs

Benchmarking is an essential practice when it comes to optimizing code, especially in systems programming languages like Rust, where performance is paramount. Rust's focus on memory safety and zero-cost abstractions gives developers the power to write incredibly efficient programs.

However, ensuring that your code performs as expected, or improving its performance, requires a reliable way to measure it. This is where Criterion.rs comes in. Criterion.rs is a powerful, statistically driven benchmarking library that allows you to detect even the smallest performance changes in your code.

In this blog post, we'll explore how to use Criterion.rs to write and analyze benchmarks, helping you write faster and more efficient Rust code.

What is Benchmarking?

Benchmarking in programming refers to testing and measuring how efficiently a piece of code or program performs under specific conditions. For example, you might want to measure how long it takes for a function to sort a large dataset, or how quickly a web server responds to requests under heavy traffic.

By running these tests, you can get concrete data on execution speed and resource usage. A common practice is to run these benchmarks multiple times, as results may vary between runs due to factors like system load. Benchmarking is essential for identifying bottlenecks, optimizing algorithms, and ensuring that your code performs well in real-world scenarios.
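Before reaching for a dedicated library, it helps to see what naive timing looks like. The sketch below (a hypothetical helper, not part of the Criterion setup shown later) times a sort with `std::time::Instant`; a single measurement is noisy, which is exactly the problem Criterion.rs solves with statistical sampling:

```rust
use std::time::{Duration, Instant};

// Time one sort of a clone of `data` and return the elapsed duration.
fn time_sort(data: &[i32]) -> Duration {
    let mut v = data.to_vec();
    let start = Instant::now();
    v.sort();
    start.elapsed()
}

fn main() {
    let data: Vec<i32> = (0..100_000).rev().collect();

    // A single measurement is noisy and varies between runs...
    println!("single run: {:?}", time_sort(&data));

    // ...so averaging several runs gives a steadier (but still crude) number.
    let runs: u32 = 10;
    let total: Duration = (0..runs).map(|_| time_sort(&data)).sum();
    println!("average over {runs} runs: {:?}", total / runs);
}
```

Criterion.rs automates this idea properly: it warms up the code, runs many samples, and applies statistical analysis to the results.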

Learn Rust by Practice

Master Rust through hands-on coding exercises and real-world examples.

Why Should You Benchmark Your Code?

Benchmarking is important because it gives you insights into the performance of your code and helps prevent inefficiencies. For instance, if you're developing a game, you might need to know how fast your physics engine can calculate collisions as the number of objects increases. Or, when building a database, you'd want to benchmark how quickly it processes queries with different datasets.

Benchmarking lets you compare different approaches, such as evaluating if a new algorithm improves search performance or if switching libraries reduces memory usage. Tools like Criterion.rs and others allow you to systematically run these tests, helping you make informed decisions and write faster, more efficient code.

Getting Started with Criterion.rs

Let's dive into how you can use Criterion.rs to benchmark your Rust code. We'll walk through setting up Criterion.rs, writing benchmarks for two sorting implementations, comparing them, and analyzing the results.

You can see the full code in the GitHub repository.

Objective

In this guide, we'll compare the performance of two sorting algorithms: a custom Merge Sort and Rust's built-in sort function. This will give you a clear understanding of how to use Criterion.rs to compare the performance of different implementations.

Create a New Rust Project

First, create a new Rust project by running the following command:

cargo new criterion-test

This will generate a basic Rust project with the following structure:

criterion-test/
├── Cargo.toml
└── src
    └── main.rs

Add Criterion.rs as a Development Dependency

Add Criterion.rs as a dev dependency to your project by running:

cargo add --dev criterion -F html_reports

The feature html_reports allows Criterion to generate HTML reports for visualizing benchmark results.

Update your Cargo.toml file to include the benchmark configuration:

[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
 
[[bench]]
name = "sort_benchmarks"
harness = false

This configures a custom benchmark called sort_benchmarks that will run without the default test harness.

Create Your Benchmark

Benchmarks in Rust are typically placed inside the benches directory, and the file name must match the benchmark name you defined in Cargo.toml. Create this directory and add a new benchmark file:

mkdir benches
touch benches/sort_benchmarks.rs

Now, define your benchmark in benches/sort_benchmarks.rs. In this example, we will compare the performance of a custom Merge Sort implementation with Rust's built-in sort function:

use criterion::{criterion_group, criterion_main, Criterion};
use std::hint::black_box;
 
// Merge Sort implementation
fn merge_sort(mut arr: Vec<i32>) -> Vec<i32> {
    if arr.len() <= 1 {
        return arr;
    }
    let mid = arr.len() / 2;
    let left = merge_sort(arr[..mid].to_vec());
    let right = merge_sort(arr[mid..].to_vec());
    merge(&left, &right, &mut arr);
    arr
}
 
fn merge(left: &[i32], right: &[i32], arr: &mut Vec<i32>) {
    let mut left_index = 0;
    let mut right_index = 0;
    let mut index = 0;
 
    while left_index < left.len() && right_index < right.len() {
        if left[left_index] < right[right_index] {
            arr[index] = left[left_index];
            left_index += 1;
        } else {
            arr[index] = right[right_index];
            right_index += 1;
        }
        index += 1;
    }
 
    while left_index < left.len() {
        arr[index] = left[left_index];
        left_index += 1;
        index += 1;
    }
 
    while right_index < right.len() {
        arr[index] = right[right_index];
        right_index += 1;
        index += 1;
    }
}
 
fn criterion_benchmark(c: &mut Criterion) {
    let data: Vec<i32> = (0..1000).rev().collect();  // Reversed array of 1000 elements
 
    // Benchmark Merge Sort
    c.bench_function("merge_sort", |b| {
        b.iter(|| merge_sort(black_box(data.clone())))
    });
 
    // Benchmark Rust's built-in sort
    c.bench_function("built_in_sort", |b| {
        b.iter(|| {
            let mut data_clone = data.clone();
            data_clone.sort();
            black_box(data_clone); // keep the sorted vector observable so the sort isn't optimized away
        })
    });
}
 
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
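Before trusting benchmark numbers, it's worth verifying that the custom implementation is actually correct. One way is a quick assertion against the standard library's sort; the sketch below uses a compact merge sort equivalent to the one above so it stands alone:

```rust
// Compact recursive merge sort, equivalent in behavior to the benchmark version.
fn merge_sort(arr: Vec<i32>) -> Vec<i32> {
    if arr.len() <= 1 {
        return arr;
    }
    let mid = arr.len() / 2;
    let left = merge_sort(arr[..mid].to_vec());
    let right = merge_sort(arr[mid..].to_vec());

    // Merge the two sorted halves into a new vector.
    let mut merged = Vec::with_capacity(arr.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            merged.push(left[i]);
            i += 1;
        } else {
            merged.push(right[j]);
            j += 1;
        }
    }
    merged.extend_from_slice(&left[i..]);
    merged.extend_from_slice(&right[j..]);
    merged
}

fn main() {
    let data: Vec<i32> = (0..1000).rev().collect();
    let mut expected = data.clone();
    expected.sort();
    assert_eq!(merge_sort(data), expected);
    println!("merge_sort matches the standard library sort");
}
```

A benchmark of an incorrect function is worse than useless, so a check like this (or a proper unit test) is a good habit before running cargo bench.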

Run the Benchmark

To run your benchmark, use the following command:

cargo bench

This will execute the benchmark and provide performance data comparing the two sorting algorithms. If you have gnuplot installed, Criterion.rs will also generate detailed HTML reports for better visualization and analysis of the results.

The results will help you understand how a custom Merge Sort compares to Rust's built-in sort in terms of performance. The HTML reports offer an easy way to visualize trends and spot any performance differences.

Benchmark logs

HTML Reports with Criterion.rs

Criterion.rs provides the ability to generate an HTML report that visualizes the results of your benchmarks. Once you run your benchmarks, the report can be found under target/criterion/report/index.html. By default, Criterion.rs uses gnuplot to create these charts, but if gnuplot isn't available, it falls back on the plotters crate. Both options generate clear and informative graphs, allowing you to visualize how your code performs.

These reports offer a detailed view of the benchmark data, including execution time, statistical analysis, and visual charts that highlight trends and outliers. This makes it easy to spot performance issues or improvements in your code.

Installing Gnuplot on Ubuntu (Optional)

Criterion.rs produces its most detailed charts with gnuplot; if it isn't installed, Criterion.rs falls back to the plotters crate. To install gnuplot on Ubuntu:

# Update Package Lists
sudo apt update
 
# Install Gnuplot
sudo apt install gnuplot
 
# Verify the Installation
gnuplot --version

Here are the generated HTML reports for the two sorting algorithms:

Built-in Sort Report
Merge Sort Report

Conclusion of the Benchmark Reports

The reports generated for the built-in sort and the merge sort provide clear insights into the performance of both algorithms. Here are the key takeaways:

  1. Built-in Sort:

    • The built-in sort has an average time of 694 nanoseconds (ns), which is significantly faster compared to merge sort.
    • It displays consistent performance, with a low median and standard deviation, suggesting that the algorithm performs reliably across multiple iterations.
    • The graph of total sample time vs. iterations shows a steady linear increase, indicating a stable, predictable cost per iteration.
  2. Merge Sort:

    • The merge sort is considerably slower, with an average time of 36.1 microseconds (µs), roughly 50 times slower than the built-in sort.
    • Despite the higher time, merge sort shows a high degree of consistency in its performance, with a low standard deviation relative to its total execution time.
    • The linear regression graph shows a similar steady performance as the built-in sort, though with a steeper slope due to the higher computational cost of the merge sort algorithm.

Overall Analysis

  • Performance: The built-in sort is vastly more efficient in terms of speed compared to the custom merge sort implementation. The faster average time (694 ns vs. 36.1 µs) makes it a clear winner in performance.
  • Scalability: Both benchmarks show total sample time growing linearly with the number of iterations, indicating a stable per-iteration cost (note that this reflects measurement consistency, not how the algorithms scale with input size).
  • Consistency: While both sorts are consistent in their performance, the built-in sort shows a tighter range of execution times, with lower variance.

In conclusion, for most use cases, the built-in sort function is a much better choice in terms of both performance and reliability. However, the merge sort might still be relevant in specific cases where its divide-and-conquer approach is necessary for custom sorting scenarios.

Benchmark Conclusion

The Criterion.rs benchmark reports provide a detailed statistical analysis of both sorting algorithms, allowing us to easily compare their performance. The built-in sort function is far more efficient than the custom merge sort implementation, as evidenced by the much lower average execution time and more consistent results. The ability to visualize these benchmarks with graphs and detailed statistics helps developers make informed decisions about performance optimizations and track changes over time.

Additional Features of Criterion.rs

In addition to basic benchmarking, Criterion.rs offers a range of advanced features for deeper performance analysis and flexibility. Here are some of the key tools and capabilities it provides:

  • Command-Line Output
  • HTML Reports
  • Plots & Graphs
  • Benchmarking with Inputs
  • Comparing Functions
  • Timing Loops
  • Custom Measurements
  • Profiling

These features make Criterion.rs an incredibly versatile tool for performance benchmarking, catering to a variety of use cases and customization needs. You can read more about these features in the official Criterion.rs documentation.
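As an illustration of "Benchmarking with Inputs" and "Comparing Functions" together, a benchmark group can run the same workload at several input sizes so the HTML report plots them on one chart. Below is a hedged sketch building on the sort example above, using Criterion's BenchmarkGroup and BenchmarkId API; the group name "sort_by_size" and the chosen sizes are arbitrary:

```rust
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use std::hint::black_box;

fn bench_sorts(c: &mut Criterion) {
    // One group holds all measurements so the report plots them together.
    let mut group = c.benchmark_group("sort_by_size");
    for size in [100, 1_000, 10_000] {
        let data: Vec<i32> = (0..size).rev().collect();
        // Benchmark the built-in sort at each input size.
        group.bench_with_input(BenchmarkId::new("built_in_sort", size), &data, |b, d| {
            b.iter(|| {
                let mut v = d.clone();
                v.sort();
                black_box(v);
            })
        });
    }
    group.finish();
}

criterion_group!(benches, bench_sorts);
criterion_main!(benches);
```

Running cargo bench with a file like this produces one report section per input size, making it easy to see how execution time grows as the data gets larger.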

Conclusion

Benchmarking is an essential practice for optimizing code and ensuring high performance, and Criterion.rs offers an excellent solution for Rust developers. With its rich set of features, from basic benchmarking to advanced statistical analysis, it provides detailed insights into how your code performs. In this blog, we explored how to set up Criterion.rs, compared the performance of merge sort and Rust's built-in sort, and examined the results using the HTML reports generated by Criterion.rs.

As we've seen, the built-in sort function significantly outperforms the custom merge sort implementation, and the detailed reports generated by Criterion.rs provide valuable data on execution times, consistency, and scalability. With additional features like command-line options, benchmarking asynchronous functions, and custom measurements, Criterion.rs is a versatile tool that can handle a wide range of benchmarking needs.

If you found this blog helpful, you might also enjoy exploring our guide to Rust testing libraries to improve your testing strategies. And don't forget to sharpen your skills by practicing Rust challenges directly in your browser with Rustfinity's interactive challenges.
