Optimizing C++ Word Count

Yesterday’s post showed that the original C++ version is around 13 times slower than my non-optimized Rust version. Next, I’ll show how to get easily equivalent performance to Rust.

One problem that I immediately made out when first seeing the C++ version is that it uses std::regex. std::regex is known to be horribly slow 1.

Just replacing the checking of the file name extension with the in-built e.path().extension() != ".txt" speeds up from 373 ms to 52ms!

Before:

const std::regex fname_regex("\\.txt$", std::regex_constants::icase);
// ...
if (!std::regex_search(e.path().c_str(), fname_regex)) {
    continue;
}

After:

if (e.path().extension() != ".txt") {
    continue;
}

Next, I set out to replace the splitting into words with std::regex. I decided to use CTRE because I have never had a chance to use it before. CTRE stands for Compile-Time Regular Expressions, and means that the matching pattern is compiled into a regex matcher.

Unfortunately, I could not get Conan to work with it2. So, I copied the single header file (apologies!).

The result looks like this:

static constexpr auto word_regex = ctll::fixed_string{"([a-zA-Z]{2,})"};
// ...
for (auto match : ctre::range<word_regex>(line)) {
   auto word = match.get<0>();
   ++word_counts[std::string(word)];
}

Numbers

The final performance numbers including Go 3 show C++, Rust and Go fairly close to each other. Please run your own benchmarks (use hyperfine) on your relevant workload. I suspect that Go’s runtime takes a few milliseconds to start.

Needless to say, I made sure that all comparisons were done on the same directory and with release builds.

On the C++ side we went from 373 ms to 52ms to 23ms, a 15x improvement.4

Command Mean [ms] Min [ms] Max [ms] Relative
cpp/word-count 23.0 ± 1.9 20.3 32.7 1.00
rust/words 26.9 ± 1.6 23.6 31.2 1.17 ± 0.12
go/wordcount 43.5 ± 3.3 39.2 55.6 1.89 ± 0.21
swift/word-count 84.3 ± 5.0 79.4 103.5 3.67 ± 0.37

Update

After publishing this, I saw this post by Jussi Pakkanen, where they make the same changes to the extension checking.

And I realize that the point of their exercise was to only use the C++ Standard Library. That necessarily precludes the usage of CTRE.


  1. I recommend watching Hana Dusíková's talks on her CTRE library. Brilliant! ↩︎

  2. This is the third try I gave Conan and there’s always something not quite working. In this case, Conan was expecting clang 9 but I had clang 12. Clang 12 isn’t known to Conan. Then, CTRE was was expecting C++17 and even setting cppstd=17 did not help. ↩︎

  3. And when running the Go version for the first time, I get a runtime error. Irony. panic: runtime error: slice bounds out of range [:10] with capacity 0. Running it in a directory with txt-files prevents the error. ↩︎

  4. For the curious, I generated the table with the following command: hyperfine --warmup=2 /full/path/to/rust /full/path/to/swift ... --export-markdown version1.md. Providing warmup=2 could be controversial. All of it was on macOS 10.15. ↩︎