Optimizing C++ Word Count
Yesterday’s post showed that the original C++ version is around 13 times slower than my non-optimized Rust version. Next, I’ll show how to get easily equivalent performance to Rust.
One problem that I immediately made out when first seeing the C++ version is
that it uses std::regex
. std::regex
is known to be horribly slow 1.
Just replacing the checking of the file name extension with the in-built
e.path().extension() != ".txt"
speeds up from 373 ms to 52ms!
Before:
const std::regex fname_regex("\\.txt$", std::regex_constants::icase);
// ...
if (!std::regex_search(e.path().c_str(), fname_regex)) {
continue;
}
After:
if (e.path().extension() != ".txt") {
continue;
}
Next, I set out to replace the splitting into words with std::regex
. I decided
to use CTRE because I have never had a chance to use it before. CTRE stands for
Compile-Time Regular Expressions, and means that the matching pattern is
compiled into a regex matcher.
Unfortunately, I could not get Conan to work with it2. So, I copied the single header file (apologies!).
The result looks like this:
static constexpr auto word_regex = ctll::fixed_string{"([a-zA-Z]{2,})"};
// ...
for (auto match : ctre::range<word_regex>(line)) {
auto word = match.get<0>();
++word_counts[std::string(word)];
}
Numbers
The final performance numbers including Go 3 show C++, Rust and Go fairly close to each other. Please run your own benchmarks (use hyperfine) on your relevant workload. I suspect that Go’s runtime takes a few milliseconds to start.
Needless to say, I made sure that all comparisons were done on the same directory and with release builds.
On the C++ side we went from 373 ms to 52ms to 23ms, a 15x improvement.4
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
cpp/word-count |
23.0 ± 1.9 | 20.3 | 32.7 | 1.00 |
rust/words |
26.9 ± 1.6 | 23.6 | 31.2 | 1.17 ± 0.12 |
go/wordcount |
43.5 ± 3.3 | 39.2 | 55.6 | 1.89 ± 0.21 |
swift/word-count |
84.3 ± 5.0 | 79.4 | 103.5 | 3.67 ± 0.37 |
Update
After publishing this, I saw this post by Jussi Pakkanen, where they make the same changes to the extension checking.
And I realize that the point of their exercise was to only use the C++ Standard Library. That necessarily precludes the usage of CTRE.
-
I recommend watching Hana Dusíková's talks on her CTRE library. Brilliant! ↩︎
-
This is the third try I gave Conan and there’s always something not quite working. In this case, Conan was expecting clang 9 but I had clang 12. Clang 12 isn’t known to Conan. Then, CTRE was was expecting C++17 and even setting
cppstd=17
did not help. ↩︎ -
And when running the Go version for the first time, I get a runtime error. Irony.
panic: runtime error: slice bounds out of range [:10] with capacity 0
. Running it in a directory with txt-files prevents the error. ↩︎ -
For the curious, I generated the table with the following command:
hyperfine --warmup=2 /full/path/to/rust /full/path/to/swift ... --export-markdown version1.md
. Providingwarmup=2
could be controversial. All of it was on macOS 10.15. ↩︎