> Rust also has more potential for performance scaling in this space, but with the "if you have the time for that" kind of caveat.
When you have a CLI for some network-based service, the network latency is going to be orders of magnitude larger than the performance differences between Go and Rust anyway. (Or at least in 99% of cases.)
> Sure, but by the same token, you can't conclude Rust code is never faster than Go/Java/whatever for real-world code on account of some TechEmpower benchmark, as the parent poster was trying to do.
A very good point, and I had overlooked what the parent poster had said!
Rust is clearly in the top performance category, and I would not be surprised if, with some work, it momentarily became a small amount faster than Java, Go, etc. I say "momentarily" because many of the performance-oriented frameworks and platforms are continuously tuning their stacks, so there is considerable volatility in the specific ordering over time.
> rustc might never be as fast as the Go compiler because the language has so many additional features
Not necessarily true. The D compiler runs incredibly fast when compiling the kind of code you'd write in Go; it only slows down if you use a lot of complicated features like templates or CTFE.
> The point is that I wrote naive approach in both languages and it's a lot faster in Go.
I tried your challenge, and the first data point I uncovered contradicts this. Here is the source code of both programs: https://gist.github.com/anonymous/f01fc324ba8cccd690551caa43... --- The Rust program doesn't use unsafe, doesn't explicitly use C code, is shorter than the Go program, faster in CPU time, and uses less memory. I ran the following:
$ /usr/bin/time -v ./lossolo-go /tmp/OpenSubtitles2016.raw.sample.en the
$ /usr/bin/time -v ./target/release/lossolo-rust /tmp/OpenSubtitles2016.raw.sample.en the
Both runs report 6,123,710 matching lines (out of 32,722,372 total lines). The corpus is ~1GB and can be downloaded here (266 MB compressed): http://burntsushi.net/stuff/OpenSubtitles2016.raw.sample.en.... --- My /tmp is a ramdisk, so the file is in cache and I'm therefore not benchmarking disk reads. My CPU is an Intel i7-6900K.
The Go program takes ~6.5 seconds and has a maximum heap usage of 7.7 MB. The Rust program takes ~4.2 seconds and has a maximum heap usage of 6 MB. (As measured by GNU time using `time -v`.)
---
IMO, both programs reflect "naive" solutions. My point in doing this exercise is to show just how silly this is, because now we're going to optimize these programs, but we'll limit ourselves to smallish perturbations in order to put a reasonable bound on the task.
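For reference, here is a minimal sketch of what the naive Rust program looks like (the gist links above are truncated, so this is an approximation rather than the exact source):

    use std::env;
    use std::fs::File;
    use std::io::{BufRead, BufReader};

    fn main() {
        let mut args = env::args().skip(1);
        let path = args.next().expect("usage: prog <file> <query>");
        let query = args.next().expect("usage: prog <file> <query>");

        let reader = BufReader::new(File::open(path).expect("could not open file"));
        let mut count: u64 = 0;
        for line in reader.lines() {
            // lines() allocates a fresh String per line and validates UTF-8.
            let line = line.expect("could not read line");
            if line.contains(&query) {
                count += 1;
            }
        }
        println!("{}", count);
    }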
If I run the Go program through `perf record`, the top hotspot is runtime.mallocgc. Now, I happen to know from experience that Scanner.Text is going to allocate a new string while Scanner.Bytes will not. I also happen to know that the Go standard library `bytes` package recently got a nice optimization that makes bytes.Contains as fast as strings.Contains: https://github.com/golang/go/commit/44f1854c9dc82d8dba415ef1... --- Since reading into a Go `string` doesn't actually do any UTF-8 validation, we don't lose anything by switching to using raw bytes.
Now let's see if we can tweak Rust, which is now twice as slow as the Go program. Running perf, it looks like there's an even split between allocation, searching and UTF-8 validation, with a bit more towards searching. Like the Go program, let's attack allocation. In this case, I happen to know that the `lines` method returns an iterator that yields `String` values, which implies that it's allocating a fresh `String` for every line, just like our Go program was. Can we get rid of that? The BufReader API provides a `read_line` method, which permits the caller to control the `String` allocation. If we use that, our Rust program is tweaked to this: https://gist.github.com/anonymous/a6cf1aa51bf8e26e9dda4c50b0... --- It's not quite as symmetrical a change as the one we made to the Go program, but it's pretty straightforward IMO. Running the same command as above, we now get a time of ~3.3 seconds and a maximum heap usage of 6 MB.
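Concretely, the hot loop becomes something like the following (a sketch of the `read_line` change, not the exact gist contents):

    use std::env;
    use std::fs::File;
    use std::io::{BufRead, BufReader};

    fn main() {
        let mut args = env::args().skip(1);
        let path = args.next().expect("usage: prog <file> <query>");
        let query = args.next().expect("usage: prog <file> <query>");

        let mut reader = BufReader::new(File::open(path).expect("could not open file"));
        let mut line = String::new();
        let mut count: u64 = 0;
        loop {
            // read_line appends to the buffer, so clear it each iteration.
            line.clear();
            let n = reader.read_line(&mut line).expect("could not read line");
            if n == 0 {
                break; // EOF
            }
            if line.contains(&query) {
                count += 1;
            }
        }
        println!("{}", count);
    }

Note that `read_line` still validates UTF-8; we've only removed the per-line allocation.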
OK, so we're still slower than the Go program. Looking at the profile again, the time now seems split completely between searching and UTF-8 validation. The allocation doesn't show up at all any more.
Is this where you got stuck? The next step from here isn't straightforward because getting rid of the UTF-8 validation isn't possible to do safely while still using the String/&str search APIs. Notably, Rust's standard library doesn't provide a way to search an `&[u8]` directly using optimized substring search routines. Even if you knew your input was valid UTF-8 beforehand, there's no obvious place to insert an unsafe `from_utf8_unchecked`, because the BufReader itself is in control of producing the string contents. (You could do this by switching to `BufReader.read_until` and then transmuting the result into an &str, but that would require unsafe.)
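For illustration only, that unsafe route might look something like this (a sketch of the approach just described, not something the programs above actually do):

    use std::io::BufRead;

    // Read raw bytes with read_until, then skip UTF-8 validation with
    // from_utf8_unchecked. This is only sound if the input really is
    // valid UTF-8.
    fn count_matches<R: BufRead>(mut reader: R, query: &str) -> std::io::Result<u64> {
        let mut buf = Vec::new();
        let mut count = 0;
        loop {
            buf.clear();
            if reader.read_until(b'\n', &mut buf)? == 0 {
                break; // EOF
            }
            // unsafe: we promise the bytes are valid UTF-8 without checking.
            let line = unsafe { std::str::from_utf8_unchecked(&buf) };
            if line.contains(query) {
                count += 1;
            }
        }
        Ok(count)
    }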
Let's take a leap. Rust's regex library has a little-known feature: it can actually search the contents of an &[u8]. Rust's regex library isn't part of the standard library, but it is maintained as an official crate by the Rust project. If you know all of this, then it's possible to tweak the Rust program just a bit more to regain the speed lost to UTF-8 checking: https://gist.github.com/anonymous/bfa42d4f86e03695f3c880aace... --- Running the same command as above once again, we now get a time of ~2.1 seconds and a maximum heap usage of 6.5 MB.
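Here is a sketch of that final version (assuming the `regex` crate as a dependency, with `regex::escape` applied so the query is treated as a literal string):

    use std::env;
    use std::fs::File;
    use std::io::{BufRead, BufReader};

    use regex::bytes::Regex;

    fn main() {
        let mut args = env::args().skip(1);
        let path = args.next().expect("usage: prog <file> <query>");
        let query = args.next().expect("usage: prog <file> <query>");

        // regex::bytes::Regex searches &[u8] directly: no UTF-8 validation.
        let re = Regex::new(&regex::escape(&query)).expect("invalid query");

        let mut reader = BufReader::new(File::open(path).expect("could not open file"));
        let mut buf: Vec<u8> = Vec::new();
        let mut count: u64 = 0;
        loop {
            buf.clear();
            let n = reader.read_until(b'\n', &mut buf).expect("could not read line");
            if n == 0 {
                break; // EOF
            }
            if re.is_match(&buf) {
                count += 1;
            }
        }
        println!("{}", count);
    }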
In sum, we've beaten Go in CPU time, but lost the battle for memory and the battle for obviousness. Beating Go required noticing the `read_until` API of BufReader and knowing that 1) Rust's regexes are fast and 2) they can search &[u8] directly. It's not entirely unreasonable, but to be fair, I've done this without explicitly using any unsafe or any C code.
None of this process was rocket science. Both the Go and Rust programs were initially significantly sub-optimal because of allocation, but after some light profiling, it was possible to speed up both programs quite a bit.
---
Compared to the naive solutions, real search tools can be faster. But performing the same query on the same corpus, the differences between real search tools and our naive solution actually aren't that big here. The reason is your initial requirement that the query match lots of lines: lots of matches results in a lot of overhead. If we change the query to a more common type of search that produces very few matches (e.g., `Sherlock Holmes`), then our best naive programs drop down to about ~1.4 seconds, but ripgrep drops to about 200 milliseconds.
From here, the next step would be to stop parsing lines and start searching the entire buffer directly. (I hope to make even this task very easy by moving some of the searching code inside of ripgrep into an easy-to-use library.)
---
In sum, your litmus test essentially comes down to these trade-offs:
- Rust provides a rich API for its String/&str types, which are guaranteed to be valid UTF-8.
- Rust lacks a rich substring search API in the standard library for Vec<u8>/&[u8] types. Because of this, efficient substring search using only the standard library has an unavoidable UTF-8 validation cost in safe code. (See the sketch after this list.)
- Go doesn't do any kind of UTF-8 checking and provides mirrored substring search APIs between its `bytes` and `strings` packages.
- The actual performance of searching in both programs probably boils down to optimized SIMD algorithms. Therefore, once you get past the ability to search each line of a file with minimal allocation, you've basically hit a wall that's probably the same in most mainstream languages.
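To make the second point concrete, with only safe code and the standard library, substring search over `&[u8]` comes down to something like a naive `windows()` scan, which doesn't get the optimized substring search routines that back `str::contains` (a minimal sketch):

    // Safe substring search over raw bytes using only std: correct, but
    // a naive O(n*m) scan rather than an optimized search routine.
    fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
        if needle.is_empty() {
            return true; // windows() panics on a zero-length window
        }
        haystack.windows(needle.len()).any(|w| w == needle)
    }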
These trade-offs strike me as terribly specific, and probably not something that is usefully generalizable. More than that, in the naive case, Rust is doing you a good service by checking that your input is valid UTF-8, which is something that Go doesn't do. This could go either way, but I think it's uncontroversial that guaranteeing valid UTF-8 up front like this probably eliminates a few possibly subtle bugs. (I will say that my experience with text encoding in Go has been stellar, though.)
Most importantly, both languages at least have a path to writing a very fast program, which is often what most folks end up caring about at the end of the day.
> Go’s garbage collector will often be faster than refcounting (or whatever manual memory management technique your Rust code ends up using)
I'm not supporting the argument that everything should be written in Rust (or whatever) for good performance. However, blanket statements like this are not true; micro-benchmarks are often misleading. There are many factors that affect performance, and they come with trade-offs, so you can choose the options that favor you most. In the end, Rust objectively offers more ways to optimize your program.
It may be, but there's also the concurrency aspect, which is hard to get right in C++. That's partially where the performance gains come from, because 'fearless concurrency' is one of Rust's tenets.
> Rust can be faster than C in the general case because of the optimizations permitted by, among other things, strict aliasing rules.
Sure. The same is true for Fortran, for example. It's just that the few benchmarks of Rust vs. C I have seen so far all seem rigged in one way or another.
> iterative compiles are faster than most of my golang compiles
Maybe you're comparing apples to oranges here. I've worked professionally with both Rust and Go for years now on a variety of real world projects, and I've never seen a similarly-sized Go codebase that compiles slower than a Rust one. If you're comparing incremental Rust compilation to first-time Go compilation, maybe they could be competitive, but... Rust is incredibly slow at compilation, even incremental compilation.
Yes, using lld can speed up Rust compilation because a lot of the time is often spent in the linker stage, but... that's not enough to make it as fast as Go.
YMMV, of course, but... based on my anecdotal experience, I'd consider it disingenuous to say that Rust compile times are an advantage compared to Go, and I'm skeptical that Rust compile times are even an advantage compared to notoriously slow webpack environments.
Rust is good at many things, but compilation speed is not one of them. Not even close, sadly. "cargo check" is tolerable most of the time, but since you can't run your tests that way, that's not actually compilation.
Generally, Rust "should" be faster, because it spends a lot of time on optimizations that Go doesn't do. That's what you're paying for in compile times. If Go is faster on some CPU bound workload despite doing a lot less optimization, that's interesting. (I should note that this is not the norm.)
It was based on lack of knowledge about Rust. I didn't know any of those things about Rust. I assumed it was similar to Go (garbage collected, at the least). As I mentioned, I don't know a lot about either language; less about Rust. Which is why I mentioned I'd like to see a discussion about why Rust was faster.
> They might have some more comprehensive optimization passes eating extensive cycles, but even rustc debug builds are extremely slow in comparison to the Go standard compiler.
The work is not just optimization.
Having Rust track lifetimes and warn about ownership bugs, races, etc. is also productive work for the compiler -- and it happens during debug builds too.
> Rust's compile times are incredibly painful too, and one of the worst offenders is the #1 json library for rust (serde).
Go and Rust sort of took opposite paths in terms of the compiler.
Rust spends the compilation time (for release compiles) generating the fastest binaries the compiler can emit. The article talks about Go adding things like "A recent improvement was in version 1.17, which passes function arguments and results in registers" -- that's stuff that compilers like LLVM have been doing pretty much since their inception (20 years ago).
Go has optimized for compilation speed with the understanding that "Well, it's probably not CPU time that is REALLY making your software slow".
Go compiles faster; Rust compiles to faster results.
> One speed that Rust doesn't have on its side is compilation performance.
I noticed that `cargo check` alone is plenty fast, and when you give it a program with a type error, it reports the errors nearly instantaneously (< 0.5s).
So it's probably not the type checking (including borrow checking) that slows things down, but the phases that happen later, most likely code generation. Which leads me to a question: why is code generation so slow, even in debug mode, compared to other compiled languages? (This is one thing that Golang does right.)
It may not be a good enough reason to switch, but Rust is nearly always faster than Go. Most of the time orders of magnitude faster.