Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login

A linked list is still a pretty terrible choice. You'd probably be better off with a Vec with tombstones and some kind of free list.


sort by: page size:

Ugh, linked lists fuck everything up. It’s never the right data structure. Use a vector!

Okay first of, before I read the rest of the article, I want to complain about the statement:

> While you can hack together a “list” that will be backed by a Vec of nodes with indices to the next / previous item, this approach is quite wasteful – and gains little compared to using the Vec directly.

How is this wasteful or hacky?? This is THE proper way of writing data structures that are not strict trees. Sure, this isn't the correct way to write a list data structure, but there's almost no reason to use a linked list in general.


Because of cache locality issues, linked lists are a bad data structure for almost all purposes. A plain old vector, or maybe a VecDeque, wins for almost all workloads.

For the few times they are useful (e.g. if you're writing lock-free algorithms, you want a hard upper bound O(1) append rather than just amortized O(1) append, or you don't have malloc available), Rust is a fine language to write them in. But also there are plenty of high-quality libraries on crates.io, so you probably won't have to write your own.


Linked lists are often the wrong choice because they're rarely performant or efficient when compared to vecs in the real world.

In general, use vecs until you absolutely can't.

Perhaps there are a few uses for LL's in limited circumstances.


> Linked lists are great. But they have the problem that, almost always, whatever problem you're trying to solve would be better done with a regular resizable vector.

One important exemption is when you want your datastructure to be persistent.

See https://en.wikipedia.org/wiki/Persistent_data_structure

But for many applications there are also better persistant data structures than linked lists around.


That's a ridiculous statement. Linked lists have real performance benefits in some applications. A good "modern CS toolbox" includes the ability to make the right choice. Which, if you believe they are fundamentally useless, clearly you lack.

It's kind of funny that linked lists are almost never an appropriate choice on modern systems because of their poor locality and cache performance. Linked lists were a much better fit for 1970s and early 1980s microprocessors that almost never had caches (memory access time in the 1970s was similar to clock speed).

That means linked lists are increasingly relegated to toy problems and unusual domains. They're no longer really a practical tool to organize data - it would be better in many cases to use an array and copy the entire contents on insert.


Vectors really don't make linked lists irrelevant - if you want to be able to quickly pop from the middle, no Vec magic will currently help you

You basically want a linked list.

Not for searching. Linked lists are fine if you are at the point where you need to do an operation, but getting to that point can be expensive.

I’m pretty sure I read something on HN the other day that showed that a linked list is almost always a bad choice, even for small values, due to memory locality issues.

It's a trick question. We would never use a linked list in the real world.

Yes, it is. But why 'the worst' aspects of each, though? I see it as the opposite: linked lists built on top of vectors combine good cache efficiency of vectors with algorithmic benefits of linked lists (many operations become O(1), or at least in the amortized sense).

Of course, arenas do have tradeoffs - I'll try to summarize.

Pros:

- Completely safe.

- More ergonomic than dealing with `Rc<RefCell<T>>`.

- Improved cache efficiency of linked data structures.

- Fast node allocation and deallocation.

Cons:

- Any two simultaneous node accesses must be immutable.

- Accessing a node incurs the cost of a bound and liveness check.

- Vector growth can be costly and cause a spike in latency.

- Vector growth moves all nodes in memory.

- Removing a node creates a hole in the vector, possibly wasting memory.

Generally speaking, I think anyone learning Rust and trying to implement a graph data structure should first do it on top of a vector. It's easier than other commonly suggested approaches and it's not a bad one either. In fact, that's how rustc builds some of its data structures, how conrod (a GUI toolkit) represents its widget graph, and how Way Cooler (a window manager) manages window tilings.


Writing a doubly linked list is trivial if you use a backing vector and view into it. Which is how you should do it.

Linked lists are awful data structures and just as difficult to implement soundly in a thread safe context everywhere else. Rust just yells at you for doing it wrong.

I reiterate, unless you have a damn good reason, a linked list is an awful data structure. They're slow and cumbersome and have no advantages at small or large scales. The cases where you want them are few and far between and not in any undergraduate DSA course or application.


Yes, there's been lots of writing about this. I've been implementing linked lists in various ways over the last week (including right now), so this is very top of mind for me.

The core of it is two things: mindset, and how programmers think about learning languages.

Mindset: there's two things at play here. The first is that C programmers tend to reach for linked lists by default. It's a CS 101 data structure. Not super hard. A Rust programmer, however, would reach for a vector (or a deque) by default. For many use-cases, these structures are better, thanks to amortized allocations + cache reasons. As always, depends on exactly what you're doing. The second mindset issue is write vs use; C programmers tend to write their data structures themselves. Rust programmers tend to use packages. If you've determined that you need a linked list in Rust, you don't start writing one: you "use std::collections::LinkedList;". It's a doubly-linked one, so maybe that's not what you need. In that case, you go on crates.io and see if someone has already written what you need. (225 hits incidentally: https://crates.io/search?q=linked%20list) Only if that fails do you write it yourself. As the ecosystem has filled out, this happens increasingly infrequently. So "a linked list is painful to write" just doesn't apply to the vast, vast majority of Rust programmers who are simply getting work done.

Learning: because linked lists are a CS 101 data structure, they're something people reach for when learning Rust. But different languages are different, and Rust is not C. Writing a good data structure isn't a beginner Rust topic, it's an intermediate to expert one. There's stuff Rust makes easy that C makes hard as well. You can't assume that topics are equally difficult across the languages. Even then, there's http://cglab.ca/~abeinges/blah/too-many-lists/book/.

TL;DR: linked lists being painful to implement is sorta true, but also not really relevant, other than people who try to learn Rust by building them and having a bad time. (Which is unfortunate, but many people don't as well, or have at least heard that they're hard and so have the right expectations going in.)

Now, I've gotta get back to my benchmarks: std::collections::LinkedList is twice as fast as my straightforward C linked list, and I gotta figure out where I messed up. I'll check back later for replies.

(If you're interested in this benchmark I'm working on, https://godbolt.org/z/lF1D85 is the disassembly. They look so close, I have no idea why 100,000 pushes, the Rust impl is taking 7.9038 ms on average, whereas the C impl is taking 15.413 ms. That seems very wrong...)


Linked lists are great. But they have the problem that, almost always, whatever problem you're trying to solve would be better done with a regular resizable vector.

This includes problems that they should be great for, like insert into the middle or the front.

The reason is that in practice the way computers actually work is that there is an enormous time penalty for jumping around randomly in memory, and it's large enough that it's often worth paying a O(lg n) cost to switch to something that will be contiguous in memory and allow the CPU to prefetch data.

There are exceptions when you really should use an actual linked list, but your default should be a vector and not a linked list.


I would wager that linked lists are very rarely the right tool. The use case for linked lists is very narrow.

In theory a linked list has lots of attractive features as a data structure and when LISP Machines were a thing that theory translated into reality pretty well.

Today losing data locality hurts really badly. Rust does provide a linked list, for the handful of cases where that really is what you wanted, but almost always even though in theory a vector should be worse, in practice it's better.

For example if you build a toy system with a thousand small integers on the list, you might reason that removing the 500th via a vector instead of a linked list would be awful - you'd need to shuffle 500 of them forward compared to just swapping a pointer. But wait, on today's hardware the linked list approach involves about five hundred dependent cache misses. Before you find the 500th on the list to remove it, you have stalled the CPU for so long the vector implementation would have already finished doing its shuffling of 500 items forward in memory and moved on to something else.

Now, if your list has 500 million items in it, and you often want to remove just the first item while appending to the far end, a Vector is indeed a bad fit. But there are still better alternatives than a Linked List, Rust provides a Deque but unlike the one a Lisp-enthusiastic professor may have showed you in data structures class, it's actually a similar structure to a vector not a linked list.


Linked lists are one of those things, like textual protocols, that people reach for because they're easy and fun but, from an engineering standpoint, you should never, ever use unless you can provide a STRONG justification for doing so. The locality of reference characteristics of vectors mean that, except for extreme cases, they beat linked lists almost every time on modern cached CPUs.
next

Legal | privacy