Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login
What are the lesser known but cool data structures? (stackoverflow.com) similar stories update story
133.0 points by llambda | karma 58401 | avg karma 18.99 2011-12-19 12:47:04+00:00 | hide | past | favorite | 57 comments



view as:

Working with CouchDB has taught me about append-only btrees, which are quite simple but rather neat:

http://www.bzero.se/ldapd/btree.html


Already discussed here : http://news.ycombinator.com/item?id=1370847 and I think there was a more recent repost but I cant seem to find it.

I wish they wouldn't close questions like this.

Considering that they don't really line up with the goals of a "normal" SO question, I can't really blame them for closing it. It may be better for them to create a separate section for these types of open-ended questions.

I thought they were encouraging such questions to be posted on http://programmers.stackexchange.com/ instead.

I don't think so, from the FAQ:

    You should only ask practical, answerable questions based on actual
    problems that you face. Chatty, open-ended questions diminish the 
    usefulness of our site and push other questions off the front page.
Programmers is simply for less focused on code, more conceptual questions.

Well then they're missing out. The only time I visit Stack Overflow is when it comes up in a Google search or when it shows up on hacker news with something actually useful and book-markable like this.

Programmers mod here: this is inaccurate, as these types of questions are pretty much off-topic everywhere on Stack Exchange.

Programmers is the whiteboard to Stack Overflow's compiler: if you have a specific (emphasis on specific) programming issue that doesn't involve code, it's likely on-topic for Programmers. If it's just a poll of weird/funny/cool/etc. stuff, it's not welcome.


facist moderators who delete the site's best questions make it worse one deletion at a time.

It's too bad the name is so generic, or that it makes such a fine distinction between programmers and programming.

I agree it's not intuitive, particularly to those not well-versed in the history of Stack Overflow.

For the backstory, when there was just Stack Overflow (and not the network of Stack Exchange sites), there was a large contingent of people who wanted Stack Overflow just for programming problems (and not programmer-related questions, like business concerns, conceptualizing, or lists like these).

So the early Stack Overflow population separated everything into "Programming Related" (on-topic) and "Not Programming Related" (off-topic) questions. When Stack Exchange 2.0 launched (allowing people to suggest new sites), one of the sites that launched in the first wave was "Not Programming Related" which was intended to house all the fun stuff that was now off-topic on Stack Overflow.

Turns out, having a free-for-all site doesn't work, and the quality was all over the place. So it was retuned into being a site for questions about being a programmer or acting as programmer (so, business or conceptual questions answerable by programmers)—which captured 85-90% of the quality questions on the new site—to Stack Overflow's concrete programming (questions specifically about implementation): hence the name, Programmers.SE.


No, Programmers is for strict Q&A as well.

I am on record on meta.programmers.se as disagreeing with that policy.


SO is quickly approaching reference territory. It is my go-to authority on most programmatic questions.

And not all content takes the form of a how-do-i-solve-this-problem format. Would be good to see them re-purpose that good information (like this particular entry) into a useful format.


> Would be good to see them re-purpose that good information (like this particular entry) into a useful format.

Like cough Wikipedia? Most of the entries on that page are links to a Wikipedia article with a one-paragraph summary. Seems like anything missing from the canonical list should just be added:

http://en.wikipedia.org/wiki/List_of_data_structures


Wikipedia is a good place for information about different data structures, but the question here is "What are the lesser known but cool data structures. There are tons of obscure data structures, but only a few that people would consider cool. Wikipedia cannot and should not show that information, because it's inherently subjective.

StackOverflow cannot show that information either, because it's inherently subjective!

"What are some data structures not found on Wikipedia" is an objective question (but time-bound, which violates an SO policy)


Wikipedia was my first thought, but it's more encyclopedia oriented (naturally).

I was thinking something more along the lines of what they did with Scala:

http://stackoverflow.com/tags/scala/info


I am not optimistic that they will reach the point of being a reference. I fear that, especially for product-specific information, it will eventually become too hard to separate up-to-date content from outdated content.

I do not know a good way to prevent that, other than through manual intervention (hm, thinking of it, machine learning could help here. One could train a model that knows, for example, that 'perl' answers from five years ago are more up-to-date than 'rails' ones from two years ago)


speculation 1) its a doozey on their database

speculation 2) they don't make money on content like this. (um, maybe its significantly less referrers from search?)


I think that's unfair, the mod who closed it isn't even an SO employee.

The SO employees have seen it. If they want questions like this open, they all have the power to override moderators.

Maybe they simply agree the question doesn't fit the SE model? I'm not saying the company running SO doesn't want it closed, I'm saying it's unfair to assume that's because it's "inconvenient" to them.

I don't follow you. I'm not assuming it's inconvenient for them. It's very convenient. They just have to click a few buttons to reopen a question like that. I know for a fact that they really don't want them to remain open.

We know they don't want it open. The question is why.

dustingetz was speculating they don't want it open because it either hits too much on their database or doesn't make enough money from it.

My position is that they feel the question doesn't fit with the SE model and there's no ulterior/egotistical reason behind keeping it closed.


Oh, I see what you mean now. I mistook your original comment as being in agreement with him then. I thought you were saying that it was unfair for the moderator to close that question.

No, it's closed because it's a list question and there is no actual answer.

Skip lists is such one that I wasn't taught in school. I did not learn about them until a friend brought them up in a conversation. They are probably taught in upper level algorithm courses.

There were some interesting additional examples (and commentary) in the HN discussion when this was last linked (18 mos ago): http://news.ycombinator.com/item?id=1370847

Hey a good startup idea. A Vote based web site for subjects covered in `non constructive Stackoverflow entries`..

I like the part where the most informative, educational, and interesting discussions on StackOverflow are inevitably closed as "Not Constructive".

Stack Overflow isn't a discussion board. It's a Q&A site. It's not intended for neverending lists that don't have a correct answer.

Is that so? Are the "benefits of initiating followers into the Blades of Skyrim?" more helpful or factually verifiable than a list of lesser known data structures?

http://gaming.stackexchange.com/questions/41411/whats-the-be...


That has 2 answers. The datastructures question got closed after 89 (some of them duplicates). Plus they're on two different sites. I don't get the point of your reference.

My point was that I'd appreciate, at the very least, a StackExchange site for this type of question.

You can suggest one on Area 51 (http://area51.stackexchange.com/) but I doubt it will get much traction. The Stack Exchange engine is much better for objective questions that have a correct answer than subjective questions that have a lot of different answers. Hacker News and reddit are much better for discussion topics.

I would argue that Hacker News and Reddit 'engines' are only good for transient content: interesting today, forgotten tomorrow.

The 'culture' on the other hands, on Hacker News and Reddit are more suited.

Taking culture into account, I think good homes for information are.

Transient, subjective: forum Transient, objective: heavily moderated forum. Not perfect fit though. Permanent, subjective: ??? There are some sites for this, but, I think they work pretty suboptimally. How do you keep the idiots out? Permanent, objective: A 'pedia.


Good point about the transient nature of HN and reddit. They're really not suited to be a permanent home for the kinds of lists that people want to create on Stack Overflow. All three sites have the kind of community focus that can be used to create the lists, but they really need to be moved to Wikipedia for proper care and feeding once created.

The SO software is actually quite good for accumulating lists. It's the SO community that does not allow these questions. From what I can tell the SO moderators are evangelically convinced that any question involving an opinion (other than Jeff or Joel's) is the slippery slope to death by tragedy of the commons.

It's like someone read Clay Shirky as Atkin's diet.


The SO software is terrible at accumulating lists. Count the duplicate entries in the datastructure question. Also, which one of those answers is the correct one? That's what the SO software is designed to help us find, and it fails here, as it does with any question involving an opinion (Jeff's, Joel's, or anyone else's).

"Which of those answers is the correct one?" is not a problem if you are soliciting a list ... that's sometimes the point of the question.

"The single right answer is what SO was designed to help us find" is precisely the community norm that I find less desirable. It wasn't there at the start, it is there now. That community norm was a product decision by Jeff (and possibly Joel). It has since been taken to extremes due to the self re-enforcing feedback loop of the community. It's treated as a sort of born-again revealed wisdom by the group.

One could claim this limitation is the required tradeoff to attract the group that manages the site. That might be true ... but I don't blame the software.

Did the software close Alan Kay's question as not constructive or was it a pair of moderators?

What does non-constructive say:

"This question is not a good fit to our Q&A format. We expect answers to generally involve facts, references, or specific expertise; this question will likely solicit opinion, debate, arguments, polling, or extended discussion. " Except answers to Kay's question do involve facts, reference, and specific expertise, as well as opinion, and debate. Is the question really non-constructive ... or is was it actually closed as a side effect of the community norm?

The community norm is an over the top version of Godwin's law that opinion is a slippery slope to group death. It makes for a more focused product. The creators read this: http://shirky.com/writings/group_enemy.html Then cut the baby in half and kept the profitable half.

I'm claiming part of what they threw away makes SO less pleasant for me to use as a contributor. I actually don't like that I contributed to something that treats debate and opinion as anathema.


It will most likely never happen. Mods and the community have a hard time as it is maintaining a base standard of what defines a SE Q&A site. The point of StackExchange was to create sites that did not duplicate already existing discussion forums.

As communities grow they learn and change the scope and that's what has happened here.

The current list questions that are open are under close scrutiny by a few community members. For example http://stackoverflow.com/questions/194812/list-of-freely-ava... is cleaned regularly for duplicates, arranged by language and alphabetically sorted.

A site like the one you would wish for would be highly popular but no user would ever want to clean it up, instead they will add another duplicate answer that may have been mentioned 1-2 years earlier.


That's not StackOverflow. That's StackExchange Gaming.

Which variables get set when you do X is fairly factually verifiable. How do you factually verify a data structure is cool? More helpful? What is the task at hand? If I am playing Skyrim and trying to decide weather or not to initiate followers into the Blades of Skyrim, then the link you posted is most helpful. On the other hand I am not sure what task I would be undertaking where a bunch of data structures whose only commonality is obscurity would be truly helpful.

I think my biggest problem is the wording. "Closed as Not Constructive" looks plain silly when juxtaposed to such obviously good content.

I think they need something like "Closed, 'good enough already'".


That's true, there are several old questions that have some pretty awesome content and that close reason is just a bad description. It looks like one of the other moderators agreed with you too, since the status of the data structures question is now changed to locked:

> This question exists because it has historical significance, but it is not considered a good, on-topic question for this site, so please do not use it as evidence that you can ask similar questions here.


I'd like to add a similar question: what are the lesser known but cool control structures?

Duff's device! :-)

- Coroutines

- Functions with multiple entry points

- Self-modifying code

- Computed goto

- Multiple dispatch

- Restarts

- Continuations

- "Split execution" as in CoreWars


As someone who dabbled in SO I quickly became disillusioned with the site after I posted a question on LaTeX and quickly had it closed and then was shuttled over to the "new" LaTeX site. IMHO, If the question doesn't fit the current policy don't make the submitter waste his time posting the same material on 5 different SO sites in order to get a hit. Find a way to transfer it where the conversation can continue to flourish.

I don't seem to understand what you are saying. Answers and comments are transferred when a question is migrated. In all cases where users have a problem it is best to raise your issue on meta.stackoverflow.com or at least reference the question you are talking about above.

Most LaTeX question are not programming questions, so they get shipped over to the TeX site. You'll likely get a better answer there than on SO (that's the whole point of creating all these sites). And as phwd said, all the content (answers and comments) also get moved.

> Find a way to transfer it where the conversation can continue to flourish.

That's exactly what happened. 99% of questions get answered on the LaTeX site. It's one of the best communities in the SE network. Compare that to only about 90% of questions tagged 'latex' getting answered on Stack Overflow. Not bad, but still better to migrate them to an active site dedicated to the topic.


Less known data structures come to mind:

- Topological sort. Great for dependency traversal.

- KD-Tree. Great for geo-spatial search.

- Consistent Hashing. Great for scaling and clustering.

- Fuzzy Hashing. Great for document digest and mining. Sadly there might be a patent conflict somewhere.

- Extendible Hashtable. Amazing space-efficient and extendible lookup table.

- Bloom Filter. Space-efficient check.


A favorite answer to this (recurring) question: aguri trees. I've written about them before:

http://hackerne.ws/item?id=101969 (the blog post is dead, but you can find some of it at bit.ly/tN2ESX)

For data sets with binary keys in which prefixes are meaningful (most notably: IP addresses, but also memory addresses and probably many integer keys that have natural range semantics), Aguri is a radix trie implementation that exploits tree structure to infer ranges in the data.


Far more courteous disagreement back in the day ;)

Legal | privacy