
It seems to me that this would be a fair strategy (the immediate rejection or acceptance part, that is) if N is very large.

But for smaller and more manageable N, I would assume you're better off just ranking them all and working out a decision threshold afterwards, when you have all the information at hand.




TL;DR: this works in a scenario where you must immediately accept or reject candidates from a randomly ordered series of known length. To maximize the probability that your choice is the best candidate:

1. reject the first ~37% of candidates; 2. choose the next subsequent candidate that is better than any seen so far.
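A minimal sketch of that stopping rule in Python (the candidate count and scores below are made up for illustration):

    import math
    import random

    def secretary_pick(scores):
        # Reject roughly the first 1/e (~37%) of candidates, remembering the best of them.
        n = len(scores)
        cutoff = max(1, int(n / math.e))
        best_rejected = max(scores[:cutoff])
        # Then take the first later candidate that beats everyone seen so far.
        for i in range(cutoff, n):
            if scores[i] > best_rejected:
                return i
        return None  # the overall best was in the rejected prefix

    # Toy run: 100 candidates with distinct random scores, in random order.
    scores = random.sample(range(1_000), 100)
    print(secretary_pick(scores))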


I saw a "how do you make a good pick from an unknown selection?" article a couple of decades ago in the Observer which basically boiled down to "Always reject the first item; then pick the one which isn't glaringly worse."

Seems reasonably sound, statistically, based on some Monte Carlo simulations I did.
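A Monte Carlo along those lines is easy to reproduce. Here's a rough sketch that reads "isn't glaringly worse" as "beats everything rejected so far", and compares rejecting just the first item against rejecting ~37% (the candidate and trial counts are arbitrary):

    import math
    import random

    def stop_after(scores, cutoff):
        # Reject the first `cutoff` items, then take the first one that beats them all.
        best_rejected = max(scores[:cutoff])
        for s in scores[cutoff:]:
            if s > best_rejected:
                return s
        return scores[-1]  # forced to settle for the last one

    trials, n = 50_000, 50
    wins = {"reject first only": 0, "reject ~37%": 0}
    for _ in range(trials):
        scores = random.sample(range(10_000), n)
        best = max(scores)
        wins["reject first only"] += stop_after(scores, 1) == best
        wins["reject ~37%"] += stop_after(scores, int(n / math.e)) == best
    print({k: v / trials for k, v in wins.items()})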


Well, this strategy would only help you find ones that are not overloaded? And it probably only works if you can change choices over time?

That said, sampling, in general, works surprisingly well in all things.


Right, we’re not dealing with banks here. Ultimately we’re picking between N alternatives. The size of the gap between them doesn’t really matter, as long as it preserves the correct ordering of winners. Given that, using statistics to reduce your work is a completely reasonable efficiency here.

This also assumes every candidate is fully willing to accept the offer. What if your best candidate would have accepted the offer but fell in the first 36% you rejected? Then the 2nd best candidate after the 36% cutoff rejects the offer, leaving you with a potentially bad choice of candidates. This seems highly probable too...

If everyone did this, you have to assume the other candidate is doing the same, thus your choice after 36% may end up with you being inside their 36% bracket. This works only at a selfish scale and is dubious at best.

This only really works for a gameshow situation. You have 20 boxes with money inside. You don't know the minimum or maximum amount; you get to look at a box and decide yes/no. So you do the 36%. Then the very next box that is close enough to or above the maximum of the first 36% is what you choose.


It won't give you the best odds.

A trivially constructed strategy with higher expected value: a surprising number of people choose 1, 2, 3, 4, 5, 6 as their numbers. So instead of using Quick Pick directly, you could start with Quick Pick and discard the result until it's something other than 1, 2, 3, 4, 5, 6, which you then use. That has a higher expected value than Quick Pick by itself, since the number of people using this strategy is dwarfed by the number of people using the "1, 2, 3, 4, 5, 6" strategy.

This could be expanded to avoiding months, birthdays, etc.
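A sketch of that filtered Quick Pick idea (the 6-of-49 format and the list of "popular" combinations are assumptions for illustration):

    import random

    # Combinations to avoid because many other players pick them.
    POPULAR = [{1, 2, 3, 4, 5, 6}]  # date/birthday-style tickets could be added here too

    def quick_pick(pool=49, k=6):
        # Stand-in for a lottery terminal's random pick: k distinct numbers from 1..pool.
        return set(random.sample(range(1, pool + 1), k))

    def filtered_pick():
        # Keep redrawing until the ticket isn't one of the popular combinations.
        while True:
            ticket = quick_pick()
            if ticket not in POPULAR:
                return ticket

    print(sorted(filtered_pick()))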


Selecting twice as many candidates as you need and then choosing half of them randomly might actually be a good idea, just so everyone knows that you're there in part because you got lucky.

(Also, tracking what happens to each group might result in some interesting data.)


Another option is to do N sequential candidate tests for each fresh random candidate, where N is something that gets you most of the speed advantage of doing all the tests sequentially. Maybe N=16.

Nice write-up!


Just forget about trying to be fair: use some kind of test and pick out by lottery from the n best results.

Right, that works; but it seems different from "Pick one from first 10, pick one from next 100, pick one from next 1000".

In my experience using tournament selection with a small elitism count works best. You keep the best `n` members of the population and for all other members you group them into random groups of size `m` and select the best value from that group.
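Roughly, that selection scheme looks like the sketch below; `elite_n` and `group_size` stand in for the `n` and `m` above, and the fitness function is a placeholder:

    import random

    def select_parents(population, fitness, elite_n=2, group_size=4):
        # Keep the best `elite_n` members unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        elites, rest = ranked[:elite_n], ranked[elite_n:]
        # Shuffle the rest, split them into random groups of `group_size`,
        # and keep the fittest member of each group (tournament selection).
        random.shuffle(rest)
        winners = [max(rest[i:i + group_size], key=fitness)
                   for i in range(0, len(rest), group_size)]
        return elites + winners

    # Toy usage: maximize x^2 over a population of random integers.
    pop = [random.randint(-100, 100) for _ in range(20)]
    print(select_parents(pop, fitness=lambda x: x * x))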

So why not try random instead?

It sounds like a great opportunity for an experiment. Say you have 20 slots, pick 10 using the GPA/nitpick method, and 10 at random, and then compare the results 5 or 10 years later.


Some in your position would, some wouldn't. It's effectively stochastic sampled ranked choice across the population.

Yeah, 36% chance of getting nothing isn't great. I'd expect that for the last third, you might want to settle for someone who is a bit worse than the best of the test group.

This algorithm is only good if you'd rather have nothing than second best.


You don't need to know number of candidates if you know the total amount of time (and assume candidates come in uniformly at random).

The most practical use case is selling a house.

There are also variations of the problem which model being able to go back to previously rejected candidates, as well as a probability that the candidate rejects you.


Lottery systems with an initial cutoff make sense if you have a large surplus of applicants who surpass your lower quality threshold. Rather than make asinine minuscule decisions over who exactly the top 1,526 out of 10,000 are, cull the 7,234 that don’t make the cut and randomly select from the survivors.
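In code that's about as simple as it sounds; here's a sketch where the score function, threshold, and counts are placeholders:

    import random

    def lottery_admit(applicants, score, threshold, slots):
        # Cull everyone below the quality threshold, then draw winners uniformly at random.
        qualified = [a for a in applicants if score(a) >= threshold]
        return random.sample(qualified, min(slots, len(qualified)))

    # e.g. 10,000 applicants with random scores; keep those scoring >= 60, admit 1,526.
    applicants = [random.random() * 100 for _ in range(10_000)]
    print(len(lottery_admit(applicants, score=lambda a: a, threshold=60, slots=1_526)))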

I tend to advocate methods that take into account the variance of the belief in order to minimize the risk of showing bad stuff at the top of the heap.

Penalizing variance would be the opposite of my intuition. Given a boring low-variance item with 10 3-star votes, and a divisive item with 5 1-star votes and 5 5-star votes, I'd think you'd want the one at the top to be the one with a medium chance that they'll "love" it rather than a high chance they'll find it passable.

If you further assume that the average person is going to check out the top few results but only "buy" if they find something they really like, the risky approach seems even more appealing. A list topped by known mediocre choices has a low chance of "success". What's the scenario you are envisioning?
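A toy comparison of the two items above, scored once by a variance-penalized mean and once by the chance of a 5-star "love it" (the one-standard-deviation penalty is an arbitrary choice, not anything from the thread):

    import statistics

    boring   = [3] * 10              # ten 3-star votes
    divisive = [1] * 5 + [5] * 5     # five 1-star and five 5-star votes

    def summarize(votes):
        mean = statistics.mean(votes)
        std = statistics.pstdev(votes)
        return {
            "mean": mean,
            "mean - std (variance-averse)": mean - std,
            "P(5-star 'love it')": votes.count(5) / len(votes),
        }

    print("boring:  ", summarize(boring))
    print("divisive:", summarize(divisive))
    # Both items average 3.0; the penalized score favors the boring item,
    # while the "love it" probability favors the divisive one.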


That would make it very easy to game the system. What you should do is pick the four you like best, even though it puts you on the spot.

We could assume that each student has as close to perfect knowledge about his/her preferences as is attainable. The student will have to work to communicate that to the system one way or another -- I'm suggesting that using several rounds might mean less work (and/or less unpleasant work) for the students than if they have to fill out an impossible number of choices/preferences upfront -- just so the system has the data it needs to complete in one round. Which would be optimizing the wrong part of the problem (or even the wrong problem).

No matter how interesting the problem might be from an algorithmic point of view.

