Agreed, isolating the contribution of the "model" itself can be borderline impossible, and the details of model engineering have a considerable impact on performance.
That's why I like Kaggle competitions: most approaches tried on a given challenge will likely be optimized to their very limits, so that it's the nature of the different approaches that ends up making the difference.
I understand that there are more models being thrown at the solution space, but consider the people at Jane Street: while they don't have 3000 employees working on these, they've got a few hundred, and if you then factor in the _time_ they spend on models (the real thing, not this simplified Kaggle-competition stuff), I think there's a good chance they have a significant advantage in man-hours (it's a full-time job).
Also, while Kaggle competitors are held to a high standard, I think you can easily discard >90% of them as not being seriously competitive. Your 3000 models then become 300, and those 300 have had less than a third of the time spent on them that Jane Streeters are spending on their models.
(I realise it's not an apples-to-apples comparison, as the models they're working on are different since this is a toy example, but you said "will outperform anything that Jane Street _could_ come up with".)
I don't think you understand the point. Your claims that "all of this needs extensive tuning and hand-holding and picking results" do not help your argument, they help _mine_.
It's most egregious if you are doing even more tuning and cherry-picking than the authors of the models are doing (which you definitely are).
>The point I'm wanting to make is that users will go to whoever has the best model.
Best isn't defined just by quality, though. In some instances, for some groups, things like whether the model is trained on licensed content (with permission) and/or is safe for commercial use are more important.
This is one reason why Adobe's Firefly has been received relatively well. (I work for Adobe).
Why is too few models the issue? Wouldn't the problem be that the existing models aren't good enough? It's not clear to me how increased diversity in models addresses the problem: at the end of the day, a firm has to pick one to make any particular decision. If each firm is using a different model in the name of diversity, then while the few firms with more accurate models potentially do well, the subpar models will underperform in general and probably increase the overall risk level of the industry. Is there something I'm overlooking from my skim through the article here?
True, and I don't think they expect models that are superior to their own; I'd look at this as a hiring/marketing tool. Plus, even if you had a model that from a pure engineering standpoint matched Jane Street's approach, it would not work without the wealth of proprietary (and expensive) data sources that Jane Street is sure to ingest, so you still couldn't just go out and do it yourself without serious upfront investment to get the same data. That's assuming all the data they use is even available commercially, which I doubt. There are probably data sources that only become available through personal relationships with the right folks at the right places.
There is a sense in which the person/people who select the training data for a model are the artist. Or at least, a sense in which they can be an artist (models that just train on as much data as possible don't fit).
So you’re saying the experts chosen are a more literal mixture of layers from each model? Rather than a simple “pick which model to run”?
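To make the distinction I'm asking about concrete, here's a minimal NumPy sketch (toy shapes and names, purely illustrative, not any particular model's implementation): soft mixing blends every expert's output by gate weight, while hard routing only runs the top-scoring expert.

    # Illustrative only: "soft" mixture-of-experts vs. "hard" routing.
    import numpy as np

    rng = np.random.default_rng(0)

    # Three toy "experts", each just a 4 -> 2 linear map.
    experts = [rng.normal(size=(4, 2)) for _ in range(3)]

    # A toy gating network: scores each expert for a given input.
    gate_w = rng.normal(size=(4, 3))

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    x = rng.normal(size=4)
    gate = softmax(x @ gate_w)  # e.g. [0.2, 0.7, 0.1]

    # Soft MoE: the output blends ALL experts, weighted by the gate,
    # so every expert's layers contribute to the result.
    soft_out = sum(g * (x @ w) for g, w in zip(gate, experts))

    # Hard routing ("pick which model to run"): only the top-scoring
    # expert is executed at all.
    hard_out = x @ experts[int(np.argmax(gate))]

My understanding is that typical MoE layers do the soft/top-k mixing per layer rather than choosing a whole model, but that's exactly what I'm trying to confirm here.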