Hacker Read

bitL · 2018-05-30 16:35:10

What would good metrics be, then? Of course metrics are just indicators that can be interpreted incorrectly. Still, we have to measure something tangible. What would you propose? I am aware of the limitations and would gladly use something better...

Some people mention the Matthews correlation coefficient, Youden's J statistic, Cohen's kappa etc., but I haven't seen them in any Deep Learning paper so far and I bet they have large blind spots as well.
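
For anyone who hasn't run into these three, here is a minimal sketch of how they can be computed from a binary confusion matrix, using the standard textbook definitions. The function name `binary_metrics` and the choice to return 0.0 when a denominator is zero are my own assumptions for illustration, not anything from a particular library or paper.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, MCC, Youden's J and Cohen's kappa from counts.

    Hypothetical helper for illustration. Degenerate cases (zero
    denominators) are mapped to 0.0, which is a convention, not a rule.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # Matthews correlation coefficient
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Youden's J = sensitivity + specificity - 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    youden_j = sensitivity + specificity - 1

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "youden_j": youden_j, "kappa": kappa}

# A classifier that always predicts "negative" on a 95/5 imbalanced set:
# accuracy comes out at 0.95 while the other three come out at 0.0.
always_negative = binary_metrics(tp=0, fp=0, fn=5, tn=95)
print(always_negative)
```

The always-negative case shows why people bring these up: accuracy looks fine on imbalanced data, while the chance-corrected metrics do not. It says nothing about their own blind spots, which is the open question.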

samuelhulick | karma 232 | avg karma 1.46 · | 2015-12-14 22:12:25+00:00

Hmm, any suggestions on what those metrics would be?

s73v3r_ | karma 1597 | avg karma 0.61 · | 2018-03-22 12:37:47

What would those metrics be?

netcraft | karma 2169 | avg karma 2.92 · | 2014-06-12 13:33:15+00:00

What other metrics would you propose to use? (genuinely curious)

staltz | karma 2560 | avg karma 11.85 · | 2015-10-06 13:17:32

What metrics do you suggest?

AznHisoka | karma 7533 | avg karma 1.92 · | 2017-10-25 23:36:35

so which metrics are the right ones to look at?

karmelapple | karma 2246 | avg karma 2.65 · | 2016-01-01 02:08:26+00:00

What is a good metric? How can we come up with good metrics?

LAC-Tech | karma 6383 | avg karma 2.95 · | 2023-02-01 16:03:22

Fair point. I struggle to find metrics myself.

thebruce87m | karma 1640 | avg karma 1.48 · | 2021-08-28 14:05:14

What would you suggest as a better metric?

blhack | karma 24926 | avg karma 6.55 · | 2021-02-01 18:52:05

I'm curious what metrics we should track to determine if this is good or not.

AStellersSeaCow | karma 512 | avg karma 10.24 · | 2022-12-17 12:09:30

Counterpoint: you don't have enough good metrics.

atrocious | karma 15 | avg karma 3.0 · | 2018-12-27 22:35:21+00:00

Which metrics?

jumelles | karma 1489 | avg karma 3.35 · | 2024-01-16 23:17:43

Which metrics?

mlthoughts2018 | karma 4165 | avg karma 1.32 · | 2019-10-19 16:56:15

The problem with discussions like this is that they never provide systematic examples of how a portfolio of metrics or qualitative checking can be integrated into a modeling problem. There's a lot of finger-pointing at metrics and at complacency about problems, but the solutions are super vague. One example is the sanctimonious passage in this article about hiring from under-indexed groups in tech companies and just listening to first-person accounts (which is probably a bad idea if you actually want to help).

Ultimately I agree with the underlying idea, but I think to be helpful you have to present case studies of reproducing research but with metric optimization swapped out for a holistic variety of metrics plus qualitative checking.

I recommend the books Bayesian Data Analysis by Gelman et al. and Data Analysis Using Regression and Multilevel/Hierarchical Models by Gelman and Hill if you want to read good accounts of doing this in practice with real data sets.

There's definitely room for a book like this that focuses on more domain-specific models in NLP, computer vision and deep neural networks.

jbay808 | karma 7260 | avg karma 3.72 · | 2019-04-01 16:53:42+00:00

Can you provide an example of a metric that meets these criteria?

bumby | karma 6991 | avg karma 1.44 · | 2022-11-06 18:21:24

What would you advocate as the best metrics across those domains?

xboxnolifes | karma 4360 | avg karma 1.98 · | 2022-05-30 15:37:30

All metrics are made up. Some of them are useful.

noodle | karma 6768 | avg karma 2.55 · | 2009-04-15 20:06:31+00:00

righto. if you're looking for input on what metric should be used, i'm afraid you'd have to pony up more information. target the least subjective and least manipulatable metrics possible. hopefully things you can also access and objectively measure on your own.

wjn0 | karma 348 | avg karma 2.38 · | 2018-10-04 07:31:49

I think it's important to be careful with the vocabulary here (if only because of the site we're on).

There can be metrics that are targets and still remain good metrics. For example, in many machine learning competitions, the submissions optimize a known, given metric, but the test data is not known. Since nobody can tune directly against the hidden test data, it is still a good metric.

I agree with the sentiment in this case, though.

intended | karma 5366 | avg karma 2.01 · | 2021-07-30 01:21:19

Oh yes please. Metrics, anyone.