> Where’s the analysis?

Buried in over a million graves.




> I took a look at the data. The data schema is disorganized to the point that a lot of janitorial work would be necessary to get it useable and perform any analysis or visualization.

In other words, it is real world data.


> You can quote a million papers from your cursory Google Books search

Still better data than the literally nothing that you brought to the discussion so far.


> If this isn’t maliciousness or incompetence, what is it?

Expensive to calculate, for something nobody really cares about anymore.


> Wow! That's a whole lot you've managed to unpack!!

Huh? They just repeated two things from you and asked if they interpreted them correctly. That's almost the minimum analysis someone could do.


>>> based on an analysis of all the publicly available data This aim is not as ambitious as it sounds, because there is not a great deal of data publicly available.

Amusing and revealing at the same time. Bet they did not take long to agree on that sentence.


> Not saying its a bad idea, but I'm still struggling to see the full logic of it.

You don't have to see the full logic of it! There's data collected and analyzed by people who have studied the problem!


> People are getting confused and that's a problem.

Let's see the data.


>Not a whole lot of thought went into the validity of the data or its analysis.

They may not be able to analyze the data themselves, but there are third party companies which have the expertise to make something of that data.


>> So you are certainly aware that there are avenues to creating the data set. Given that, it is quite reasonable to say that search is unnecessary.

How is it unnecessary? They used none of those methods, so they had to use search. That is search being necessary, not the opposite.


>I think you're over-estimating the amount of data involved.

The one billion figure was from the article.


> Why are stats like that not shared? That seems like very valuable information.

Because it's only 163, and data of an unknown quality.


> It’s astounding the correlation

Can you post your research? How many millions of accounts did you analyse? What tools did you use?


> No, all the choices have already been made. The data have already been destroyed

You don't know that. You'd need an investigation to conclude that. Using it as an excuse not to investigate seems like assuming the conclusion.

Even if some data have been destroyed, it doesn't follow that every last piece of data everywhere has. Who knows what might turn out to be significant? There may well be relevant data in many countries, too, since the research was international.


> I'd even say that even collecting this data is wrong

Without this data they can't tell what's worth making and what's not worth making.


> presumably

that's the whole point. everyone seems to presume, nobody seems to bring data. a few special cases here, a few special cases there.


> The giant piles of dirty data are that way because for thirty years no one has considered them worth cleaning up. How will they create such astounding amounts of unexpected value?

It was left that way because we didn't have the tools to process it. Imagine the amount of unprocessed video data that we can now annotate pretty accurately. What's the value of that data now?


>As a data source for academic study it's unparalleled.

Does that actually mean it's valuable? Just because something's one of a kind doesn't mean it's worth that much.


> , few people have the time or want to make the effort to comb through and analyze original sources.

I mean, unless it's your profession, you're not. At best, you're reading an article (with summarized data that you hope was aggregated correctly) in a journal. To the best of my knowledge, the raw datasets that those are based on are rarely shared.


> I have a pet theory that a large portion of the decay in our society can be traced back to the layers of abstraction between stakeholders and the things (and people) they have power over.

My pet theory is a specific instance of yours: MBAs wielding spreadsheets (and, more generally, analysts wielding databases) is the abstraction layer.
