> what is the purpose of offering your anecdotal experience if you acknowledge that it is an n=1 datapoint?
What is the purpose of using large-n datapoints anymore when the primary motivation of "research" is becoming less about finding meaningful information and more about getting published (in for-profit, paywalled "journals") so you can get more grant money for your institution?
Haven't you seen the articles in the last year or so about how little scientific research is even reproducible anymore? You can just repeat your experiment until you get the results you want to see.
> I still think that science should be automated in some way, shape or form.
Meaning what?
> I'm imagining something like you have a dataset and you have to upload that dataset to some third party that checks it for its validity.
The results are what they are. What is 'validity' meant to mean?
> Of course, this is a completely silly idea but I'd love to know if someone has like any tangential related thoughts on this
Quantitative studies are already published with the proper analyses, which are invariably produced 'automatically' using software, not manual methods.
I imagine there might be some value in publishing raw data, though. There may sometimes be questions like privacy, but I don't imagine they'll always be show-stoppers.
>From my understanding, you can't draw any real conclusions from the data unless you predicted that it would look that way beforehand.
Oh my sweet summer child. A lot of lab work and data collection is expensive, and in the game of research you spend a lot of time gaming the system and meeting expectations rather than doing actual fundamental research. So much work tries to take the hard, low-return part, running the tests and collecting the data, and re-leverage it with increasingly complex statistical approaches.
I've worked in enough research environments to see that, more often than not, there's a selection toward research that can be pursued and falsified with existing data rather than the other way around. Here's this set of data and how it was collected; what arbitrary, novel new thing can we say about it? It may not be interesting, but it may be statistically or theoretically valid. The result is that you get a paper/publication out of it without doing the footwork.
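The mechanism behind "test until something comes out significant" is easy to demonstrate. Below is a minimal sketch (my own illustration, not anything from the thread): run standard t-tests on pure noise, where no real effect exists, and count how often a result still clears the conventional p < 0.05 bar by chance alone.

```python
# Simulate "significance mining": many hypothesis tests on pure noise.
import math
import random

def t_stat(sample):
    """One-sample t statistic against a true mean of 0."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean / math.sqrt(var / n)

random.seed(0)
trials = 1000
hits = 0
for _ in range(trials):
    # 30 draws of pure noise: there is no real effect to find.
    sample = [random.gauss(0.0, 1.0) for _ in range(30)]
    # |t| > 2.045 corresponds to two-sided p < 0.05 at df = 29.
    if abs(t_stat(sample)) > 2.045:
        hits += 1

print(f"{hits} of {trials} noise-only 'experiments' look significant")
```

Roughly 5% of noise-only tests clear the bar, so if you probe one expensive dataset with twenty arbitrary hypotheses, you can expect a publishable-looking "finding" for free. That's why pre-registration (predicting the result beforehand, as the quote above puts it) matters.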
This is part of the reason researchers often hold their data tightly. You'd think scientists would want to share data, but it's a highly competitive environment, and if you took the risk of investing time and money in some costly data collection process, you want to do everything you can to say everything you can about it before someone else does so without any of the underlying cost. Sure, you may get a reference or footnote for your data, but that's not going to help much in the big scheme of things, not as much as a fresh publication. Also, if you're only being referenced for the data collection portion of your work... it doesn't say a lot about the work you did around that data collection.
> I disagree, the data exists, this is fundamental, we are not going towards data impoverishment, we are necessarily living in an environment rich with data. I rather have accurate data by far.
It may be an inexorable trend, but for the time being, individuals and organizations generally have a choice about the amount of data they collect.
> I took a look at the data. The data schema is disorganized to the point that a lot of janitorial work would be necessary to get it useable and perform any analysis or visualization.
Analytical data on which problems you're solving and how you're doing vs how you think you're doing? And who might be interested in purchasing that data...
> No, all the choices have already been made. The data have already been destroyed
You don't know that. You'd need an investigation to conclude that. Using it as an excuse not to investigate seems like assuming the conclusion.
Even if some data have been destroyed, it doesn't follow that every last piece of data everywhere has. Who knows what might turn out to be significant? There may well be relevant data in many countries, too, since the research was international.
> OTOH: you're presumably holding the non-aggregated data for aggregation purposes in the first place. IANAL but I think that needs consent.
I believe this is actually fine if they can show they don't hold this data longer than necessary and have a process for destroying it in a timely fashion.
> The fact is, most people aren't qualified to interpret the data.
So what if most people aren't qualified? Data is data, and it can be used, even by crackpots who want to use it. If those crackpots publish something wrong, I'm sure they'd be called out and either ignored or shunned by the publisher(s) anyway.
> Am I the only one who finds it concerning that they capture like everything and persisting it just for the sake of having it and finding a use for it later?
I am pretty sure this has been a standard practice for at least a decade now. Isn't that what the "big data" meme is about? Store everything, because you can always get more computational power and statistical techniques to extract value from it later on.
> However, researchers (not me) are already looking for ways to extract data from the libraries. I think it's only a matter of time before this becomes a much bigger problem.
What's the use of the data, and what problems do you see?
Observation in support of research, I'd guess. It no longer seems to be typical that you collect whatever data you need yourself.