Your problem is that you viewed it as an internal search, when it was designed and advertised as a global search.
If you want to argue that the global search should have been an internal search, that's fine. If you imply that the search was an internal search, though, then that's misleading FUD.
> I took a look at the data. The data schema is disorganized to the point that a lot of janitorial work would be necessary to get it useable and perform any analysis or visualization.
> Why not collect all the information you possibly can, whether or not you immediately see obvious value to it, and then find ways to justify it later?
Because it's non-trivial and it costs a lot of money to do that, in terms of engineers' salaries, storage, etc.
> PS: The only way a scientist is overwhelmed with information is when they need to do something by hand. It's hard to guess how a scientist would like to collect less information. Worst case you ignore it because you can't process it yet.
I am still trying to parse that statement. Twelve years ago we needed a 250-node cluster to get anything done with all the data that was presented to us, and that's a fraction of what's being generated today.
> We tried A x B in this way, that way, some other way, none of them worked
How searchable is this data? Like, do I need to be an expert who is up-to-date on most proceedings in the subfield to know this, or is this information easy to pull up with a few searches?
> Just because you conceptualize it in your mental model does not mean you need a graph database.
Yes! When I was younger I worked on a problem once that needed to compute some very basic graph metrics. My seniors tried to do the work in an early graph database and it was a disaster. It turns out literally just reading in the lines from a file and counting things got the job done in a few seconds.
They refused to use the results until they were coming out of the graph database, "just in case we needed other metrics". We never needed the other metrics.
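For anyone wondering what "reading in the lines from a file and counting things" can look like, here is a rough sketch, not the code we actually ran. It assumes a whitespace-separated edge list with one edge per line, and the name `degree_counts` is made up for illustration.

```python
from collections import Counter
import io

def degree_counts(lines):
    """Count how many edges touch each node in a whitespace-separated edge list."""
    degrees = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        src, dst = parts[0], parts[1]
        degrees[src] += 1
        degrees[dst] += 1
    return degrees

# Hypothetical sample input; a real run would pass an open file handle instead.
sample = io.StringIO("a b\nb c\na c\nc d\n")
degrees = degree_counts(sample)
print(len(degrees), "nodes,", sum(degrees.values()) // 2, "edges")
print(degrees.most_common(1))  # highest-degree node
```

A single pass with a `Counter` like this covers node counts, edge counts, and degree distributions, which is about as far as "very basic graph metrics" usually goes.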
That would be data mining done wrong. It's perfectly fine to look at data to provoke new hypotheses. But you should not use the same data to confirm the hypothesis that it provoked. Either use fresh data or, if you are reusing the data, make sure the analysis accounts for that so the result is still valid.
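One simple way to get "fresh data" without collecting more is to set part of it aside before you start exploring. A minimal sketch, assuming the records are independent and can be shuffled; `split_explore_confirm` and its parameters are names invented for this example.

```python
import random

def split_explore_confirm(records, explore_fraction=0.5, seed=0):
    """Shuffle the records and split them into an exploration set and a confirmation set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * explore_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder records; in practice these would be your observations.
records = list(range(100))
explore, confirm = split_explore_confirm(records)
# Look for patterns in `explore` only, then test the resulting hypothesis on `confirm`.
```

The point is that the confirmation set is never looked at while the hypothesis is being formed, so a test on it is not biased by the exploration.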
How is it unnecessary? They used none of those methods, so they had to use search. That is search being necessary, not the opposite.