I'd like to open source at least a portion of the data to see if it is possible to build a community-driven database. It could connect works back to museums and auction records, as well as articles and research, and potentially build a crowdsourced provenance.
Hi, I am working on something similar and was looking for ways to host my open data. The approach seems interesting. Can I reach out to you somewhere to discuss this further?
Maybe open source crowdsourced data? For example, if various people need thousands of examples of people's handwriting, someone makes an open-source repository and hopefully lots of people upload for the good of everyone.
I love the idea of these crowd-sourced systems, but is there any appetite to open source the data behind them?
In theory, the collective knowledge of all of these platforms could converge on a single data source that's freely available to analyse in any way we see fit, but more often the knowledge stays behind the platform and ends up gated behind a Pro or commercial edition, and the crowd is left to rebuild it again on some other platform...
There are people on Reddit working on that. If you are interested, it would be good to coordinate efforts. /r/archiveteam and /r/datahoarder are the usual hangouts for that sort of project.
I wish the work you did could be open sourced, including the data set. Work like that is lost to humanity because it's done for only a few entities, and once they are replaced, the initial work disappears with them.
I can't wait for a community-driven, open-source data model that we can take offline and plug into any software adapter. With a little version control and some voluntary data samples, a large enough community could get it going.
Do you have any thoughts about open sourcing it at some point, or charging money? Either way, I'd feel more confident putting my data in there. Otherwise, it looks awesome.
Nice job. Had a similar idea after seeing the same Tweet, but based around importing the downloaded data archive, as I’m working with that myself now. Open source, so I might see if I can weave it in when I get a chance, if nobody else does it first.
We would love to do that at some point. There is tons of open data out there, but not a lot of it has useful descriptions at the field level (only at the dataset level), so it would take some time to put together a robust collection. Also, the demo on our splash page only shows the artifact we create (i.e. the published dictionary). The other side of the web app is a management layer to bundle datasets into collections, annotate fields, configure sample sets, and share the artifacts. We'll work on fleshing out our demo to show what the system looks like when there are hundreds or thousands of datasets.
I'm a bit limited by how little time I can put into this, which is the main reason I don't currently have any plans to open source it. I couldn't responsibly run that project, and just dumping it on GitHub wouldn't be useful to anyone. Besides, most of the value comes from the database, which sadly would be fairly harmful if it came to circulate among search engine marketers.
Some sort of collaboration might be doable, but again, I don't have a lot of bandwidth to actually manage it. Right now I'm doing some API-level collaboration with Teclis.
I am sort of beginning to see hardware limitations though, so if you have ideas for how to get more I'd be interested to hear what you propose.