
Thanks for clarifying that. And that's great that those DB downloads are available. I didn't like the idea of scraping the data in the first place so never went that route.



I'm guessing they released their data in a different format for download? Otherwise you'd have to scrape it, which is not as easy as having it in a database.

Can I download this database?

How can I download such a database?

The author just updated the post adding a link to download the raw data.

CSV: http://files.t3hz0r.com/projects/software/sandbox/flappy/fla... MongoDB: http://files.t3hz0r.com/projects/software/sandbox/flappy/fla...


There is a torrent that lets you download the whole database, though its structure and what the fields represent are not easy to figure out.

What is a good source for this data? Their terms prohibit scraping, but I'm curious whether there is an authentic "open" source for the raw data.

Thanks! Do you know if they are using data from their own servers or if they are just pulling it from the web on the fly?

StackExchange data dumps are (at least, were[1]) freely downloadable: https://meta.stackexchange.com/questions/2677/database-schem...

[1] Not sure what has happened since they "illegally" retroactively relicensed all the content a few months ago.


That's interesting, but it's a bit sad they don't release the data...

For people who want to explore the data directly, I published my own scrape (less complete than theirs): https://medium.com/@dam_io/16d3670fd75b


Thanks for the tip, I just started downloading the data.

Does this mean we can download the data set?

How is this data different from what you could obtain by utilizing OpenLink Virtuoso+Sponger (Freebase cartridge)+DBPedia live?

I haven't downloaded the database myself, but I imagine that if you did, it wouldn't be too hard to get that data. Looks like you can get the torrent here: https://laion.ai/blog/laion-400-open-dataset/

Interesting. Is this data available for download?

Where is this data available for download?

Thanks for the link. Following it, it appears that the db is available for download from these two FTP sites:

ftp.fu-berlin.de (Germany) ftp.funet.fi (Finland)

PS: I have not verified them yet.


I'm sorry but I didn't find this entirely useful. Are they allowing us to use the raw data sources? If so, I wasn't able to download anything, just clicked through and found more documentation.

You don't necessarily need to get your hands on the actual database. You can also obtain the data through a side channel, such as debug pages or error messages that are a little too generous with information.

Can anyone give me a quick rundown on how exactly one gains access to all of this data?

I have heard about this project numerous times, and am always dissuaded by the lack of download links/torrents/information on their homepage.

Perhaps I just don't know what I'm looking at?
