Hacker Read

mikewarot · 2022-12-27 23:16:13

It used to be that we bought books, and they accumulated (and we sometimes even used them again) until our heirs had to get rid of them. Now we read things online, and the disks are wiped when our heirs get rid of them. We've always been consuming more than we're producing.

What would be nice is to be able to ingest content as before, but have it all stored automatically, then made searchable. I'm told that this is available[1] for Mac-based machines. I'd like it more generally available.

The point of having books, or notes, is to be able to retrieve information with context. We now have tools that can summarize and categorize things automatically. If we, as a hive mind, can add just a little bit of multidimensional voting/tagging to the things we see, along with their cryptographic checksum and consent mechanisms, we could collectively organize everything we see with almost zero individual effort.
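To make the checksum-plus-tagging idea a bit more concrete, here is a minimal sketch, not a real system. It assumes SHA-256 as the checksum, treats each tag as one "dimension" that users can vote on, and counts one vote per user per tag. The names `SharedTagIndex`, `vote`, and `top_tags` are made up for illustration, and the consent side is not modeled at all.

```python
import hashlib
from collections import Counter, defaultdict


def checksum(content: bytes) -> str:
    # SHA-256 of the content itself, so identical copies share one key.
    return hashlib.sha256(content).hexdigest()


class SharedTagIndex:
    """Hypothetical shared index: content checksum -> tag -> vote total."""

    def __init__(self):
        self._votes = defaultdict(Counter)
        self._seen = set()  # (user, checksum, tag) triples already counted

    def vote(self, user, content, tag, weight=1):
        key = checksum(content)
        if (user, key, tag) in self._seen:
            return False  # assumption: one vote per user per tag per item
        self._seen.add((user, key, tag))
        self._votes[key][tag] += weight
        return True

    def top_tags(self, content, n=3):
        # Look up by checksum, so nobody has to share the content itself.
        return self._votes.get(checksum(content), Counter()).most_common(n)


index = SharedTagIndex()
page = b"As We May Think, by Vannevar Bush"
index.vote("alice", page, "memex")
index.vote("bob", page, "memex")
index.vote("bob", page, "history")
repeat_accepted = index.vote("alice", page, "memex")  # duplicate, ignored
top = index.top_tags(page)
```

Because the index is keyed by checksum rather than by URL or title, two people who read the same bytes end up voting on the same entry without exchanging the text.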

Vannevar Bush's vision for the Memex runs afoul of copyright laws; this might be a way to route around them a bit.

[1] https://www.rewind.ai/

mikewarot | karma 10851 | avg karma 2.02 · | 2022-08-13 01:07:32

It's good to see you here, Doc.

I think that for a while, the distribution model of choice is about to go underground for a lot of topics, due to the weird politics of the day. Preference falsification is in high gear now.

Think Memex, the 1945 idea, where you stored everything you ever read locally, and could spool off a dump of all of it (with context, annotation trails, etc.) with no need for the internet to grant you that access again.

This gives the finger to the ideas of copyright, intellectual property, and all manner of gatekeeping. It might be the only way for free thinking people to have free discussions for the next decade or so.

jean_claude | karma 56 | avg karma 1.47 · | 2016-06-27 01:03:41

While there may be some power-mad librarians out there in terms of 'there will be silence in the library!', they curate the library's collection for many reasons: to meet the requirements of stakeholders, to meet the needs and shifting interests of the community, to make room for new material...

Really, there are a whole host of reasons for deaccession of library materials that include financial, space, and other concerns that hit the bottom line.

The great thing about technology is we have the ability to collect, classify, add metadata[1], and make all human knowledge accessible to everyone. If only insanely long copyrights and restrictions on scanning and lending copyrighted works were not in the way.

[1] This is not really something a busy librarian has time for, so other methods, such as user-driven tagging and folksonomies[2] are an interesting solution that is being tried.

[2] http://interactivearchivist.archivists.org/technologies/tagg...

thread_id | karma 1651 | avg karma 5.63 · | 2020-05-24 00:14:05+00:00

I would really benefit from this product. I'm working across three computers and I have bookmarks all over the place. A central repository that enhances searching and categorization would be awesome.

sheraz | karma 1309 | avg karma 1.84 · | 2017-01-02 19:34:40

Good story. I completely agree that all of this auto-curation will bury the eclectic collections and poison discovery.

That is one big reason I started my side project, curabase.com[1], to simply enable anyone to curate their own list of bookmarks. (Not a new idea, but it is MY idea based on this singular thought -- that human curation will always be better, and NO, I don't have data to back it up :-)

mat_jack1 | karma 96 | avg karma 3.2 · | 2022-05-25 02:21:09

I was thinking of a 3D representation of a library or a zoomable interface. It could leverage our good spatial memory while also allowing accidental discovery of books.

benatkin | karma 7495 | avg karma 1.83 · | 2016-08-15 20:59:07+00:00

Only if it gets big enough that reading through all of them is too much work. They could curate it instead.

thechao | karma 4578 | avg karma 3.54 · | 2020-06-29 18:03:01+00:00

When I first heard of Amazon I thought it’d be like this: a curated list presented as spines-on-shelves. It feels like spines-on-shelves organized-by-Dewey lets me use contextual knowledge to rapidly browse books in a way that the much higher-information-content “list of books” in, say, iOS books, or Kindle does not.

I guess I want curation and taxonomy, but book information should be minimal until I “pull it from the shelf”. I’d support (Patreon) a group that did high-quality curation.

specialist | karma 10439 | avg karma 1.48 · | 2021-08-29 09:05:50

Remember when companies tried maintaining their own libraries?

Mid 1990s, during the infatuation with "learning organizations", we really struggled with onboarding and knowledge sharing. Surely we can do better, right?

So I got my archivist buddy hired. Extract domain knowledge from teams and individuals. Collect, aggregate, curate, and then reshare. Maintain our "library". Populate it with all the manuals, installation disks, training materials, textbooks, etc.

We'll never reinvent the wheel again. Woot!

Flew like a lead zeppelin.

--

Older me understands:

1) Forgetting is crucial to learning, moving forward, adapting.

2) Oftentimes starting over is cheaper than finding prior answers.

I often wonder about my prior enthusiasm for Remember All The Things. Probably some mix of technophilia and existential dread (fear of being forgotten).

Old me rejects Chesterton's fence. https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_...

Any decision or rule without an attached name (advocate) is fair game for culling. If it was truly important, someone would care. Opposing change on principle is just being reactionary. Which isn't very helpful right now.

Yay for people who do work to remember. I'm in awe of modern historians like Jill Lepore. She's like a hacker or a genius, in that I can't even imagine how she comes up with her original content.

I'm not post-modern. We can learn plenty from the past. Alas, most first-person storytellers are unreliable narrators. And will probably record and archive all the wrong stuff.

Writing this out... I guess that's the difference between archivists and historians. There's no way for archivists to know what details may be important later.

pavel_lishin | karma 46523 | avg karma 3.63 · | 2018-06-21 13:51:57

I think the modern equivalent is the wikidive.

I too lament the lack of a huge library at home, but I'm not worried about other people losing this. The kind of people who consume nothing but funny cat videos aren't typically the kind of people who would stock their home with reference materials anyway, right?

anotheraccount9 | karma 375 | avg karma 2.57 · | 2023-01-06 07:29:04

I hoard data. I've been using the Johnny Decimal system with the Dewey Decimal classes for the most part (100'000+ books and documents selectively saved on all subjects). I also have more dynamic sections with articles, media, and projects. These don't include libgen or similar collections. I use Recoll to index and search (https://www.lesbonscomptes.com/recoll/pages/index-recoll.htm...).

A few things I've noticed over the years:

1. I obviously don't read all of that stuff, but it's very satisfying to find useful information from an obscure book when looking for something.

2. Some of the documents and videos I've saved can no longer be found online. So I think it's a win to have them locally, as long as what I'm keeping is potentially useful.

3. To find info, nothing works perfectly. That's why I'm making an effort to use descriptive and structured folders, good filenames (often followed by the original filename), and sometimes an additional text file as meta for context. Still, my bookmarks are a mess.
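The habits in point 3 could be sketched roughly as below. This is only an illustration under stated assumptions: the folder names, the `[original]` filename pattern, the `.meta.txt` suffix, and the helper names `descriptive_name` and `write_sidecar` are all hypothetical, not the layout actually used.

```python
from pathlib import Path
import tempfile


def descriptive_name(title: str, original_name: str) -> str:
    # Descriptive title first, original filename kept in brackets (assumed pattern).
    original = Path(original_name)
    return f"{title} [{original.stem}]{original.suffix}"


def write_sidecar(doc: Path, source: str, note: str) -> Path:
    # Plain-text "meta" file stored next to the document for context.
    sidecar = doc.with_name(doc.name + ".meta.txt")
    sidecar.write_text(f"source: {source}\nnote: {note}\n", encoding="utf-8")
    return sidecar


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "600 Technology" / "620 Engineering"
    folder.mkdir(parents=True)
    doc = folder / descriptive_name("Bridge Design Handbook", "scan_0042.pdf")
    doc.write_bytes(b"placeholder")
    sidecar = write_sidecar(
        doc, "https://example.com/scan_0042.pdf", "Saved for the truss tables"
    )
    sidecar_name = sidecar.name
    sidecar_text = sidecar.read_text(encoding="utf-8")
```

Keeping the context in a plain text file means a full-text indexer can pick it up along with the document itself.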

FalconSensei | karma 1447 | avg karma 1.68 · | 2020-12-14 02:17:04+00:00

Also, user reviews, lists and tagging/bookshelves. Just massive data.

MicahWedemeyer | karma 3961 | avg karma 4.75 · | 2008-08-26 15:13:17

Why focus solely on the text Web? The article mentions archaeological value of the data, and how trash dumps are more important than great books. On the Internet, I think that translates to archiving things like World of Warcraft and Punch the Monkey. It's not highbrow, but it gives a good idea of how we're spending our time.

truffle_pig | karma 348 | avg karma 13.92 · | 2017-10-31 11:58:28

Hey yep I'm the developer. What's your use case for this feature? Is it just overwhelming with too many books?

hannasanarion | karma 4380 | avg karma 2.98 · | 2018-10-18 20:05:56

I am interested.

Is there a system for uploading things into collections? Plans for including OCR? I'm looking for a way to manage my library of reference ebooks.

Quequau | karma 1997 | avg karma 2.84 · | 2019-03-23 12:35:22

Back when I was getting my degree and was reading & writing a lot I got really enamored with the idea of having an extensive repository of everything I had read or learned or simply come across and found interesting.

There are all sorts of software packages designed to facilitate this; I used DevonThink for a fairly long time. However, it never turned out to be nearly as useful or fulfilling as I had imagined going in and truth be told I just couldn't maintain the self-discipline that the curation requires.

So now I have this ginormous unused data store that I don't want to mess with but nevertheless still can't bring myself to delete.

ilaksh | karma 9227 | avg karma 1.28 · | 2023-10-19 18:40:06

It would be interesting to build something like this but with real books from Z-Library or something. Not saying that's necessarily ethical or not.

But it would be interesting to see the physical scale of large archives.

jamieadams | karma 5 | avg karma 1.25 · | 2019-05-27 19:45:39+00:00

I use DevonThink. It's an incredible piece of software.

I have a database for 'research' where items are stored according to the Dewey Decimal categories (no particular reason for choosing this other than it's well organised and well documented).

I then have a separate 'memex' database which contains my journal and other miscellanies.

DevonThink has a rudimentary AI system which attempts to link documents by context. I find it quite useful.

iammjm | karma 635 | avg karma 3.32 · | 2023-10-09 03:10:57

I think it's nice to see my own library grow. It will also be interesting to revisit it one day. It's like a time capsule. Plus it's thousands of pages of input to create an ai-version of myself one day ;)

darkpuma | karma 3018 | avg karma 1.66 · | 2019-03-29 23:20:29+00:00

> "I wouldn't store a book about scuba diving with or relate it to my vacation pictures of me scuba diving in any way in real life"

You don't put every single physical item you have relating to scuba into one big bucket because in the physical world, putting something into one bucket precludes it from being placed into another bucket. When you are organizing things you have to anticipate your most likely queries in advance. So maybe you have one shelf for science fiction, and another for marine wildlife, and another for all your programming language books. For a while, this works well enough. One book can't go on two shelves, but with few enough books that's not a deal breaker.

However, when your personal library grows large enough, this system rapidly becomes impractical. At this point, the librarian resorts to creating a card catalogue, categorizing books by author, subject, etc. Books which have numerous topics have numerous cards in the card catalogue.

Since this is the 21st century, card catalogues are now digitized. The digitization of card catalogues saves space, but more importantly it facilitates more powerful queries. You can grab two different stacks of cards, relating to two separate topics, and instantly find the intersection between them. That is, the list of books which appear in both stacks. This greatly reduces the number of cards you must look through to find a book that you're quite certain is in both of the stacks. For instance, each stack might have a thousand cards, but the intersection of those two stacks might only be 50 cards. That's 50 cards you have to search through to find your book, instead of a thousand.

A file tagging system is a digital card catalogue for your files. And as librarian of your library, you can decide which subjects warrant their own tags and which don't. What you don't have to do is decide ahead of time what sort of queries you want to facilitate. And it's no longer important to keep your collection of books and photographs separated. If you want a book, you intersect whatever you're looking for with the card stack of all books. If you want a photograph, you intersect it with the card stack of all photographs. If you have a book of photographs, it could even be in both, should you want it to be.
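The card-stack intersection described above can be sketched in a few lines. This is a toy illustration, not any particular tagging tool: the `TagCatalogue` class, its methods, and the sample items and tags are all invented for the example.

```python
from collections import defaultdict


class TagCatalogue:
    """Toy digital card catalogue: each tag maps to the set of items carrying it."""

    def __init__(self):
        self._cards = defaultdict(set)  # tag -> set of item names

    def add(self, item, *tags):
        # One item can sit in as many "card stacks" as it has tags.
        for tag in tags:
            self._cards[tag].add(item)

    def find(self, *tags):
        # Intersect the stacks for every requested tag.
        if not tags:
            return set()
        stacks = [self._cards.get(tag, set()) for tag in tags]
        return set.intersection(*stacks)


catalogue = TagCatalogue()
catalogue.add("Reef Fish Field Guide", "book", "scuba", "marine wildlife")
catalogue.add("red_sea_dive_2019.jpg", "photograph", "scuba", "vacation")
catalogue.add("Underwater Photography", "book", "photograph", "scuba")
catalogue.add("Dune", "book", "science fiction")

scuba_books = catalogue.find("scuba", "book")
scuba_photos = catalogue.find("scuba", "photograph")
```

Here "Underwater Photography" shows up in both results, which is the book-of-photographs case: nothing had to be decided in advance about which shelf it belongs on.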
