Precisely the point. If you don't count searches for popular libraries and frameworks into their respective language buckets, the results are mostly bullshit, and at best probably reflect the undergrad curriculum.
I agree completely. I never said that it should be used for evaluation or that you shouldn't read the code or docs.
But then the question is... how do you find high-quality libraries without something like download count? The issue is signaling good libraries when discovery is poor. I subscribe to many "libraries worth looking at" mailing lists, the awesome lists, etc., but that is still a pretty coarse net. Searching for libraries on GitHub when you don't already know the name, or don't really know what you're specifically looking for, is an even worse experience.
Combine that with the relatively poor fuzzy search on Hex, where even if you DO know what you're looking for, the experience is very subpar. For example, "e-commerce" leads to different results than "commerce". [0][1] Both of those exclude [2], which is an ecommerce library available on Hex without that keyword.
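To make that concrete, here's a rough sketch of diffing those two queries against the hex.pm HTTP API. The `search` parameter and the response shape are my assumptions from memory, so check the Hex API docs before relying on it:

    # Rough sketch: diff Hex search results for two related keywords.
    # Assumes hex.pm exposes GET /api/packages?search=<term> returning a
    # JSON list of packages with a "name" field -- check the Hex API docs.
    import requests

    def hex_search(term):
        resp = requests.get(
            "https://hex.pm/api/packages",
            params={"search": term},
            headers={"Accept": "application/json"},
            timeout=10,
        )
        resp.raise_for_status()
        return {pkg["name"] for pkg in resp.json()}

    ecommerce = hex_search("e-commerce")
    commerce = hex_search("commerce")
    print("only under 'e-commerce':", sorted(ecommerce - commerce))
    print("only under 'commerce':  ", sorted(commerce - ecommerce))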
This is awful. Code Search is an indispensable tool for finding reference code and real-world uses of various libraries. I don't know of anything on par with it.
I use global search from time to time to see how other projects use certain libraries. When the documentation of said libraries is sparse this can sometimes be a good timesaver.
Thank you for your advice. I put this list together, and there are actually some libraries on it I haven't used. I've considered using the GitHub API for searching, but GitHub has a huge user base and an enormous number of repositories, and I haven't tried that yet.
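For what it's worth, a minimal sketch of what searching via the GitHub API could look like. The query string here is just an example, and unauthenticated requests to the search endpoint are heavily rate-limited:

    # Rough sketch: search GitHub repositories for a topic, ranked by stars.
    # Uses the public REST endpoint GET /search/repositories; pass a token
    # via the Authorization header for anything beyond casual use.
    import requests

    def search_repos(query, limit=10):
        resp = requests.get(
            "https://api.github.com/search/repositories",
            params={"q": query, "sort": "stars", "order": "desc", "per_page": limit},
            headers={"Accept": "application/vnd.github+json"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["items"]

    # Example query; adjust the qualifiers to taste.
    for repo in search_repos("redis client language:go"):
        print(f'{repo["stargazers_count"]:>7}  {repo["full_name"]}')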
Yeah, the author seems to heavily underestimate how much his own needs differ from general needs. But given what Google et al. know about me, the results could indeed be more precise. I have developed a habit of appending "GitHub" to the search query when I am actually looking for source code rather than a page that just downloads a video for me.
I typically search for "awesome $languagename" to get a list of curated libraries that people use and are stable. It's a decent shortcut to find what's good.
Personally I don't understand why there are so few search libraries/systems to choose from, given that "search" is one of the fundamental pillars of CS.
I haven't had much luck with any of these. Phind is indeed better for programming, but it still has the same trouble with the obscure stuff.
It seems that the main problem right now is that they only look at the first few results. Basically they all do this:
1. Ask model for a search query.
2. Run search and feed the results into the model.
3. Ask model for an answer.
The problem is that for very specific questions the search results won't contain the required information, so it has to say something irrelevant or hallucinate. Especially for negative queries like "a foo library that doesn't depend on std" or similar, it really struggles, because it can't effectively filter out libraries that fail the requirements and can't keep searching until it finds one.
Basically they can be fast when a decent query puts the answer on the first page of results, but they fail otherwise. In many cases you are better off just reading through the search results on your own.
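Roughly, that pattern sketched out in code, with the model and search calls stubbed out as hypothetical placeholders (not any real API):

    # Sketch of the single-pass pattern described above. llm() and
    # web_search() are hypothetical placeholders; the point is that
    # nothing loops back if the first page of results doesn't contain
    # the answer.
    def llm(prompt: str) -> str:
        """Stand-in for a call to some language model."""
        raise NotImplementedError

    def web_search(query: str, n: int = 5) -> list[str]:
        """Stand-in for a call to some search API; returns result snippets."""
        raise NotImplementedError

    def answer(question: str) -> str:
        # 1. Ask the model for a search query.
        query = llm(f"Write a web search query for: {question}")
        # 2. Run the search and feed the first few results into the model.
        snippets = "\n".join(web_search(query, n=5))
        # 3. Ask the model for an answer based only on those results.
        # If the required information isn't in those snippets (e.g. a
        # negative query like "a foo library that doesn't depend on std"),
        # there is no step that refines the query or keeps searching.
        return llm(f"Using these results:\n{snippets}\n\nAnswer: {question}")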
iusethis.com is probably a better resource for finding software that already exists. If the goal is to find ideas for software that doesn't exist, my request would be something that makes sense out of Twitter searches. So much garbage, duplication, and spam. It seems like there should be a better method, even if it were simply a vote up/down style ranking system within the search results.
It's a good start but it needs many more features to be considered usable. For example, I just did a search for "redis" and about 30 libs came up. Half of them don't even have a description.
How can I know which one to pick at a glance? Asking me to evaluate and profile 30 different libs for just a single task is unreasonable.
Also, this page isn't even on Google's first page when you search for something relevant. I didn't even know it existed until you mentioned it.
That's what I used to do, but I realized that GitHub search is terrible for some reason. They don't display all results, so I would think I'd found the most popular library for X, only to find out later that it's not, and there's a completely different one that didn't show up in GitHub search. That's why I don't trust GitHub search anymore and use the site:github.com approach.
All the time. Finding obscure uses of libraries/APIs, or just how certain code snippets are used, or even just repos about a general topic or implementation of a thing. It's very sensitive and oftentimes doesn't work well, but I'd prefer to spend 30 seconds trying a couple of different variations of my search than get no results at all. Google's open source search is probably better, but until they index GitHub it's practically useless for me. GitHub search may be a bad search tool, but by nature of having so much code it's still useful to me.
For the last few weeks I have been working on a project in Go that required using multiple libraries I had no previous experience with. Usually, when in doubt, I would Google or go to Stack Overflow or GitHub issues.
But Google results today are mostly spam or copycat sites that endlessly echo old answers. And Stack Overflow answers are too often outdated for a fast-evolving language, leading you down false paths.
Surprisingly, I started just going to the libraries' source and reading the code. It provided answers faster and with less noise.
I've tried it out. The limited number of crawled sites becomes quite obvious when searching for anything obscure or one step outside of programming.
Even 'javascript reverse string', for which I expected some docs or Stack Overflow pages, seems to give me an HN thread, someone's GitHub repo, and a not very related SO thread.
Are MDN, MSDN, and more developer documentation on the roadmap?
It's definitely an interesting technique. Do you have anything in place to detect the garbage, substanceless articles that have started popping up on Google?
I've seen the occasional one using GitHub repositories or pages. Looking at the current list, you're broadly reliant on moderators and communities, and, as the search engine, you moderate which sites are indexed.
Sure, I did. I've implemented a proper search for a very big company's documents and knowledge base, similar to Gmane, which also used Xapian. Much better than what I see there, or with the old Google Code Search.
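As a rough illustration of the kind of thing I mean, a minimal Xapian index-and-query sketch with the Python bindings; the documents and paths are just placeholders:

    # Minimal Xapian index-and-search sketch using the Python bindings
    # (python3-xapian). Documents and the index path are placeholders.
    import xapian

    # Index a couple of documents.
    db = xapian.WritableDatabase("./kb-index", xapian.DB_CREATE_OR_OPEN)
    termgen = xapian.TermGenerator()
    termgen.set_stemmer(xapian.Stem("en"))

    for ident, text in [("doc1", "How to rotate API keys"),
                        ("doc2", "Onboarding guide for new hires")]:
        doc = xapian.Document()
        termgen.set_document(doc)
        termgen.index_text(text)
        doc.set_data(f"{ident}: {text}")
        db.add_document(doc)
    db.commit()

    # Query the index with stemming enabled.
    qp = xapian.QueryParser()
    qp.set_stemmer(xapian.Stem("en"))
    qp.set_stemming_strategy(qp.STEM_SOME)
    enquire = xapian.Enquire(db)
    enquire.set_query(qp.parse_query("api key rotation"))
    for match in enquire.get_mset(0, 10):
        print(match.rank + 1, match.document.get_data().decode("utf-8"))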