
That’s how you make a worse search engine than Google. If you are serious about competing in that space I think you need to do something fundamentally different than Google. Treating pages as a bag of words leads to a shitty search engine. Like I said, I’ve built a few search engines, and I have tried this.

Edit: https://en.wikipedia.org/wiki/Bag-of-words_model




When I saw the Reddit example, I thought of Quora. Quora will never be a search engine because all users need to do to stand out is add something, and what is usually added is garbage.

Yeah semantic search, if solved, would address this problem.

That's really what I was getting at. Stripped right down, Google is still just viewing documents as a bag of words[1]. I mean, they have PageRank and they will apply more weight to words in headings, and they have 6-gram indexes and synonyms and all that clever stuff, but at its core it's still lexically centered, not semantically centered.

[1] Further reading: http://en.wikipedia.org/wiki/Bag_of_words_model
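
To make the bag-of-words point concrete, here is a minimal Python sketch. It is only an illustration of the general model, not how Google or any real engine is implemented; the function names `bag_of_words` and `overlap_score` are made up for this example. It shows that once a document is reduced to word counts, word order (and most of the meaning that rides on it) is gone:

```python
from collections import Counter
import re

def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split on non-alphanumerics, and count tokens.
    Word order is thrown away; only the counts survive."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query: str, document: str) -> int:
    """Hypothetical lexical score: number of query tokens also found
    in the document, counted with multiplicity."""
    q, d = bag_of_words(query), bag_of_words(document)
    return sum((q & d).values())

doc_a = "Dog bites man"
doc_b = "Man bites dog"

# Opposite meanings, identical bags.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# Both documents score the same for this query.
score_a = overlap_score("man bites dog", doc_a)
score_b = overlap_score("man bites dog", doc_b)
```

A purely lexical scorer like this cannot tell the two documents apart, which is the gap a semantically centered approach would have to close.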


Excellent! I'm tired of search engines that optimize for natural language queries because the inevitable trade-off is that they become useless at keyword/exact queries.

I mean, “wiki <search>” works on Google too. But I don’t want to type anything extra. My problem here is that, if we assume the search engine should be good at predicting what I actually want to see for a given term, Google is failing at that.

I think this is the search system Wikipedia has been needing to sort its data. This is not a tool for discovering knowledge like Google is.

The problem with that is that your search ranking for any given key word is probably different in each search engine.

A cool idea, but what I don't understand is why so many alternative search engines are so poorly designed. I may be too used to Google, but I think the text should be a bit more readable. Still, I really like this type of search engine: I've always wanted search capability among my bookmarks and related websites. Perhaps even a bit of AI fine-tuning around them.

Problem is that this was tried before, with SearchWiki. When it launched, it was widely derided as being useless and distracting. Its usage numbers didn't show widespread adoption. And then when it was removed, there was much rejoicing.

Now, it's possible that SearchWiki just needed a few more iterations, and with a few details changed, could be a big success. There have been a few other recent launches that were tried years ago, didn't work then, but had a few more iterations and now are big successes. I could at least raise the issue. But unless I can tell a convincing story about why people would use this when they didn't use SearchWiki, it may be an uphill battle to get resources devoted to this.


I definitely think there is room for search improvement. I believe the next area of search is contextual search (https://en.wikipedia.org/wiki/Contextual_searching). If you can connect what the user is looking for to actual website content, then I think you might be onto something. The trick is finding that link function. Traditionally Google has relied on keywords and ranking by links. There could be other ways to find that user/content relationship.

https://en.wikipedia.org/wiki/Organic_search

So what you say might answer the first question, but not the follow-up one. Well, your "e.g. on GOOGLE:" answers it, but the fact that you use Google for that example already shows that there really isn't any way around that. He wasn't asking how to improve ranking. He was asking for alternatives to Google search traffic. (Unless I got that wrong.)


Good luck turning that into an objective standard for a search engine.

BS. This can be solved through UX design, among other ways. The problem is also not equally bad across search engines. Other things annoy me more, like Google and its problems with n-grams.

Anyone can make a search engine fast. It's much harder to make it good.

Well, that's the point I'm trying to make. It's not a hit-the-drawing-board problem, it's a refine-the-algorithm problem. How many "if you google this string, you can't find the right site" problems have there been in the life of Google search? They continue to refine page ranking, don't they?

The problem with this idea is that it’s way too much effort for the average individual to curate their search this way. And if the results are bad, I don’t know whether it’s my search query, the way I curated things, or your search algorithm.

I’m not sure how good an idea a Wikipedia search is. It certainly cost Powerset too much to get there.

Also, they should not call themselves a search engine, out of respect for all those working hard every day to bring a real challenge to Google and improve search engines overall.


Meta search engines leave a bad taste in everyone's mouth because they've always failed. Here is why:

https://en.wikipedia.org/wiki/Arrow%27s_impossibility_theore...

You can't combine a few different ranked lists and expect to get results better than any of the original ranked lists.
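
For anyone unfamiliar with what "combining ranked lists" looks like in practice, here is a rough Python sketch of a Borda count, one of the classic voting rules of the kind Arrow's theorem is about. This is just one possible merge rule chosen for illustration, not what any particular meta search engine does; `borda_merge` and the `*.example` result names are hypothetical. It shows the mechanics only and says nothing about whether the merged list is actually better than its inputs:

```python
from collections import defaultdict

def borda_merge(rankings):
    """Merge ranked lists with a Borda count: an item at position i
    (0-based) in a list of length n earns n - i points. Items missing
    from a list earn nothing from it."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical top-3 results from three engines for the same query.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "a.example", "d.example"]
engine_c = ["b.example", "c.example", "a.example"]

merged = borda_merge([engine_a, engine_b, engine_c])
```

Note that the merged order depends entirely on the scoring rule picked, which is part of why it is hard to argue that any one way of merging is the "right" one.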


It's a language model, not a search engine. It doesn't work well as one unless integrated into an actual search engine, like Bing does. Without such integration, it's much closer to human memory than to a search engine: it will recall stuff it has seen many times pretty well and completely fail at stuff it just glanced over once, filling any gaps with made-up stuff like a kid on an exam hoping to get at least a few points with their wild guesses.

I mean, Google search obviously does it using PageRank: if someone links to the page using that word, it uses that. Me, I was searching for arcane words that I remembered hearing in the middle of a podcast. I doubt anyone actually linked to, or searched for, those words in connection with that podcast. Also, the solution you're describing needs far more intuitive and intricate development than just indexing captions lol.
