Hacker Read

jshen · 2022-08-25 13:50:59

That’s how you make a worse search engine than Google. If you are serious about competing in that space I think you need to do something fundamentally different than Google. Treating pages as a bag of words leads to a shitty search engine. Like I said, I’ve built a few search engines, and I have tried this.

Edit: https://en.wikipedia.org/wiki/Bag-of-words_model
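
To make the complaint concrete, here is a minimal sketch of a bag-of-words representation. The `bag_of_words` helper and the two sample sentences are made up for illustration and are not taken from any real search engine. It shows that two documents with opposite meanings can end up with identical representations once word order is thrown away.

```python
from collections import Counter
import re

def bag_of_words(text: str) -> Counter:
    """Lowercase the text, pull out word tokens, and count them (order is discarded)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

# Hypothetical example documents: same words, opposite meaning.
doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# Both documents map to the same multiset of words.
same_representation = bag_of_words(doc_a) == bag_of_words(doc_b)
print(same_representation)
```

A ranker that only sees these counts has no way to tell the two sentences apart.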

pictur | karma 103 | avg karma 0.18 · | 2022-02-17 00:13:17

When I saw the Reddit example, I thought of Quora. Quora will never be a search engine because all users need to do to stand out is add something, and what is usually added is garbage.

user24 | karma 4152 | avg karma 3.28 · | 2012-03-17 23:46:46+00:00

Yeah, semantic search, if solved, would address this problem.

That's really what I was getting at. Stripped right down, Google is still just viewing documents as a bag of words[1]. I mean, they have PageRank, they apply more weight to words in headings, and they have 6-gram indexes and synonyms and all that clever stuff, but at its core it's still lexically centered, not semantically centered.

[1] Further reading: http://en.wikipedia.org/wiki/Bag_of_words_model
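
As a rough illustration of what an n-gram index adds on top of a plain bag of words, here is a sketch using word bigrams. The `word_ngrams` helper and the sample sentences are hypothetical, and real indexes (such as the 6-gram ones mentioned above) are far more involved. N-grams recover some local word order, but they are still built from surface tokens rather than meaning.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of n-word windows in the text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

# Hypothetical example documents with identical word counts.
bigrams_a = word_ngrams("the dog bit the man", 2)
bigrams_b = word_ngrams("the man bit the dog", 2)

# Unlike plain word counts, the bigram sets differ...
print(bigrams_a != bigrams_b)
# ...because each sentence has windows the other lacks.
print(sorted(bigrams_a - bigrams_b), sorted(bigrams_b - bigrams_a))
```

The two sentences are now distinguishable, yet nothing here knows that "dog" and "hound" are related, which is the lexical-versus-semantic gap.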

kiwidrew | karma 1182 | avg karma 3.51 · | 2024-02-04 16:30:34

Excellent! I'm tired of search engines that optimize for natural language queries because the inevitable trade-off is that they become useless at keyword/exact queries.

lovecg | karma 1824 | avg karma 2.8 · | 2021-11-12 16:30:14

I mean “wiki <search>” works on Google too. But I don’t want to type anything extra. My problem here is if we assume the search engine should be good at predicting what I actually want to see for a given term, Google is failing at that.

quizbiz | karma 2610 | avg karma 2.36 · | 2009-05-13 20:32:47

I think this is the search system Wikipedia has been needing to sort its data. This is not a tool for discovering knowledge like Google is.

imp | karma 1274 | avg karma 2.1 · | 2009-02-04 02:08:23+00:00

The problem with that is that your search ranking for any given key word is probably different in each search engine.

gumballindie | karma 2663 | avg karma 1.25 · | 2023-05-05 11:46:04

A cool idea, but what I don't understand is why so many alternative search engines are so poorly designed. I may be too used to Google, but I think the text should be a bit more readable. But I really like this type of search engine - I always wanted search capability among my bookmarks and related websites. Perhaps even a bit of AI fine-tuning around them.

nostrademons | karma 78749 | avg karma 5.45 · | 2011-01-02 04:29:54+00:00

Problem is that this was tried before, with SearchWiki. When it launched, it was widely derided as being useless and distracting. Its usage numbers didn't show widespread adoption. And then when it was removed, there was much rejoicing.

Now, it's possible that SearchWiki just needed a few more iterations, and with a few details changed, could be a big success. There have been a few other recent launches that were tried years ago, didn't work then, but had a few more iterations and now are big successes. I could at least raise the issue. But unless I can tell a convincing story about why people would use this when they didn't use SearchWiki, it may be an uphill battle to get resources devoted to this.

larrydag | karma 1249 | avg karma 2.05 · | 2017-03-26 18:44:16+00:00

I definitely think there is room for search improvement. I believe the next area of search is contextual search (https://en.wikipedia.org/wiki/Contextual_searching). If you can connect what the user is looking for to actual website content, then I think you might be onto something. The trick is finding that link function. Traditionally Google has relied on keywords and ranking by links. There could be other ways to find that user/content relationship.
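
For readers unfamiliar with "ranking by links", here is a minimal sketch of the textbook PageRank power iteration. This is not Google's production ranking. The `pagerank` function, the `toy_web` graph, and the damping value of 0.85 are illustrative assumptions, and the sketch assumes every link target also appears as a key in the graph.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 100) -> dict[str, float]:
    """Textbook power iteration over a dict mapping each page to its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
print(max(ranks, key=ranks.get))
```

Note that the score depends only on who links to whom, not on what the user is looking for, which is the gap the comment is pointing at.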

Vik1ng | karma 975 | avg karma 5.36 · | 2014-05-29 13:43:49

https://en.wikipedia.org/wiki/Organic_search

So what you say might answer the first question, but not the follow-up one. Well, your "e.g. on GOOGLE:" answers it, but the fact that you use Google for that example already shows that there really isn't any way around that. He wasn't asking how to improve ranking. He was asking for alternatives to Google search traffic. (Unless I got that wrong)

jsight | karma 6940 | avg karma 1.91 · | 2017-11-22 00:09:08

Good luck turning that into an objective standard for a search engine.

AtNightWeCode | karma 391 | avg karma 0.41 · | 2021-11-08 14:13:00

BS. This can be solved through UX design, among other ways. The problem is also not equally bad across search engines. Other things annoy me more, like Google and its problems with n-grams.

dumbfounder | karma 3015 | avg karma 4.01 · | 2019-04-19 17:55:15

Anyone can make a search engine fast. It's much harder to make it good.

brianpan | karma 1759 | avg karma 3.03 · | 2012-12-13 16:32:20+00:00

Well, that's the point I'm trying to make. It's not a hit-the-drawing-board problem, it's a refine-the-algorithm problem. How many "if you google this string, you can't find the right site" problems have there been in the life of Google search? They continue to refine page ranking, don't they?

stingraycharles | karma 11619 | avg karma 4.34 · | 2023-01-04 06:11:32

The problem with this idea is that it’s way too much effort for the average individual to curate their search this way. And if the results are bad, I don’t know whether it’s my search query, the way I curated things, or your search algorithm.

okeumeni | karma 775 | avg karma 1.98 · | 2008-06-18 22:10:49+00:00

I’m not sure how good an idea a Wikipedia search is. It certainly cost too much to get there for Powerset.

Also, they should not call themselves a search engine, out of respect for all those working hard every day to bring a real challenge to Google and improve the overall search engine project.

PaulHoule | karma 78160 | avg karma 2.48 · | 2021-09-16 13:11:38

Meta search engines leave a bad taste in everyone's mouth because they've always failed. Here is why:

https://en.wikipedia.org/wiki/Arrow%27s_impossibility_theore...

You can't combine a few different ranked lists and expect to get results better than any of the original ranked lists.
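
A small sketch of the kind of voting paradox behind that theorem, treating each engine's ranked list as a ballot. The three engines and their rankings are invented for illustration, and this shows only that majority aggregation can be inconsistent, not that every way of merging result lists must fail.

```python
from itertools import combinations

# Hypothetical ranked result lists from three engines for the same query.
rankings = {
    "engine_1": ["A", "B", "C"],
    "engine_2": ["B", "C", "A"],
    "engine_3": ["C", "A", "B"],
}

def majority_prefers(x: str, y: str, rankings: dict[str, list[str]]) -> bool:
    """True if more than half of the lists place x above y."""
    votes = sum(1 for ranked in rankings.values() if ranked.index(x) < ranked.index(y))
    return votes > len(rankings) / 2

def pairwise_winners(rankings: dict[str, list[str]]) -> dict[tuple[str, str], str]:
    """For each pair of results, record which one the majority of lists prefers."""
    items = sorted(next(iter(rankings.values())))
    return {
        (x, y): x if majority_prefers(x, y, rankings) else y
        for x, y in combinations(items, 2)
    }

winners = pairwise_winners(rankings)
print(winners)
```

Here the majority puts A above B, B above C, and C above A, so no single merged ordering agrees with every pairwise majority.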

seba_dos1 | karma 7517 | avg karma 2.24 · | 2023-04-10 22:07:11

It's a language model, not a search engine. It doesn't work well as one unless integrated into an actual search engine, like Bing does. Without such integration, it's much closer to human memory than to a search engine - it will recall stuff it has seen many times pretty well and completely fail at stuff it just glanced over once, filling any gaps with made-up stuff like a kid on an exam hoping to get at least a few points with their wild guesses.

ramraj07 | karma 10260 | avg karma 3.74 · | 2023-05-20 20:59:07

I mean, Google search obviously does it using PageRank, which is that if someone links to the page with that word, it uses it. Me, I was searching for arcane words that I remembered hearing in the middle of a podcast. I doubt anyone actually linked or searched for the same with relevance to that podcast. Also, the solution you're describing needs far more intuitive and intricate development than just indexing captions lol.