
I’m not sure how good an idea a Wikipedia search is. It certainly cost Powerset too much to get there.

Also, they should not call themselves a search engine, out of respect for all those working hard every day to bring a real challenge to Google and improve search engines overall.




Powerset deals with spam the way every other search engine deals with spam. It's not like Wikipedia is free of false information (every snapshot is a unique snowflake with some entertainingly wrong things, like that ReiserFS-kills-wife-chart).

Sometimes people have difficulty separating NLP from AI from an Arbiter of Truth. NLP is just understanding semantic information, it's not arbitrating truth/correctness (a.k.a 'you can't squeeze blood from a stone'-principle) nor is it some kind of HAL9000 that can take queries like, "9 people who have been CEOs and who have Christian names" and make sense of them.

In this regard of correctness/spam, Powerset and Google are in an equal starting position. If you search for "cures for cancer" or "causes of autism" on Google you get some pretty factually incorrect results in the top 4.

In the case of spam, check out what Google is forced to do with "hot-button" spam searches, like "mesothelioma". Where did your standard search page go? :)

Hopefully when Powerset has a public product, people can play with it and see that really, what we're doing is an evolutionary step forward from keyword search, not some kind of boil-the-oceans-and-Google-was-always-wrong-anyways approach.


Imagine if, instead of typing amazon into the Google search box, you had to type "site to buy and sell books", and instead of Amazon you got back Half.com.

Google did not invent a new way for people to search; they took the way people were used to searching and made it return more relevant results. I believed Powerset was bound to fail from the beginning. The problem is that when folks search for email, for files on a machine, for files on P2P networks or for web pages on Google, they use the same keyword-based method.

It might certainly be useful to have a semantic search capability for some obscure searches, but that would work better as a Google add-on (and with all the engineers Google has in-house, if building such a technology was worthwhile they would do it).


Well, I think it’s really hard to compare one search engine to another based on what happens behind the curtain. Search engine technology is a really complex matter. Any speculation done from the surface is probably a guessing game.

I will agree though that Hakia seems to be much closer to what they promise to deliver than Powerset. Hakia may have a bit of semantic flavor but remains overall a poor search engine. I always wonder what Powerset is doing with all the money they have raised. I would have felt terribly disappointed if I had given them my money. Building a search engine for Wikipedia (not even a good one) with all that money falls a little short.

I will take the opportunity here to express my reservations about semantic search. If semantic search is defined as a search engine that answers questions, here are two reasons why I think it is not a very promising direction for search:

1. It is hard for people in general to type an entire question. Users are generally lazy, and anything that makes them think is probably not good.
2. The language factor. Though the web is mostly written in English, it will be a challenge for these companies to implement semantic search in every language. From English to French there’s a whole new world.


Search isn't easy, and I would fully expect a company that specializes in search to provide better search results than an entity that simply has search as a feature of its main product. Since Wikipedia provides all of its pages for indexing, unlike Facebook for example, a good search experience would be an MVP for any competent search company.

The thing is, Wikipedia's search is pretty straightforward: type keyword, arrive on article page.

Anything you'd add to that would detract from the search experience, not improve it. Seems to me the reason why people use Google to search Wikipedia is because it saves them a step, not because Wikipedia's search is inherently broken.

In any case I'd like to take part in the contest, but so far I can't really think of a way to improve Wikipedia's search…


After playing with WA for a while it seems to me that it is not so much a better search engine as it is a better encyclopedia. Many search results read like condensed Wikipedia articles.

It's not very difficult to make a functional search engine. With the amount of resources Cuil had, all you would have to do is provision a few thousand servers and deploy a Nutch cluster on them.

Then you pair random images with random search results. Voila, you have Cuil.

The criticism is duly applied, as Cuil made the claim that they would be better than Google at the same game. Powerset never made such a claim. The people who criticized it didn't understand what it was actually trying to do.


I'm pretty sure that it's not. It's probably useful as a knowledge engine, but not a search engine.

I think this is the search system Wikipedia has been needing to sort its data. This is not a tool for discovering knowledge like Google is.

A search engine that's just Wikipedia and Reuters sounds like the most boring website on the planet. People don't want "just the facts" all the time. Sometimes they just want to have fun. And obviously "metaverse" won't yield many results in a 90s-style-website search engine.

I wonder how many real search engine queries only need to return a Wikipedia page.

Because building a search engine is not the solution to everything?

Isn't keyword search sufficient? Like "w something" to search Wikipedia for "something".
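For what it's worth, the "w something" shortcut is easy to sketch. This is only an illustration, not how any particular browser does it: the `SHORTCUTS` table, the `resolve` function and the choice of fallback engine are all made up for the example.

```python
from urllib.parse import quote_plus

# Hypothetical shortcut table: a one-word prefix mapped to a URL template.
SHORTCUTS = {
    "w": "https://en.wikipedia.org/w/index.php?search={query}",
}

# Assumed fallback when no shortcut prefix matches.
DEFAULT_TEMPLATE = "https://www.google.com/search?q={query}"


def resolve(raw: str) -> str:
    """Turn what the user typed into a search URL."""
    raw = raw.strip()
    keyword, _, rest = raw.partition(" ")
    rest = rest.strip()
    template = SHORTCUTS.get(keyword)
    if template and rest:
        # Known prefix followed by a query: send it to that site's search.
        return template.format(query=quote_plus(rest))
    # Anything else goes to the default engine unchanged.
    return DEFAULT_TEMPLATE.format(query=quote_plus(raw))


print(resolve("w something"))
```

A bare "w" with nothing after it falls through to the default engine, which seems like the least surprising behaviour.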

Problem is that this was tried before, with SearchWiki. When it launched, it was widely derided as being useless and distracting. Its usage numbers didn't show widespread adoption. And then when it was removed, there was much rejoicing.

Now, it's possible that SearchWiki just needed a few more iterations, and with a few details changed, could be a big success. There have been a few other recent launches that were tried years ago, didn't work then, but had a few more iterations and now are big successes. I could at least raise the issue. But unless I can tell a convincing story about why people would use this when they didn't use SearchWiki, it may be an uphill battle to get resources devoted to this.


I'd love to see a Wikipedia-style search where people can improve or flag results as they see fit. I wonder if that has been tried.

Sure, it might not handle the long, long tail, but the top ten million searches would still be pretty useful.


That’s how you make a worse search engine than Google. If you are serious about competing in that space I think you need to do something fundamentally different than Google. Treating pages as a bag of words leads to a shitty search engine. Like I said, I’ve built a few search engines, and I have tried this.

Edit: https://en.wikipedia.org/wiki/Bag-of-words_model
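To make the bag-of-words complaint concrete, here is a minimal sketch (my own toy code, not taken from any real engine; `bag_of_words` and `cosine` are names invented for the example). Each text is reduced to word counts, so word order is thrown away and two sentences with opposite meanings can look identical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count each word; order is discarded."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


doc1 = bag_of_words("dog bites man")
doc2 = bag_of_words("man bites dog")

# Same counts, so the model cannot tell these two sentences apart.
print(doc1 == doc2)
print(cosine(doc1, doc2))
```

Anything that depends on who did what to whom is invisible to this representation, which is roughly why ranking on it alone gives poor results.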


I honestly wouldn’t refer to them as a search (engine). Implementing a keyword search with a Lodash function and a few if/else rules feels like it would provide better or comparable results to the current offering. I wouldn’t refer to that as an engine.
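Roughly what I mean, sketched in plain Python rather than Lodash. The page structure, the scoring numbers and the `keyword_search` name are all invented for the illustration; this is a filter plus two if/else rules, nothing more.

```python
def keyword_search(pages, query):
    """Return titles of matching pages, title matches first."""
    terms = query.lower().split()
    if not terms:
        return []
    scored = []
    for page in pages:
        title = page["title"].lower()
        body = page["body"].lower()
        if all(term in title for term in terms):
            score = 2  # rule 1: every term appears in the title
        elif all(term in body for term in terms):
            score = 1  # rule 2: every term appears in the body
        else:
            continue  # no match, filter the page out
        scored.append((score, page["title"]))
    # Stable sort: higher score first, original order kept within a score.
    scored.sort(key=lambda item: -item[0])
    return [title for _, title in scored]


pages = [
    {"title": "Lodash", "body": "A JavaScript utility library, not a search engine."},
    {"title": "Search engine", "body": "A system that finds documents."},
]
print(keyword_search(pages, "search engine"))
```

It does plain substring matching with no index, stemming or ranking signal beyond title versus body, which is the point: that is a feature, not an engine.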

You are right. That's why it's an educational project and not a public search engine.

They're basically just anthropomorphized search engines. I'd rather have a good search engine + FAQ page.
