
Hardly. The promise is still there, but there are barriers in the way of getting there.

One of the most useful aspects of the semantic web is how it enhances the search for information. Some web citizens have become conditioned to see Google as the pinnacle of what we can achieve through search, but we can do a lot better. Let's use an example to illustrate this. Imagine a presidential election is taking place and you want to understand the candidates' positions on topics that matter to you. Say you're interested in foreign policy, including their proclivity for war. By allowing search over a richer set of metadata, you can get at the candidates' stated positions directly, without the distortions of Google's ranking algorithms. Think of it as treating the information on the web as a database you can query directly. That's the main promise of the semantic web.
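
To make that concrete, here's a rough sketch of the idea in Python with rdflib. The ex: vocabulary, the candidates, and every triple below are invented purely for illustration, not real published data; in the semantic web vision these triples would be published by many independent sites and aggregated, rather than typed in locally.

    # Minimal sketch of "query the web like a database" (all data invented).
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/vocab/")

    g = Graph()
    g.add((EX.candidateA, RDF.type, EX.Candidate))
    g.add((EX.candidateA, EX.positionOn, EX.statementA1))
    g.add((EX.statementA1, EX.topic, Literal("foreign policy")))
    g.add((EX.statementA1, EX.summary, Literal("Favors diplomacy over military intervention")))

    g.add((EX.candidateB, RDF.type, EX.Candidate))
    g.add((EX.candidateB, EX.positionOn, EX.statementB1))
    g.add((EX.statementB1, EX.topic, Literal("foreign policy")))
    g.add((EX.statementB1, EX.summary, Literal("Supports expanding overseas deployments")))

    # Ask the data directly, instead of hoping a ranking algorithm
    # surfaces the right page.
    q = """
        SELECT ?candidate ?summary WHERE {
            ?candidate a ex:Candidate ;
                       ex:positionOn ?stmt .
            ?stmt ex:topic "foreign policy" ;
                  ex:summary ?summary .
        }
    """
    for row in g.query(q, initNs={"ex": EX}):
        print(row.candidate, "->", row.summary)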




Ah yes, the semantic web. Remember when that was going to change everything? All it required was for everyone on the web to meticulously format their data into carefully structured databases. I can't imagine why it never gained much traction.

As an interesting side note, part of the reason for Google's success is that, through the PageRank algorithm, it was able to extract highly valuable implicit data about the relative popularity, authoritativeness, and context of links, rather than being forced to rely only on explicit data.

I will make the claim that, in the future, devices and systems that work on implicit context and metadata are going to be more successful at these sorts of high-level pseudo-cognitive capabilities than anything that depends on people changing the way they do everything.


I think you mean when the Semantic Web was the next Big Thing. Better search would have been a nice side effect of a semantic web.

I was a big enthusiast of all the potential a semantic web could have brought. In my opinion, with the rise of social networks, content got heavily centralized, and it wasn't in the centralized platforms' interest to annotate it or to let others consume it. For a short period of my time on the internet, when most people in my close circle of friends had WP blogs with blogrolls in the sidebar, you'd get at least some FOAF annotations on those links. I could complain for a whole afternoon about the clunkiness of the supporting software and technologies (RDF, triple stores, ontologies, graph databases, etc.), as it wasn't easy as a coder to hack on these technologies developed by consortiums.

As far as semantic search goes, I think that due to the heavy SaaS-ification of software, there isn't even an incentive to create better search tooling anymore. I know the landscape of search systems is huge, and while there is no way for me to assess all existing software, I've just stuck with the tried and tested Apache Solr (or Elasticsearch) on projects I've worked on. And those are not easily tweaked into semantic search engines.

My experience with Google Search for the past few years is that results based on their knowledge graph have been gamed heavily. There's a lot of junk in those results that is merely adjacent to the keywords I'm searching for. You'll see the common suggestion on HN as well: when searching for a product review, add news.ycombinator.com or reddit.com to your query, depending on the type of product you're looking for.


It's a pity the semantic web never took off. It might have greatly reduced the need for sophisticated centralised search-engines.

Wasn’t this what the Semantic Web was supposed to enable?

The great idea of the semantic web, as conceived in the early days, was mostly visionary and rather impossible to build. This doesn't mean that the whole concept has failed. It advanced many fields of AI and created many initiatives that are quite vibrant to this day (e.g. Linked Data). The illusion of failure is mostly due to the fact that fewer semantic web projects get funded today. The money goes to other things that are now on the rise, the same as it went to the semantic web 10 years ago. However, the semantic web is still here, with many tools mature enough to be applied in industry. I don't expect it to disappear completely.

Not going to happen. The reasons for the Semantic Web never taking off were never technical. Websites already spend a lot of money on technical SEO and would happily add all sorts of metadata if only it helped them rank better. Of course, many sites’ metadata would blatantly “lie”, and hence the likes of Google would never trust it.

Re exposing an entire database of static content: again, reality gets in the way. Websites want to keep control over how they present their data. Not to mention that many news sites segregate their content into public and paywalled. Making raw content available as a structured, queryable database may work for the likes of Wikipedia or arxiv.org, but it's not likely to be adopted by commercial sites.


A lot of the semantic web has evolved, spurred on by SEO and the need to accurately scrape data from web pages. The old semantic web seemed to be more of a solution in search of every problem. I'm not surprised that searches for "semantic web" are down, as most interest now is focused on structured data via microformats, JSON-LD, and the standards published at schema.org.
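
As a rough illustration of what that structured data typically looks like in practice, here's a sketch that emits a schema.org JSON-LD block for a product page. The product details are made up; the vocabulary terms (Product, Offer, InStock) are schema.org's.

    import json

    # Hypothetical product data; in practice this would come from your CMS/DB.
    product = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": "Example Widget",
        "description": "A widget used purely for illustration.",
        "offers": {
            "@type": "Offer",
            "price": "19.99",
            "priceCurrency": "USD",
            "availability": "https://schema.org/InStock",
        },
    }

    # Embed it in the page; crawlers read this without it changing
    # anything human visitors see.
    snippet = '<script type="application/ld+json">\n%s\n</script>' % json.dumps(product, indent=2)
    print(snippet)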

Ah, I think we might be talking about different things. I think the larger promise of the semantic web is a categorically different thing from adding a bit of metadata to pages to convey basic things like author, content type, description, etc.

It’s the latter that I think is clearly valuable, in order for us to have competition for the likes of Google and Facebook. It lowers the barrier to creating competing search engines, modern RSS readers, and even things like distributed social networks.


I think the people really pushing for the Semantic Web kind of gave up. You hardly ever hear that term anymore.

I guess the value proposition of "You can add a whole bunch of complexity to your webpage that won't affect what people see so robots can scrape your page easier" didn't really resonate with developers. Also, the proposals I saw were much too granular and focused on people writing scientific papers on the web. It wasn't a good fit for the "garbage" web, which is like 99% of everything.


I think in a way it has, just not in the purist sense that Semantic Web proponents have advocated.

Microformats, semantic HTML tags, and non-rendered metadata can be found all over the place, especially in e-commerce.

What hasn't caught on is the formal language of the semantic web. And for that, my hypothesis is that it is so formal and so strict that it creates a barrier to entry that is too high for non-academics. RDF has a bit of a learning curve compared to competing technologies. Furthermore, FOAF (Friend of a Friend), last time I looked at it, had a lot of outdated domain-specific tags. For example, there is a tag for ICQ (an instant messenger practically no one uses anymore, but which was very popular in the 90s) but nothing for newer technology; generic tags that could describe any messaging system via attributes would have aged better.
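
For what it's worth, the basic FOAF pattern itself is small. Here's a sketch with Python's rdflib (the people and URIs are invented), including the dated icqChatID property mentioned above:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    g = Graph()
    me = URIRef("http://example.org/people/alice#me")    # invented URI
    friend = URIRef("http://example.org/people/bob#me")   # invented URI

    g.add((me, RDF.type, FOAF.Person))
    g.add((me, FOAF.name, Literal("Alice Example")))
    g.add((me, FOAF.knows, friend))

    # The vocabulary still carries era-specific properties like foaf:icqChatID,
    # rather than a generic "account on messaging service X" pattern.
    g.add((me, URIRef("http://xmlns.com/foaf/0.1/icqChatID"), Literal("12345678")))

    print(g.serialize(format="turtle"))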

Summary:

- High barrier to entry

- Easier to use alternatives

- It hasn't really failed, it's just taken a form that is more friendly for day-to-day use

Edit: Also, if you go to the Semantic Web Meetup in Cambridge, there are a lot of talks from the medical industry, database industry, and the like, which show it is alive: https://www.meetup.com/The-Cambridge-Semantic-Web-Meetup-Gro...


I was a big believer in the semantic web for years, but there are loads of things wrong with it, from conceptual problems to practical ones.

For starters, the Semantic Web requires an enormous amount of labor to make things work at all. You need humans marking up stuff, often with no advantage other than the "greater good". In fact, you do see semantic content where it makes sense today: look at any successful website's header and you'll see a pretty large variety of semantic content, things that Google and social media platforms use to make the page more discoverable.

This problem is compounded by the fact that ML and NLP solved many of the practical problems that the semantic web was supposed to solve. Google basically works like a vast question-answering system. If you want to find pictures of "frogs with hats on", you don't need semantic metadata.

A much larger problem is that the real vision of the semantic web reeked of the classic "solution in search of a problem". The magic of the semantic web wasn't the metadata; RDF was just the beginning.

RDF is literally a more verbose implementation of Prolog's predicates. The real goal was to build reasoning engines on top of RDF, essentially a Prolog-like reasoner that could answer queries. A big warning sign for me was that the majority of people doing "Semantic Web" work at the time didn't even know the basics of how existing knowledge representation and reasoning systems, like Prolog, worked. They were inventing a semantic future without any sense that this problem had been worked on in another form for decades.
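
To make the correspondence concrete, here's a tiny sketch in Python/rdflib (the vocabulary and data are invented); the Prolog equivalents are in the comments. Plain RDF gives you the facts but no rules; that's the part OWL and the reasoners were meant to supply, or you can just phrase the join as a SPARQL query.

    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/")
    g = Graph()

    # parent(tom, bob).                        -- Prolog fact
    g.add((EX.tom, EX.parent, EX.bob))
    # parent(bob, ann).
    g.add((EX.bob, EX.parent, EX.ann))

    # grandparent(X, Z) :- parent(X, Y), parent(Y, Z).   -- Prolog rule
    # RDF itself has no rules, so express the join as a query instead:
    q = """
        SELECT ?x ?z WHERE {
            ?x ex:parent ?y .
            ?y ex:parent ?z .
        }
    """
    for row in g.query(q, initNs={"ex": EX}):
        print(row.x, "is a grandparent of", row.z)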

OWL, which was the standard to be used for the reasoning part of the semantic web, was computationally intractable in its highest-level description of the reasoning process. If you start with a computationally intractable abstraction as your formal specification, then you are starting very far from praxis.

For this reason it was hard to really do anything with the semantic web. Virtually nobody built weekend "semantic web demos" because there wasn't really anything you could do with it that you couldn't do more easily with a simple database and some basic business logic... or by just writing it in Prolog.

A few companies did use semantic RDF databases, but you quickly realize these offered no value over just building a traditional relational database, and today we have real graph databases in abundance, so any advantage you would get from processing boatloads of XML as a graph can be replicated without the markup overhead. And that's not even considering the work on graph representations coming out of deep learning.

The semantic web didn't work because it was half a pipe dream, and not even a very interesting one at that.


The potential of the semantic web is massive. It's hard to understand why it hasn't been a massive game changer. I remember, a few years ago, making crazy queries to answer questions that still have no equivalent today, like: find the CEOs of companies that have fewer than 100k employees and were founded before Neil Armstrong walked on the moon. The winner-take-all approach we have today, with all those silos, doesn't benefit humankind in any way.

It's important not to dwell on the past. In the present, here are two strong developments stemming from the original Semantic Web:

Especially in Europe, research is very active and these technologies are core to many big science projects (for example, in this thread: https://news.ycombinator.com/item?id=8510885). These projects are exploring the higher aims of TBL's proposal, with good cause.

On a more down-to-earth level, there is now a solid web metadata standard in place in JSON-LD. The big search engines index it and presumably use it to give better results. Any startup can add value to published data by adding links - in a significant extension to the "API economy".

Think about it. The base concept of the semantic web is simply a data exchange format that can be used to implement a distributed relational database - a pretty practical idea. By way of the false starts of any broad initiative (e.g. XML), and notwithstanding a lot of political spin that I've never understood, we now have that standard.
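
A small sketch of that idea using the pyld library (the document, names, and URIs below are invented): an inline @context maps a site's local JSON keys onto global IRIs, which is what lets independently published data join up like tables in a distributed database.

    from pyld import jsonld

    # Hypothetical document: the @context maps the short keys used by
    # this one site onto globally unique IRIs (FOAF terms here).
    doc = {
        "@context": {
            "name": "http://xmlns.com/foaf/0.1/name",
            "homepage": {"@id": "http://xmlns.com/foaf/0.1/homepage", "@type": "@id"},
        },
        "@id": "http://example.org/people/alice#me",
        "name": "Alice Example",
        "homepage": "http://example.org/",
    }

    # Expansion rewrites everything in terms of those global IRIs, so data
    # published independently by different sites can be merged and queried.
    print(jsonld.expand(doc))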

Web developers should look at this opportunity with new eyes.


The semantic web is now integrated into the web and for the most part it's invisible. Take a look at the timeline given in this post: https://news.ycombinator.com/item?id=3983179

Some of those startups exited for hundreds of millions, providing, for example, the metadata in the right hand pane of Google search.

The new action buttons in Gmail, adopted by Github, are based on JSON-LD: https://github.com/blog/1891-view-issue-pull-request-buttons...

JSON-LD, which is a profound improvement on and compatible with the original RDF, is the only web metadata standard with a viable future. Read the reflections of Manu Sporny, who overwhelmed competing proposals and bad standards with sheer technical power: http://manu.sporny.org/2014/json-ld-origins-2/

There's really no debate any more. We use the technology borne by the "Semantic Web" every day.


Former semantic web researcher here. The problem with the semantic web is that it's way too complicated for the average user (unlike HTML), with no clear payoff. I see the need for machine learning to automate several of these tasks. It is a laudable effort, but I'm not sure it'd be anything more than some marginal improvements in data interop at web scale.

I think the semantic web never worked because of SEO spam. The closest it got to adoption in any form was the keywords meta tag. We know how that ended up.

I've looked into Semantic Web technologies for a year now and am trying to come to a personal conclusion at the moment. This is my current state, though some of this may be premature:

PRO:

* I can see that the semantic annotation part of it is spreading. Schema.org / JSON-LD might be the first pragmatic solution that I can imagine actually getting widespread acceptance, especially if existing frameworks / CMSes add support by default.

* Semantic Annotations are helping big companies like Google to make their products smarter and this is happening right now.

* The Semantic Web tries to solve some real problems, not just "academic" problems. Information and knowledge are indeed rather unconnected, which reduces their value tremendously. Right now APIs are growing to make this more accessible, but many problems remain unsolved.

* The Semantic Web has some truly interesting ideas and concepts that I've grown to like. Of course nearly every one of them could work without buying into the whole Semantic Web. But still, I think some very interesting ideas come out of that community.

CON:

* It takes a lot of time to understand the Semantic Web correctly, and learning about the technologies behind it soon gets very mixed up with a lot of complicated and rather uncommon concepts, like ontologies.

* The tools (even triplestores) feel awkward and years behind what I'm used to as a web developer. There are a LOT of tools, but most seem to be abandoned research projects which I wouldn't dare to use in production.

* It gets especially complicated when entering the territory of the Open World Assumption (OWA) and the implications that has on reasoning and logic. Say you want hard (real-time) validation because data is entered through a form on a website. Asking some people from the Semantic Web domain, the answers varied from "I don't know how to do this" to "It's complicated, but there is some research..., an additional ontology for that...". I'm kind of shocked, since this is one of the most trivial and common things to do on the web. And I really don't want to add another complex layer onto an already complex system just to solve simple problems. Something's wrong with the architecture here.

* OWA might be interesting, but most applications / databases are closed-world, and it would make many things very complicated to try to fit them into open-world logic. OWA is an interesting concept and makes sense if you are building a distributed knowledge graph (which is a huge task that only a few have the resources for), but most people will want to stay closed-world simply because it's much easier to handle. The Semantic Web seems to ignore reality here and prefers to be idealistic, imho.

* This sums up, for me, to one big problem: Semantic Web technologies provide solutions to some complex problems, but also make some very easy things hard to do. As long as they don't provide smart solutions (with a reasonable amount of learning / implementation time) to existing problems, I don't see them being adopted by the typical web developer.

* There are not enough pragmatic people around in the community who actually get nice things done and produce that "I want that, too!" effect.


First, I agree and would also like to say the semantic web idea is roughly 25 years old, which is such a short time period that saying something is dead/failed seems a little silly.

Second, I would say the idea is more realistic now than ever.

Now we have:

- Git so you can cheaply & collaboratively experiment + evolve both data encodings and schemas

- Faster, more reliable type checkers and code/data refactoring tools

- Deep learning agents that are showing promising gains in question-answering challenges, so I can see DLAs coauthoring semantic content alongside humans in the very near future

- As always, more programmers, faster bandwidth, more data, faster computers, etc.

I think technology is getting to the point where the semantic web could happen relatively quickly.

That being said, I still don't know if it will happen, because I'm now not sure whether there are strong economic forces that would impede such a thing.


I'm quite sceptical that "a queryable ontology. With a rich and expressive grammar" would be so obviously easy to use and so great that it would threaten Google, rather than Google becoming a frontend for said ontology (which you still need to crawl first to use!). And indeed, what remains of semantic-web-like data is massively pushed by Google today, because it makes it easier for them to provide results based on it - and if you want to convince someone to add it to a website in a commercial setting, "it helps Google understand our site" is the primary argument that sticks (even though it also helps others parse sites).

Which IMHO points to the main problem: publishing semantic data is work, and it had no clear value proposition you could sell businesses on, and it ran out of steam before someone made a convincing one. Niches that see the need for such data publishing are still willing to use this stuff or alternatives, but for the great majority of publishers the payoff isn't there, or it's actively seen as a negative.

It also didn't help that IMHO the focus was too much on what was theoretically possible, and not on making it actually easy to use, which made even more devs ignore it or build alternatives because the entry hurdle is steep. Plenty of APIs could build on semantic web tech, but they don't, because a custom REST API is typically just easier to do and thus more familiar. (Despite semantic tech having the groundwork for lots of what are seen as new-ish trends like API generators / machine-readable API docs / ...)
