The Semantic Web was coming into prominence at the peak of Bittorrent and P2P. It allowed people to publish data in schemas without centralized databases. Protocols in markup. You could declare and share schemas for things, build on top of other people's schemas, and remix endlessly. It was powerful.
You could define your contact details in FOAF and a client could ingest that and make contacts.
You could consume articles as RSS or Atom with any client you wanted, clients that were often faster and lighter than HTML-based websites. We could have shed HTML and JavaScript for many schema-aware applications.
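As a rough illustration (not from the original comment), this is roughly what client-side feed consumption looks like with Python's feedparser library; the feed URL below is a placeholder, not a real endpoint:

    # Minimal sketch of consuming an RSS/Atom feed with feedparser
    # (pip install feedparser). The URL is a placeholder.
    import feedparser

    feed = feedparser.parse("https://example.org/blog/atom.xml")
    for entry in feed.entries:
        # A client is free to store, rank, or render these fields however it likes.
        print(entry.title, entry.link)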
If we'd built Reddit back then, it'd have been topics and comments that were digitally signed and exchanged p2p, with signed voting, interest graphs, and curated peer groups.
Unfortunately, this was just as the VC- and ad-money-fueled Google and Facebook were coming into prominence. They built centralized systems that were easy to use faster than the Semantic Web community could move. (The Semantic Web was much broader - some people were interested in distributed predicate-logic databases, which were a bit much.)
Web 3.0 was the Semantic Web. Do not forget. We can still use the lessons today. This is what the web could have become if the advertising gravity well hadn't sucked it in.
I studied the semantic web for a semester in college and found it too complicated and confusing. I remember stuff like RDF, OWL, ontologies, SPARQL, triple stores, and others, but I never truly understood how it all connected. Besides, from what I remember, these tools were never production-ready, since they came from academia. Although graph databases are gaining a lot of traction these days, so there still might be some space for semantic meaning in some projects.
What a lot of folks don't realize is that the Semantic Web was poised to be a P2P and distributed web. Your forum post would be marked up in a schema that other client-side "forum software" could import and understand. You could sign your comments, share them, grow your network in a distributed fashion. For all kinds of applications. Save recipes in a catalog, aggregate contacts, you name it.
Ontologies were published at well-known URLs (and when they weren't, they still had URIs/URNs as identifiers - "URIs/URNs are cool"), so it was easy to understand data models. The entity name was the location was the definition. Ridiculously clever.
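To make that concrete, here is a minimal sketch with Python's rdflib of what "the name is the location" buys you: the FOAF vocabulary can be fetched from its own namespace URI. This assumes network access and that xmlns.com still serves RDF at that address:

    # Sketch: a vocabulary term's URI doubles as the place its definition lives.
    # Assumes network access and that the FOAF site still serves RDF here.
    from rdflib import Graph, URIRef
    from rdflib.namespace import RDFS

    g = Graph()
    g.parse("http://xmlns.com/foaf/0.1/")  # the name is the location

    # Print the human-readable label the vocabulary records for foaf:knows.
    for label in g.objects(URIRef("http://xmlns.com/foaf/0.1/knows"), RDFS.label):
        print(label)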
Furthermore, HTML was headed back to its "markup" / "document" roots. It focused on meaning and information conveyance, where applications could be layered on top. Almost more like JSON, but universally accessible and non-proprietary, and with a built-in UI for structured traversal.
Remember CSS Zen Garden? That was from a time when documents were treated as information, not thick web applications, and the CSS and JavaScript were an ethereal cloak. The Semantic Web folks concurrently worked on making it so that HTML wasn't just "a soup of tags for layout", so that it wasn't just browsers that would understand and present it. RSS was one such first step. People were starting to mark up a lot of other things. Authorship and consumption tools were starting to arise.
The reason this grand utopia didn't happen was that this wave of innovation coincided with the rise of VC-fueled tech startups. Google, Facebook. The walled gardens. As more people got on the internet (it was previously just us nerds running Linux, IRC, and Bittorrent), focus shifted and concentrated into the platforms. Due to the ease of Facebook and the fact that your non-tech friends were there, people not only stopped publishing, but they stopped innovating in this space entirely. There are a few holdouts, but it's nothing like it once was. (No claims of "you can still do this" will bring back the palpable energy of that day.)
Google and the other browser vendors later delivered HTML5, which "saved us" from XHTML's strictness. Unfortunately this also strongly de-emphasized the semantic layer and made people think of HTML as more of a GUI / application design language. If we'd exchanged schemas and semantic data instead, we could have written desktop apps and shareable browser extensions to parse the documents. Natively save, bookmark, index, and share. But now we have SPAs and React.
It's also worth mentioning that semantic data would have made the search problem easier and more accessible. If you could trust the author (through signing), then you could quickly build a searchable database of facts and articles. There was benefit for Google in having this problem remain hard. Only they had the infrastructure and wherewithal to deal with the unstructured mess and web of spammers. And there's a lot of money in that moat.
In abandoning the Semantic Web, we found a local optimum. It worked out great for a handful of billionaires and many, many shareholders and early engineers. It was indeed faster and easier to build for the more constrained sandboxiness of platforms, and it probably got more people online faster. But it's a far less robust system that falls well short of the vision we once had.
I've posted a lot on HN about the semantic web in the past [1, 2, 3, 4, etc]. I'm extremely happy to see it have a resurgence of renewed attention.
Semantic Web was going to be a mechanism to distribute information in a reusable way, but its rise was poorly timed with the emergence of the platforms. If it had come into full swing five to ten years earlier - basically the start of the web - we'd be using the tech now. It's still the right mindset to get away from Facebook, Google, Reddit, Twitter, etc.
There's no reason this comment should live on HN or Reddit or whatever. It could be shared over some form of federated HTTP aggregation or directly via P2P, with lots of semantic metadata and interest-graph markup.
People you like would have signed profiles that tell you where they publish, so that you could subscribe. You could follow their interest graph but use your own algorithm to filter and rank content.
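A rough sketch of the unsigned half of that idea, using Python's rdflib: start from one FOAF profile, follow rdfs:seeAlso links to friends' profiles, and collect where they publish (foaf:weblog). The documents here are inlined placeholders so the sketch runs offline, and the signing/trust layer is left out entirely:

    # Toy "scutter": crawl FOAF profiles and collect weblogs to subscribe to.
    # The documents are inline stand-ins for what you would fetch over HTTP.
    from rdflib import Graph
    from rdflib.namespace import FOAF, RDFS

    DOCS = {
        "https://example.org/alice.ttl": """
            @prefix foaf: <http://xmlns.com/foaf/0.1/> .
            @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
            <#me> foaf:weblog <https://example.org/alice/blog> ;
                  foaf:knows [ rdfs:seeAlso <https://example.org/bob.ttl> ] .
        """,
        "https://example.org/bob.ttl": """
            @prefix foaf: <http://xmlns.com/foaf/0.1/> .
            <#me> foaf:weblog <https://example.org/bob/blog> .
        """,
    }

    seeds, seen, subscriptions = ["https://example.org/alice.ttl"], set(), set()
    while seeds:
        url = seeds.pop()
        if url in seen or url not in DOCS:
            continue
        seen.add(url)
        g = Graph()
        g.parse(data=DOCS[url], format="turtle", publicID=url)
        for blog in g.objects(None, FOAF.weblog):
            subscriptions.add(str(blog))   # feeds you might subscribe to
        for more in g.objects(None, RDFS.seeAlso):
            seeds.append(str(more))        # friends' profiles to visit next

    print(subscriptions)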
Facebook just took the wind out of the sails for a decade and a half.
I worked on the Semantic Web. It has so many fatal flaws that I am amazed, in hindsight, that I didn't see them back then.
Berners-Lee was successful with the Web because it was not an academic idea like Nelson's and Engelbart's hypertext but a pragmatic technology (HTTP, HTML, and a browser) that solved a very practical problem. The semantic web was a vague vision that started with a simplistic graph language specification (RDF) that didn't solve anything. All the tools for processing RDF were horrendous in complexity and performance, and everything you could do with it could typically be solved more easily with other means.
Then the AI people of old came on board and introduced OWL, a turn for the worse. All the automatic inference and deduction stuff was totally non-scalable on even toy examples, let alone at web scale. Humans in general are terrible at making formal ontologies; even many computer science students typically didn't really understand the cardinality stuff. And how would it bring us closer to Berners-Lee's vision? No idea.
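For readers who never saw it, this is the sort of cardinality restriction being referred to: a toy ontology saying every person has exactly two parents, parsed with Python's rdflib just to show the shape of it (the ex: terms are invented for this example):

    # Illustrative OWL cardinality restriction: every ex:Person has exactly
    # two values for ex:hasParent. The ex: namespace is made up for this toy.
    from rdflib import Graph

    g = Graph()
    g.parse(data="""
        @prefix owl:  <http://www.w3.org/2002/07/owl#> .
        @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
        @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
        @prefix ex:   <https://example.org/ontology#> .

        ex:Person a owl:Class ;
            rdfs:subClassOf [
                a owl:Restriction ;
                owl:onProperty ex:hasParent ;
                owl:cardinality "2"^^xsd:nonNegativeInteger
            ] .
    """, format="turtle")
    print(len(g), "triples")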
Of course, its basic assumptions about the openness, distributedness, and democratic qualities of the Web also didn't hold up. It didn't help that the community is extremely stubborn and overconfident. Still, they keep convincing themselves it is all a big success and point at vaguely similar success stories built on completely different technology as proof that they were right. I think this attitude and this type of people in the W3C have also led to the downfall of the W3C as the Web authority.
This. The Semantic Web produced some interesting technologies that lost their cool factor and dissolved into the background (SPARQL, graph DBs, etc.). Ontology everywhere simply never caught on because it wasn't worth it.
This co-opting of the term by blockchain-backed everything is for something actively dangerous, as opposed to merely too little bang for too much buck.
It was a lot of work for no real benefit. Nobody cared to parse that stuff or display it in a popular, customizable way, unlike RSS or Atom, both of which also died.
Eventually Google's natural language-esque search just got big enough that no other interface to the web really mattered anymore, and their speciality was in parsing unstructured text & messy HTML into simple phrases. Better semantics might've helped other spiders, but by that point nobody cared about other spiders anymore.
For things like social sharing, OpenGraph had the commercial support of Facebook and Twitter and was far simpler to implement. And other communities, like MediaWiki, used RDF only for the low-hanging fruit (Wikidata) while the most valuable info was still locked behind freeform blobs of text (Wikipedia).
The semantic web took more effort to implement than the crap it usually describes. Most humans just don't really want to waste time classifying stuff. The communities that do (science, pirate communities, libraries, etc.) already have their own classification schemes. There was just no need for another web-only classification scheme, no popular desire for it, no sufficient commercial interest behind it, no end-user advantage of using it over HTML... is it surprising that it failed? It tried to solve a problem nobody really had, using a solution that was quickly eclipsed by machine learning. And for the few actors (search engines, social networks) who actually wanted effective classifications, their own algorithms were both more effective and more private, not relying on/enabling their competitors. Open classification excites librarians and archivists, maybe, and nobody else.
I think you mean when semantic web was the next Big Thing. Better search would have been a nice side effect of a semantic web.
I was a big enthusiast of all the potential a semantic web could have brought. In my opinion, due to the rise of social networks, content got heavily centralized, and it wasn't in the centralized platforms' interest to annotate it or allow others to consume it. For a short period of my time on the internet, when most in my close circle of friends had WP blogs and blogrolls in the sidebar, you'd get at least some FOAF annotations on those links. I could complain for a whole afternoon about the clunkiness of the supporting software and technologies (RDF, triple stores, ontologies, graph databases, etc.), as it wasn't easy as a coder to hack on these technologies developed by consortiums.
As far as semantic search goes, I think that due to the heavy SaaS-ification of software, there isn't even an incentive to create better search tooling now. I know the landscape of search systems is huge, and while there is no way for me to assess all existing software, I've just stuck with the tried and tested Apache Solr (or Elasticsearch) on projects I worked on. And those are not easily tweaked into semantic search engines.
My experience with Google Search over the past few years is that results based on their knowledge graph have been gamed heavily. There's a lot of junk in those results that is merely adjacent to the keywords I'm searching for. You'll see the common suggestion on HN as well: when you search for a product review, also include news.ycombinator.com or reddit.com in your query, depending on the type of product you're looking for.
What happened to the Semantic Web was Machine Learning and NoSQL databases. Even if the Semantic Web had been a good idea, it took a lot of work to get any benefits. Machine learning produced Big Wins For Free, or at least Comparatively Cheap, and it produced them from big piles of unformatted data requiring no standards meetings or agreements beforehand.
I felt (and said at the time) that the Semantic Web wasn't sufficient to achieve its own goals. The language they chose wasn't powerful enough to express sufficient semantics to enable the kind of data integration and integrity that they wanted. The result was ontologies that still required a lot of negotiation before you could start working -- and then provided little benefit.
So the semi-structured world picked NoSQL databases instead, which promptly became full of impossible crap, but at least you could Move Fast And Break Things. And people took all that crap and ML'd it to get something -- what, exactly, is unclear, but it was a thing.
I'll note that I pursued ontologies with a more rigorous standard, and I couldn't get any traction either. The up-front expense was too high, and I never managed to convey the story of how much good it would do you on a five-to-ten-year time horizon. Nobody wanted to hear that. I still think it was a better approach than the Semantic Web, but in the end people chose flexibility over interoperability.
Semantic Web failed because it was neither Semantic nor Web. It wasn't smart enough to be Semantic, nor agile enough to be Web.
I've been waiting for years for the emergence of the original Web 3.0 - the semantic web [1]. A web of machine readable data based on ontologies. How did it get redefined as a web on top of append-only blockchains?
> Imagine hovering over a UI element and seeing who implemented it and when, what project it was part of, why the project was initiated, and what kpis and goals it contributes to.
That's exactly what we are building at Field 33[0], with a package manager for ontologies (Plow[1]) as an underpinning, to get a good level of flexibility/reusability/collaboration on all the concepts that go into your graph.
------
> Why isn’t semantic web more popular inside companies?
As part of building Field 33 we obviously also asked ourselves that question.
My rough hypothesis would be that ~10 years ago semantic tech didn't provide tangible enough benefits, and it has since been left in the dust by non-semantic tech.
That caused a tech chasm that widened and widened: the non-semantic side became a lot more accessible through quasi-standards (REST) and new methods of querying data for frontend usage (GraphQL), while the status quo of the semantic web space is still SPARQL (a query language full of footguns). The same goes for triple stores (the prevalent databases in the space), which go through roughly the same advancements as RDBMSs, just at a much slower pace.
It also doesn't help that most work being done in the space comes from academia rather than companies that utilize it in production scenarios.
There is quite a nice curated list of problems/papercuts about the semantic web/RDF space[2].
Overall, despite the current status quo, I'm quite optimistic that the space can have a revival.
> If the Semantic Web had a chance to grow before the giants of tech emerged, we might be looking at a vastly superior internet today.
I worked pretty closely with people doing a lot of semantic web work in the 2006ish era, before Google and ML dominated tech. Even then it was nearly impossible to find a single, useful example of the semantic web in action.
The reasoner, a la something like OWL, is the true heart of the semantic web, but that never got solved at the scale necessary for the web. For any practical problem, or even an interesting side project, semantic web technologies didn't offer anything that you couldn't build better using existing tools.
People made RDF data stores because they wanted to use RDF, but I don't recall any interesting demos of semantic reasoners.
You could build an ontology in RDF [1] or OWL [2] and link the grammars together.
FOAF [3] was a means of describing your friends and social connections, and it allowed rich annotation.
No tools were ever built around this because by the time it started taking off, Facebook and MySpace were already a thing.
The great thing about these technologies was that the graph was public and you could write tools to ingest, export, and exchange the information. We never got to that, though.
The technology was centered around graphs and triple stores, letting you make subject-predicate-object statements about anything. The URI became the central identification scheme for nodes, documents, and ontologies.
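To give a concrete feel for that triple model, here is a minimal sketch in Python with rdflib; the person and URIs are invented for illustration:

    # Build a tiny FOAF description as subject-predicate-object triples
    # (pip install rdflib). All names and URIs are made-up examples.
    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import FOAF, RDF

    g = Graph()
    me = URIRef("https://example.org/people/alice#me")

    g.add((me, RDF.type, FOAF.Person))
    g.add((me, FOAF.name, Literal("Alice Example")))
    g.add((me, FOAF.mbox, URIRef("mailto:alice@example.org")))
    g.add((me, FOAF.knows, URIRef("https://example.org/people/bob#me")))

    # Serialize to Turtle so any FOAF-aware client could ingest it.
    print(g.serialize(format="turtle"))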
There were languages like SPARQL for querying rich triplestore databases, but they never really took off. Again, Google and Facebook were already huge by the time these started to mature.
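And a small taste of SPARQL over a similar toy graph, again with rdflib and made-up URIs:

    # Query a small FOAF graph with SPARQL via rdflib.
    from rdflib import Graph

    g = Graph()
    g.parse(data="""
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        <https://example.org/people/alice#me> foaf:name "Alice Example" ;
            foaf:knows <https://example.org/people/bob#me> .
        <https://example.org/people/bob#me> foaf:name "Bob Example" .
    """, format="turtle")

    # Who does Alice know, and what are their names?
    results = g.query("""
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?friend ?name WHERE {
            <https://example.org/people/alice#me> foaf:knows ?friend .
            ?friend foaf:name ?name .
        }
    """)
    for friend, name in results:
        print(friend, name)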
If you think the promise of blockchain is exciting, you should take a look at what the Semantic Web could have been...
[1] https://www.w3.org/TR/rdf-primer/
[2] https://www.w3.org/TR/owl2-primer/
[3] http://www.foaf-project.org/