W3C’s transfer from MIT to non-profit going poorly (twitter.com)
159 points by andruby | 2022-12-17 | 71 comments



Mastodon version without ads, tracking, or GA or Twitter analytics (and with discussion/feedback):

https://mastodon.social/@robin/109524929231432913

(I'm not sure we should even be linking to the Twitter copy, given that the author's name there is giving attribution to him as "@robin@mastodon.social", so surely that should be considered the authoritative version, and it doesn't require delegating to any third-party aggregator hack.)


The Mastodon version also has this testy exchange with Google's Chris Wilson, a standards manager who's also on W3C: https://mastodon.social/@cdub/109525438620997614

Man, MIT seems to really be losing its clout (PR capital), especially in the last decade or so. (And I am not usually one to participate in the “dumping on prestigious institutions” meme that’s grown in popularity as well.)

This is obviously one sided, but assuming most of this is factual… not good.


MIT has a beautiful campus, and many great papers and organizations have come out of it, but it has this reputation as the ultimate CS research hub that houses all of the smartest people and produces all the discoveries and inventions of the future.

My understanding is that, beyond its reputation and the fact that everyone knows about it, MIT is fundamentally not that different from any other great tech university. And many big universities are starting to turn more into businesses. The "admins have seized the Ivory Tower" critique (https://news.ycombinator.com/item?id=33856624&ref=upstract.c...) applies to MIT as well.


It apparently started out fundamentally different, though: https://freaktakes.substack.com/p/a-progress-studies-history...

Having graduated from MIT, I would say that the Great Dome and other buildings surrounding Killian Court are beautiful. I don't particularly care for the modern architecture that makes up the rest of the campus, but I'm also not a connoisseur of architecture.

At least 20 years ago, MIT's greatest strength was its student body and its culture. You got the sense everyone was striving to learn as much as they could, and most students reveled a bit in their nerdiness. In high school, I took classes at a well-regarded state school and didn't get the same sense of intellectual hunger. IHTFP (simultaneously "I Have Truly Found Paradise" and "I Hate This F'ing Place") summed up the culture pretty well. You got the sense that you and everyone else had lined up to drink from the fire hose, were going to struggle through it together, and would come out the other side better for it. I have several friends who got grey patches in their hair during undergrad from stress, which went away shortly after graduation and didn't show up again for another 15 or 20 years.

I hope that pressure-cooker feeling isn't actually necessary for rigor, and that MIT has found some way to keep the rigor while being a bit easier on students' mental health. MIT ensured every month had at least one holiday by inserting a fake Monday holiday into each month that lacked one, as a mental health break. I heard the mental health breaks were a result of the high suicide rate in the 1980s. Thankfully, none of my friends committed suicide, but a few friends of friends did during my time there.


There’s also this interesting mix of humility and ambition, with a real feeling that “everyone can change the world” by actually doing fantastic things instead of boasting about their skills. It sorta feels like a place where the engineers are superheroes.

It’s a bit tragic that a lot of times this comes at the cost of mental health, though MIT has gone a long way to improve that.


> MIT ensured every month had at least one holiday by inserting a fake Monday holiday into each month that lacked one, as a mental health break.

They also made freshman year courses pass or fail.


Their handling of the Aaron Swartz case (rightfully) caused a massive hit to their reputation.

That's right. The "Skoltech" program was one such initiative: in 2010, MIT took hundreds of millions from the Skolkovo Foundation (run by oligarch Viktor Vekselberg, a member of Putin’s inner circle) in return for setting up the Skolkovo Institute of Science and Technology near Moscow. Despite getting a lot of heat for this, MIT kept it going:

> The university in 2019 signed a five-year extension of its lucrative partnership with the Russian technology research institute, which has long raised espionage fears among foreign policy experts and the FBI. The extension came just three months after the federal government announced it was investigating MIT’s compliance with reporting requirements for the Russian money it had received in connection with the project.

The article notes MIT only ended the cooperation after the invasion of Ukraine.

https://www.wgbh.org/news/local-news/2022/02/25/mit-abandons...


These kinds of threads are always so hard to decipher. Without disclosing the terms and details MIT is allegedly providing, you just have to assume the one side complaining is telling the truth when there's almost certainly two sides to this problem.

That said, don't get me wrong, I'm always down for a quick pitchfork roast on the internet.


Normally, when things have reached the "air it in public" level of grievance, what is being put out in public is at least factual--just probably not all of the facts.

However, I can quite easily see this happening on the MIT side. Some mid-level bureaucrat who doesn't even know what W3C is will be losing budget, so they're playing hardball assuming the usual level of scrutiny. They're going to get a surprise when they get dumped on by their managers because this suddenly hit a lot of eyeballs and is garnering negative PR for the entire university.


That’s partly because Twitter is just about the worst possible place to have a detailed conversation on an issue, really on just about any issue.

I have no idea why anyone of any level of technical sophistication, or with halfway decent communication skills, makes the attempt. Choose a free blog, write something more substantive, and write a succinct Twitter post to make people aware of it. Or at least do both at once: post a balkanized “thread” like the author here did, and link to the more substantive post in the process.


People aren’t reading random blogs as much as they’re reading social media

Thus: link to random blog from social media.

People don’t click as much either.

Then it’s better to post nothing than something that could be misleading. Or still post the link to a better blog post, and if people don’t click, that’s on them; at least the author has put a more comprehensive argument together.

Otherwise why bother writing up research either? People might just read the abstract. Why bother watching a full baseball game? You can just get the condensed highlights and winner afterwards. Why bother watching a movie? You can watch the trailer for a lot of the great bits and then read a synopsis online.

Plenty of people actually do those things, and I guess that’s okay too, but there’s value and good reason to put out the whole product as well.


This is interesting.

MIT is playing hardball with people's jobs and W3C assets.

W3C is playing hardball with MIT's reputation.

I think the fact that it's reached the point where they're publicly talking about this means there is very little chance MIT is going to back down. The real question for me is would US officials allow W3C to move abroad. Could they prevent it? I have a feeling MIT's lawyers have thought a lot of this out already.


The W3C has always had "branches" in the US, EU, and Asia. What could be at stake here is no longer having an entity in the US.

And I'm wondering if the US government, which is a bit of a control freak, would allow there to be no W3C entity under its control. Especially if the major players in the tech world are US entities. I could see this just breaking up and ending the W3C, rather than the W3C gaining leverage.

Why would the US gov care about something that doesn't matter?

W3C does matter to an extent. Which is why so many mega corps pay people to work on their stuff.

> W3C is playing hardball with MIT's reputation.

It's more like softball, if we're being honest. 99.9% of the public doesn't care, and of the small portion of the public who is familiar with both MIT and W3C... I'll just predict that nobody is going to show up and protest, or bring torches and pitchforks, or anything because of twitter threads. Nobody is going to cut MIT's funding because of this, and they'd have to really cut in order to make MIT reconsider dumping what must be a money-loser for them already.

Really playing hardball with MIT's reputation would involve getting Tim Berners-Lee in front of the mainstream press to talk about this.

> MIT is playing hardball with people's jobs and W3C assets.

That is hardball.


> It's more like softball, if we're being honest. 99.9% of the public doesn't care, and of the small portion of the public who is familiar with both MIT and W3C... I'll just predict that nobody is going to show up and protest, or bring torches and pitchforks, or anything because of twitter threads. Nobody is going to cut MIT's funding because of this, and they'd have to really cut in order to make MIT reconsider dumping what must be a money-loser for them already.

The thing is, the 0.1% who do care are the people whose opinions MIT cares about. MIT doesn't care what most people think. Those people aren't giving it money, they're not giving it status, they're providing MIT with nothing. Which is kind of why those people don't care. But the people who do care are giving MIT money and status and other things.


> "The real question for me is would US officals allow W3C to move aboard."

I think moving abroad would simply massively backfire on W3C - it would turn them from an org struggling to stay relevant into a completely irrelevant org immediately.


My understanding is that Ivy League schools have impossibly huge coffers, and this seems to boil down to money.

Or perhaps MIT is offering a bad deal on purpose to sink negotiations?


MIT isn't in the Ivy League. The Ivy League, which is an athletic conference, consists of Brown University, Columbia University, Cornell University, Dartmouth College, Harvard University, Princeton University, University of Pennsylvania, and Yale University.


Endowments are independently run, and the last thing the money men would do is give the admins or academics a say, or they’d fritter it all away in short order, like how Larry Summers lost Harvard a cool billion dollars through ill-judged investment strategies for its operating funds.

I don't think they do? They have large endowments, but they have to preserve those mostly, and the number of things they have to pay for is huge. They're still run like an organization that has to be accountable to its budget.

Also, MIT isn't Ivy League, technically.


It's worse than that, most endowment funds are earmarked for specific purposes.

Looks like the long-time CEO of the W3C resigned in November 2022. Hmm ....


W3C members list at https://www.w3.org/Consortium/Member/List and team https://www.w3.org/People

What would be the impact of the USA part of the team shutting down? The big USA companies will still be there and will keep advancing their agendas. What can the rest of the world do?


What does the w3c do that actually requires money? Are standard editors actually paid? I always assumed that they were volunteering their time on behalf of whatever company they worked for.

For that matter, what liabilities are we talking about here? Hosting a website? Maybe I am just naive, but what else is there?


https://www.w3.org/People has a list of 57 people and what their role is - it's not clear to me if they are all full time paid staff but I think most of them are.

Two people listed as CFO? <shrug>

Wow who knew they had so many full time employees? To be honest I assumed it was entirely volunteers & people employed at other companies, like the C++ standards committee.

Note that OP (Robin Berjon) works for Filecoin, a crypto project that raised a huge ICO and has never really delivered on all of the promised hype. His full-time role is to get Filecoin's projects more embedded into standards like those the w3c oversees. And they obviously have specific opinions about how they'd like to see the w3c run.

I would take a skeptical view of his take on what's happening. The w3c is a very dysfunctional organization and there has been a lot of turmoil internally. Jeff Jaffe who had been CEO for more than a decade quit in November. There are power plays behind the scenes to fill this vacuum.


W3C pushing crypto nonsense is, in my opinion, killing any remaining respect for the organization. Stuff like the DID standards earlier this year is nonsense and was heavily objected to by the maintainers of Firefox and Chrome, but got pushed through anyway. Will they simply ignore W3C recommendations going forward? Seems likely to me. Unfortunately, I'm not aware of any other organization working on a widely accepted list of web standards.

https://www.theregister.com/AMP/2022/07/01/w3c_overrules_obj...


I am currently seeing W3C people pushing blockchain stuff in European cloud standardization activities. This really worries me, and it explains to me now why the W3C seems so broken.

I never quite understood why blockchain vaporware is so prevalent in Europe. It is also popular in the US, sure, but it seems more embedded, and at higher levels, in the EU than in the US. Is there a reason why? (I'm not necessarily talking about crypto coins specifically, more about blockchain as a technology in general)

What are you talking about?

one or two sources for this would be welcome

This is a first-hand account from meetings I attended as part of the Big Data Value Association, with W3C presenting on technology convergence for data spaces. A lot of it was on DIDs, which seem to be a pillar of the European Blockchain Services Infrastructure standardized by W3C [1].

[1] https://www.eqar.eu/qa-results/synergies/european-blockchain...


> Unfortunately I'm not aware of any other organization working on a widely accepted list of web standards.

Maybe I'm misunderstanding, but the HTML5 spec is entirely the work of Ian Hickson/WHATWG (financed by Google), and has been for over ten years. Until about 2018, W3C merely snapshotted, rubber-stamped, and editorialized WHATWG's specs; it has since stopped doing even that, and W3C's HTML 5.2 spec now has a banner retroactively redirecting you to WHATWG's current head [1]. W3C is still involved via the CSS WG and ARIA. But yeah, as a standardization body, W3C has failed spectacularly. There's no standard as such; WHATWG's "living standard" is merely a stagnant (yet still unversioned) memorandum of understanding among extant browser vendors (Google Chrome and Google-financed Mozilla, plus Safari) to implement features, with a large part of browser APIs not implemented by Firefox and Safari due to fingerprinting concerns.

[1]: https://www.w3.org/TR/2017/REC-html52-20171214/


Alternate interpretation:

w3c's XHTML was machine understandable. It took great care to introduce rigid and reusable semantics. You could spider the web and directly extract facts without ML or heuristics. It spiritually brought HTML closer to RSS and RDF, and there would have been a lot you could do with that.
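
For illustration, here is a minimal sketch of what that promise looked like in practice, assuming a strict XHTML page (the markup, vocabulary, and class names below are invented): because valid XHTML is well-formed XML, a stock XML parser can extract facts deterministically, with no ML or heuristics.

    # Minimal sketch: valid XHTML is well-formed XML, so a plain XML
    # parser can pull structured facts out of a page deterministically.
    # The markup and class names here are hypothetical illustration.
    import xml.etree.ElementTree as ET

    page = """<html xmlns="http://www.w3.org/1999/xhtml">
      <body>
        <div class="contact">
          <span class="name">Ada Lovelace</span>
          <span class="email">ada@example.org</span>
        </div>
      </body>
    </html>"""

    NS = {"x": "http://www.w3.org/1999/xhtml"}
    root = ET.fromstring(page)  # fails loudly on malformed markup

    for card in root.iterfind(".//x:div[@class='contact']", NS):
        name = card.findtext("x:span[@class='name']", namespaces=NS)
        email = card.findtext("x:span[@class='email']", namespaces=NS)
        print(name, email)  # Ada Lovelace ada@example.org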

Google didn't want anything to do with that future. Search is their moat. They pushed a format that is easy for humans to author, tolerably messy, and knowledgeless for machines. If the format was difficult to extract knowledge from in a scalable and reusable way, Google could keep their search kingdom.

XHTML was a ladder rung into the semantic web. A distributed knowledge graph that could be mined, remixed, and extended with little effort. It was marginally harder to author (the mimetype issue was ridiculous), but extending it was low hanging fruit.

This isn't necessarily how it went down, but the incentives do align. I do think we'd have more automation in the world today if we'd have adopted XHTML.

Now with AI/ML and NLP I think we'll get to this kind of web without a format to rigidly specify semantics. I still have to think that the ideas of semantic web and P2P could have delivered decades ago if Google and Facebook hadn't overwhelmingly pushed the envelope on centralized resources.


HTML5 can be auto-translated into an XML surface syntax. You aren't losing anything compared to the old XHTML, though only the XML syntax can include content from other XML namespaces.

Google even cooperates with other search engines in promoting unified formats to "extract knowledge from in a scalable and reusable way", namely the schema.org standards.
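
As a minimal sketch of how that extraction commonly works (the example document is made up): schema.org data is typically embedded as JSON-LD in a script tag, and a crawler can pull it out with nothing but the standard library.

    # Minimal sketch: schema.org data often rides along in a
    # <script type="application/ld+json"> block; extracting it needs
    # only the standard library. The example document is made up.
    import json
    from html.parser import HTMLParser

    class JSONLDExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.in_jsonld = False
            self.blocks = []

        def handle_starttag(self, tag, attrs):
            if tag == "script" and ("type", "application/ld+json") in attrs:
                self.in_jsonld = True

        def handle_endtag(self, tag):
            if tag == "script":
                self.in_jsonld = False

        def handle_data(self, data):
            if self.in_jsonld:
                self.blocks.append(json.loads(data))

    doc = """<html><head><script type="application/ld+json">
    {"@context": "https://schema.org", "@type": "Person",
     "name": "Ada Lovelace", "email": "ada@example.org"}
    </script></head><body></body></html>"""

    parser = JSONLDExtractor()
    parser.feed(doc)
    print(parser.blocks[0]["name"])  # Ada Lovelace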


XHTML treated schemas as a first class order of business, though.

It would have required people to put in extra effort to author a table of "address information" (or whatever). Maybe that would not have happened, but the format also suggested building reusable components, browser integrations for semantic understanding, and much more.

The world it alluded to was much brighter for machine understanding and reusability.


Data served in a tree structure and then rendered with client side templating with the ability to create custom elements... those crazy W3C eggheads in their ivory tower, why did they think anyone would want that?

/s


Is the “can be translated into XML” part of the HTML5 spec?

That would be interesting to me, and I wasn’t aware of it until your comment.

Edit: sounds like it’s not a thing:

> The DOM, the HTML syntax, and the XML syntax cannot all represent the same content.

https://html.spec.whatwg.org/multipage/introduction.html#htm...


The non-translatable features are extremely niche, though. To continue the quote:

> For example, namespaces cannot be represented using the HTML syntax, but they are supported in the DOM and in the XML syntax. Similarly, documents that use the noscript feature can be represented using the HTML syntax, but cannot be represented with the DOM or in the XML syntax. Comments that contain the string "-->" can only be represented in the DOM, not in the HTML and XML syntaxes.

I.e. the big thing you'd fail to capture when doing an automated HTML -> parsed DOM -> XML trip is noscript tags. No big loss. Others I can remember are similarly tiny, e.g. some issues around textarea elements containing only newlines as their initial contents. Some discussion at https://html.spec.whatwg.org/multipage/parsing.html#parsing Ctrl+F "roundtrip", although note that most of that is actually about what happens if you construct weird DOMs with JS and try to do a DOM -> serialized HTML -> reparsed DOM roundtrip.

I think it's pretty safe to say in general that the procedure of "parse HTML to DOM, serialize to XML" will preserve everything interesting about a document. Especially if that document is already valid HTML.
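
A minimal sketch of that trip, using the third-party html5lib package (which implements the WHATWG parsing algorithm); its default tree builder returns standard ElementTree nodes, so the parsed DOM serializes straight to well-formed XML:

    # HTML -> parsed DOM -> XML, as described above. Requires the
    # third-party html5lib package (pip install html5lib).
    import xml.etree.ElementTree as ET
    import html5lib

    # Tag soup: unquoted attribute, unclosed elements.
    soup = "<!DOCTYPE html><title>Hi</title><p class=intro>Hello<br>world"

    # html5lib's default tree builder places elements in the XHTML
    # namespace and returns ElementTree nodes.
    doc = html5lib.parse(soup)

    # Serialize the parsed DOM as well-formed XML.
    ET.register_namespace("", "http://www.w3.org/1999/xhtml")
    print(ET.tostring(doc, encoding="unicode"))
    # <html xmlns="http://www.w3.org/1999/xhtml"><head><title>Hi</title>
    # </head><body><p class="intro">Hello<br />world</p></body></html>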


Ah, I thought those were just a couple of examples from a much larger set.

Pedantically, yes, the set is a fair bit larger, but the cases only get more and more esoteric (and are often things that are allowed within the HTML syntax but where the content would be non-conforming). For example:

You can have an element called "foo:bar" in HTML, but you can't in XML-with-Namespaces-in-XML; likewise, you can have an element called "a\u0300" in HTML, but you can't in XML (and the set of possible element names is even more different if you're looking at an implementation of XML 1.0 4th Edition, which most are, rather than the much later 5th edition).

You can't have a comment that contains "--" or ends with "-" in XML. (This is potentially the thing that comes up most often in real world content: people like doing <!------ DON'T TOUCH THIS ------>, which in HTML produces a comment whose value is "---- DON'T TOUCH THIS ----", which you cannot represent in XML.)


> XHTML was a ladder rung into the semantic web. A distributed knowledge graph that could be mined, remixed, and extended with little effort.

The first attempt at that "knowledge graph" thing was the "keywords" meta tag. Its failure should be taken as instructive regarding the difficulty level of the project as a whole.

> It was marginally harder to author (the mimetype issue was ridiculous), but extending it was low hanging fruit.

Silly me, I thought it was more about mandatory strictness vs user-generated content, rather than what mime-type the server had to label it as.


> The first attempt at that "knowledge graph" thing was the "keywords" meta tag. Its failure should be taken as instructive regarding the difficulty level of the project as a whole.

Knowledge graphs are basically just semantic networks, which arguably date back to 1956 (or 300 depending on how you define things)


> w3c's XHTML was machine understandable. It took great care to introduce rigid and reusable semantics. You could spider the web and directly extract facts without ML or heuristics.

It did absolutely none of this. It was a presentation layer built on top of XML, and it did not really touch semantics at all. The W3C's proposed direction for the future of the web was for everybody to write data as XML and then apply XSLT transforms to build XHTML as the structural presentation layer, with CSS providing styling information on top.
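
That proposed pipeline, sketched minimally with the third-party lxml package (the data vocabulary and stylesheet are invented for illustration): domain data lives in plain XML, and an XSLT stylesheet turns it into XHTML for presentation.

    # Minimal sketch of the W3C-era pipeline: data as domain XML,
    # transformed by XSLT into XHTML for presentation. Requires the
    # third-party lxml package; the vocabulary here is invented.
    from lxml import etree

    data = etree.XML("""<people>
      <person><name>Ada Lovelace</name></person>
    </people>""")

    stylesheet = etree.XML("""\
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://www.w3.org/1999/xhtml">
      <xsl:template match="/people">
        <html><body><ul>
          <xsl:for-each select="person">
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul></body></html>
      </xsl:template>
    </xsl:stylesheet>""")

    transform = etree.XSLT(stylesheet)
    print(str(transform(data)))
    # <html xmlns="http://www.w3.org/1999/xhtml"><body><ul>
    #   <li>Ada Lovelace</li></ul></body></html>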

XHTML added very few additional semantics, other than loose concepts that were worse than useless.

> It was marginally harder to author (the mimetype issue was ridiculous), but extending it was low hanging fruit.

It was difficult to author (back in the day, you could run random websites through the XHTML Validator and, despite their authors trying, find plenty of errors). The mimetype issue was that Internet Explorer would not recognize the proper XHTML mimetype, while other browsers wouldn't turn on XHTML mode when the mimetype was text/html.

https://webkit.org/blog/68/understanding-html-xml-and-xhtml/ -- this blog post was written as XHTML's last dying breath, and I think was the nail in the coffin for it.


It is amazing to look back at arguments against XHTML and to realise just how awful they really were.

W3C's XML (and also its semantic web stuff) is an excellent example of what not to do as a standardization body. XML may be important for enterprises, but what was achieved for the web except a decade of meta-discussion, resulting in HTML being taken away (or back, if you will) by browser vendors after years of stagnation?

There was zero need for XML; recall that XML is just a proper subset of SGML by its original developers, and SGML was used to specify HTML until version 4. And even today, SGML is the only game in town for parsing HTML (including version 5, up to minor trivialities) based on an international standard, whereas Hickson's/WHATWG's procedural parsing spec has become unmaintainable and isn't covered by a test suite, precisely because it doesn't follow a formal model: it started from a prose description of what an SGML parser does and lost track along the way. Not a great outcome for a markup language used daily by billions of people, especially since HTML the markup language hasn't really changed for a very long time, whereas everything around it (CSS, JS) had to change drastically to make up for HTML's stagnation and first W3C's, then WHATWG's fuckups.


XML may have failed as a syntax for HTML, but it was enormously successful in its own right.

W3C has lots of failures and bad standards. I don't think the base XML spec was one of them, even if I think JSON is better for most use cases.


It was successful in the 2000s, but then everyone figured out it’s pointless, and all data exchange in software written in the past decade is in JSON or possibly protobuf.

All things have a time. XML was successful in its. That it was eventually overtaken by competitors doesn't really detract from it being hugely successful in its time.

Even outside of the web stuff, the SPARQL standard is really old at this point and seems like it could really use a revision, but nobody seems to be working on it.

The RDF-star working group has just started, and they will update some parts of SPARQL, RDF, etc. They are also looking at making the specs easier to update, i.e. more like living standards.

Ian Hickson stepped down a few years ago and has started working on Flutter instead (Flutter started as a "what would a browser look like if we removed a bunch of legacy from the HTML spec" project).

It's now maintained by Domenic Denicola, Anne van Kesteren, Tantek Çelik, and a few others.


Let me clarify some things about the WHATWG HTML Standard:

The HTML Standard has not been maintained by Ian since 2015. It has since then been maintained by myself, Anne, Simon, and Philip. Affiliations during that time are, respectively: Google, Mozilla then Apple, Bocoup then Mozilla, and Google. (Indeed, mostly only browser companies have been paying people to be editors for web specs!) https://html.spec.whatwg.org/multipage/acknowledgements.html...

But the contributor pool is much larger. We're quite proud of the vibrant (not stagnant) community we've created, which is continually evolving and improving the spec. Contributions come from all corners; many are from people employed by browser engine companies, but others include students, web developers, consultancies like Igalia and Bocoup which various companies hire to work on web standards, W3C staff and members, representatives from server-side runtimes like Node and Deno, and so on. Some interesting pages to peruse might be https://github.com/whatwg/html/commits/main , https://github.com/whatwg/html/graphs/contributors?type=c , and https://blog.whatwg.org/ .

Overall I think it's pretty exciting you can run a successful standards organization like this, getting such high engagement levels despite employing no full-time staff and with operating costs being entirely server bills. Including, no membership fees. Which is perhaps relevant to the OP. https://twitter.com/Hixie/status/1603917371214729216

The WHATWG only includes features in our standards which have multi-implementer commitment: https://whatwg.org/working-mode#additions . This is different than some places where anyone can publish a "standard", even if the target platforms have no intention of implementing it. I don't think this reduces to "merely a memorandum of understanding"; I think it's best when standards reflect reality. Other SDOs can disagree, and that's fine; it's healthy to have a marketplace.

As for the scare quotes around "living standard", you might enjoy https://whatwg.org/faq#living-standard and the follow-up questions.


His bio says Protocol Labs.

Protocol Labs is the corporate entity behind Filecoin:

https://www.coindesk.com/markets/2020/10/15/filecoin-launch-...


Interesting that inside the thread above are threats against MIT, dramatic claims of MIT costing jobs during the holiday season, and even threats to transfer contracts/IP to China.

Seems like not a neutral thread, but posturing and propaganda in its own right from the W3C.

