New York Times doesn’t want its stories archived (theintercept.com)
67 points by kakokeko | karma 360 | avg karma 4.8 | 2023-09-17 06:05:45 | 97 comments




Doesn't surprise me. NYT is known for the quick takes and quiet edits.

Citation needed. They are much more well known for printing retractions and edits, e.g.

https://www.nytimes.com/2020/12/18/pageoneplus/editors-notes...


Never wrong for long?

Sadly consumers reward this behaviour.


Sounds like the BBC.

I was writing a script a few weeks back that polled BBC articles and did a git commit to a repo every time the text changed. One article had 6 edits within the first 20 minutes of being published, which changed the entire context of the thing.

Then I lost interest in it.


That sounds really interesting and a reflection of the BBC's generally worsening journalism. Do you have the script and repo available publicly? It would be interesting.

I would be embarrassed to publish it in its current state if I’m honest as it was written when I was drunk, cynical and after a breakup.

Please take the idea in concept and do a better job than I did!


I’m tempted! How did you store the content, as raw HTML or plain text? I imagine raw HTML would make automated diffs harder.

It was python so I used Beautiful Soup to scrape and extract just the text.
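For anyone who wants to pick the idea up, here is a minimal sketch of the extract-then-diff step, not the original script. It swaps in Python's standard-library html.parser for Beautiful Soup so it runs without extra dependencies, and it compares two snapshots in memory with difflib instead of committing to git. The names TextExtractor, extract_text and diff_versions, and the sample HTML, are made up for illustration.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the page's text, one chunk per line, so markup changes are ignored."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def diff_versions(old_html, new_html):
    """Return a unified diff of the extracted text (empty list if unchanged)."""
    old = extract_text(old_html).splitlines()
    new = extract_text(new_html).splitlines()
    return list(difflib.unified_diff(old, new, "before", "after", lineterm=""))


# Hypothetical snapshots of the same article, fetched at two different times.
SAMPLE_V1 = "<html><body><h1>Minister resigns</h1><p>Sources say he quit.</p><script>track()</script></body></html>"
SAMPLE_V2 = "<html><body><h1>Minister resigns</h1><p>Officials confirm he was dismissed.</p><script>track(2)</script></body></html>"

if __name__ == "__main__":
    print("\n".join(diff_versions(SAMPLE_V1, SAMPLE_V2)))
```

Diffing the extracted text rather than the raw HTML means a change to a tracking script or a CSS class does not register as an edit; only changes to the wording do.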

Not surprising. It tries to push ideas that fall out of favor and look unhinged in hindsight.

A good example of this involves calling the Iraq invasion noble. More examples in the thread.

https://x.com/alanrmacleod/status/1702336001778278816?s=46&t...


This was an opinion article though, right? The NY Times seems to have always had a fairly diverse range of opinion writers. I know I have even read criticism recently saying that the Times shouldn’t be giving such writers a platform, particularly when the writers are on the opposite end of the political spectrum usually associated with the paper, but I personally find it commendable. Mainly, if you have an opinion section but all of the writers have roughly the same opinion, that only contributes to the “echo chamber” effect. With that given example of the Iraq war being noble, even though many (but certainly not all) would strongly disagree in hindsight, it was definitely not a fringe opinion at the time. Would it really be better to have an opinion section that didn’t include popular sentiments at the time of publication, even if many readers disagree with them?

Obviously there’s a lot of grey area here, and I don’t think The NY Times always strikes the right balance, but the example of publishing a pro Iraq War opinion in 2003 doesn’t seem particularly unhinged.


If you step away from the tin foil for a moment, it becomes easy to realize that articles understood to be politicized or to spawn arguments which inform people's idea of what the New York Times is, are in fact a small sliver of their overall reporting. And if I were to get actual examples of the everyday ordinary reporting, it wouldn't easily fit into any partisan narrative that casts the New York Times as singularly and explicitly doing the work of pushing "ideas that fall out of favor."

Just looking at my RSS feed, some typical examples are a story about how Seneca Meadows landfill is going to potentially be expanded, how the earthquake in Morocco damaged historic sites, the fact that engineers had concerns about Libya's dams before they collapsed, and so on.

What becomes clear is that there's a wide swath of ordinary reporting that has no conspiratorial explanation and that constitutes the bread and butter of what real newspapers actually do.

It's like the problem with global warming deniers, who think that it's all about secret conspiracies, smoke filled back rooms and briefcases full of money. It has no explanation for the mundane everyday work of scientists who, say, study algal blooms or who update tables of spreadsheets showing the parts per million of nitrogen in the water. There's too much tedious normal work, and there's no explanation for it other than it just genuinely being the ordinary normal work of a legitimate field.


I can’t help but wonder if it is really about them not wanting ways around their paywall and are just using this as a convenient cover

Saying it's for the paywall and not about silently updating articles would be more respectable, imo.

They haven’t said why they made this change at all, this article is just speculating about the desire to be able to make silent edits.

The reasonable explanation would be to preserve their paywall. You know, that people tend to use “archive” sites to get around (e.g. archive.is).


I think if we’re going to judge motivation a key question is “did they specifically block the IA user agent?” or “did they specifically approve browser user agents and disallow the rest”?

Not defending it but I can absolutely see them blocking an absolute ton of user agents in the aim of protecting the paywall, rather than targeting this bot specifically.


Wouldn't a reasonable compromise be to deny archiving of articles that are very recent? Let's say a month of age. That closes the paywall loophole.

It's not even about the paywall IMO, but rather about preventing the archiving of versions before they were edited.

How about archiving them (every version) but not allowing access for say a month.

Early and late editions of newspapers were archived - at least in the U.K., and stored at the British library. Every version published should be archived, by an independent organisation (like the BL).


> archiving them (every version) but not allowing access for say a month

If the purpose is archival, the embargo must be at least one year. A lot of Times content is not breaking news, but investigative journalism, analysis, recipes—stuff which doesn’t lose value quickly.

> newspapers were archived - at least in the U.K., and stored at the British library

New York’s libraries have vast microfiche collections to which nobody objects. Going to the library to read the newspaper simply didn’t compete with the paper the way IA does.


This is going to backfire on them Streisand style. The more aware of this that people become the more likely it is that other people will take it upon themselves to preserve stories for IA.

I kinda feel it's mostly because of the paywall

As much as I like to read their articles for free - why should they?

They produce something of value and the standard way of making that happen in our society is to ask customers to pay. This enables them to put up the means of production and for example pay journalists and engineers.

Most of the work I do is also not free.


Is there any evidence anyone is using the internet archive to bypass the Times’ soft paywall?

Internet Archive is slow as hell. That would be a miserable way to read the Times. There are various higher performance ways to achieve the same result (eg disabling JavaScript for nytimes.com).


Internet Archive links are found in comments on HN all the time, not just for the New York Times, but for nearly any site using a paywall.

You are confusing the Internet Archive and archive.ph, which are very different things


And archive.is which is the one I see here the most.

Ok now I understand so many of the replies in this thread.

For anyone else reading this who doesn’t know: Internet Archive is archive.org aka the Wayback Machine. It takes like 10 seconds to load a page but has years old versions of pages.


I for one have done this, and there are websites and extensions that do exactly that: take a URL and return the archive URL to bypass the paywall. As much as I love the Internet Archive, I don’t think it is debatable that people use it for this purpose.

A tidy argument, complicated by the fact that the NYT justifies its existence at least as often in terms of public good as of selling a product.

> the NYT justifies its existence at least as often in terms of public good as of selling a product

It can be both. Public goods don’t have to be accessible to everyone to benefit society.


> justifies its existence at least as often in terms of public good as of selling a product.

I’d be troubled by that if they were also taking public money to do so (ie state subsidy), but they aren’t. So what they claim is kinda meh.


They shouldn't. But the Times is also notorious for changing things after-the-fact to better fit an editorial bias or to groom their own reputation. If they are indeed 'the newspaper of record' then unfortunately they need to be held to account for what I personally view as a violation of their duty to the fourth estate.

They still have a print edition.

Is this because so many of their articles don't look good in the light of history?

I have watched with sadness as the NYTimes has self-destructed over the past couple decades. I think it comes down to two main reasons.

1. Tech lag/resistance. They resisted digital transformation for too long and didn't understand what to do about that. By the time new media competitors had sucked up all their readers, it was too late and expectations around price and quality had shifted too much. The Times had plenty of opportunities BTW to partner with tech but were too arrogant/prideful.

2. Cultural irrelevance. The NYT is out of touch with the concerns and values of a significant portion of the American population. They were on their asses, having completely missed the populist Trump movement in 2016, and made a putative apology as 'the newspaper of record'. They then promptly fired their public editor and said 'twitter was enough to gauge the cultural zeitgeist' lol.

It used to be I could read the NYTimes, WSJ, and FT to feel sufficiently informed about the political & cultural issues I cared about. That is no longer the case.


The NYTimes stock is doing fairly well.

And so is Monsanto's and Nestle's and Exxon's.

The main reason their stock has been doing well has been the role of Trump in driving hardcore blue party partisans into subscribing to paywalled media outlets like the NYT "to resist Trump". Until 2016/2017, pretty much all of legacy media was on a trajectory towards bankruptcy but that changed once Trump won the 2016 election.

> By the time new media competitors had sucked up all their readers, it was too late and expectations around price and quality had shifted too much

Isn’t NYT one of the most successful media businesses out there? Don’t get me wrong, the bar is low, but that’s because the entire industry was decimated by the internet. IIRC NYT was one of the first to pursue a paywall subscription model so in that sense you could argue they were far ahead of their competitors.

> plenty of opportunities to partner with tech

What were they? Genuinely curious. I can’t think of too many examples of media partnership with big tech that has ever worked out for the media company involved so I’m curious if the opportunities they turned down were ever worth it.


I worked there for two years.

No they are not the most successful media business out there, and plenty of other entities were experimenting with and succeeding with paywalls long before the Times. The only reason they've benefitted in recent history was thanks to the hyper-polarization that happened after Trump's election, and they've definitely leaned into that.

Re: other opportunities, the NYTimes has been approached by multiple tech companies including AAPL under Jobs' watch. An example of big tech/media partnership working out well comes from the related Pixar magic.


> only reason they've benefitted in recent history was thanks to the hyper-polarization that happened after Trump's election

The Times broke away from the crowd long before Trump. It was arguably one of maybe a handful of American newspapers that made the digital transition successfully.


Just going by stock price alone: before Jobs’ death NYT stock was around $8. Today it is around $40.

We’ll never know the outcome of a Jobsian Apple partnership but I struggle to see what has happened as a mistake. Judging by stock price the company seems to be doing fine without President Trump (and their cooking, games etc seem like sensible diversifications) so again, difficult to see the failure.

IMO a better comparison than Pixar is The Washington Post. Bought by Bezos, bundled with Prime membership etc… should have set the world alight. I thought it was game over for the likes of NYT. But doesn’t seem to have been that successful. These things feel far from guaranteed.


Shareholder value is probably the last measurement I'd use when it comes to assessing the value of a company.

Operation Mockingbird never ended. Full stop.

(2010) https://weirdshit.blog/2010/07/23/cointelpro-operation-mocki...


It’s always funny how many tech people on here totally devalue others’ work and think it should be given away for free.

Obviously they don’t want their stories archived just like most tech employers don’t want their code archived.


code is not an end product. a story is.

And therefore stories need not be compensated?

Who said that?

Makes no sense. Are building plans an architect produces an end product? There are multiple end products as the lifecycle is very long.

Did you actually read the article?

Unlike their paper counterparts, online papers are subject to article editing or outright article deletion.

That's what's at stake.

Remember the newspaper articles in Orwell's 1984 being edited on the whims of the party's current views?


> online papers are subject to article edition or outright article deletion

The status quo of the Internet Archive serving as a paywall workaround obviously doesn’t work, though. And plenty of public libraries archive newspapers. (To say nothing of the print edition.)


So what? Code gets deleted too. Why do you think you’re entitled to versioning?

These articles not only record history but create history, in part by swaying public opinion. To edit an article years later so that it's not what was originally published is a disservice to historians.

Code from big tech definitely sways public opinion just as much or even more so by controlling what is seen and when as well as by who. So you agree then code should also be free?

not the same thing. more accurately, while i rarely have the need to run decades old software, i should have the right to. look at attempts to preserve old games. it's the same story. it should be legal to archive copies of old software. ideally with the source. that doesn't mean that the source should be free now, but it should most definitely be available once the copyright expires.

So you agree old code should be free?

once the copyright expires it is obviously going to be free, yes. but someone needs to preserve the code until then.

Ok. You can give away your code now so it can be “preserved” and go into the public domain in 70 years. I prefer to get paid instead since freely available code is not going to be as profitable.

you do not need to give away your code in order to allow it to be preserved. it could be locked somewhere, only to be made public after the copyright expires.

It’s both things, though. You can’t maintain an archive of content without bypassing the NYT paywall. To me the actual issue is a profit-driven business producing goods of important public value. There’s an inherent contradiction in there that doesn’t seem easy to solve.

Code has value over time whereas NYT content does not, unless it’s to make a case why one shouldn’t put too much weight in what it publishes.

Code is facts about an organization's business. To the extent factual information is created by the NYT it most heavily skews toward sports scores and weather, which is not the NYT's property to preserve.

I don’t have an issue with their paywall but I think the most valuable part about their content after a week or so is to point out how wrong they typically are. My guess is, removing the ability to call BS is the motivating factor here.


> Code has value over time whereas NYT content does not, unless it’s to make a case why one shouldn’t put too much weight in what it publishes. Code is facts about an organization's business. To the extent factual information is created by the NYT it most heavily skews toward sports scores and weather, which is not the NYT's property to preserve. I don’t have an issue with their paywall but I think the most valuable part about their content after a week or so is to point out how wrong they typically are. My guess is, removing the ability to call BS is the motivating factor here.

This is a good example of what I’m talking about. Tech bros devaluing other professions’ work, lol.


If most of what was written wasn’t utter tripe the “tech bros” would probably value it more, lol.

If the articles didn't preserve their value over time, then people wouldn't care to archive them.

That's not an accurate summation of anti-copyright sentiment.

The idea is that people should be paid for the work that they do and they shouldn't be able to continue getting paid after they've completed the job. e.g. If I employ a builder to build a wall, I would expect to pay them for their time, effort and materials, but when the wall is finished, I don't want to continue paying them and their descendants.

The fundamental idea behind copyright is to provide an extra incentive to get people to create original works and, in return, those works should end up in the public domain. However, with copyright terms being so extreme, many works can disappear before they can enter the public domain, which is fundamentally breaking the "contract". If a newspaper doesn't allow archiving to preserve historically interesting documents, then they shouldn't enjoy the benefits of copyright as they won't be fulfilling their side of the deal.


And how do you reconcile your view with modern software development? Let me guess, software is different because it requires “maintenance”?

How convenient that most creatives, except software developers, end up having to give away their stuff lol.

Also the “contract” you’re referring to doesn’t seem to be rooted in any law, but rather your opinion.


Are you not familiar with open source or do you not understand it?

Without referencing a specific license your comment isn’t really useful.

> And how do you reconcile your view with modern software development?

I mainly work with Linux and open source software, and though I'm not a software developer, I'm more than happy to share any BASH scripts that I've written with anyone that wants them.

> Also the “contract” you’re referring to doesn’t seem to be rooted in any law, but rather your opinion.

From https://en.wikipedia.org/wiki/Copyright_law_of_the_United_Ki...

> The modern concept of copyright originated in Great Britain, in the year 1710, with the Statute of Anne. This Act prescribed a copyright term of fourteen years, and let the author renew for another fourteen years, after which the work went into the public domain. Over the years, additional acts and case law steadily refined the definitions of what could be protected, including derivative works, and the degree of protection given.


New York Times is an American company

From https://en.wikipedia.org/wiki/Copyright_law_of_the_United_St...

> United States copyright law traces its lineage back to the British Statute of Anne, which influenced the first U.S. federal copyright law, the Copyright Act of 1790. The length of copyright established by the Founding Fathers was 14 years, plus the ability to renew it one time, for 14 more. 40 years later, the initial term was changed to 28 years.

and

> The goal of copyright law, as set forth in the Copyright Clause of the US Constitution, is "to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." This includes incentivizing the creation of art, literature, architecture, music, and other works of authorship. As with many legal doctrines, the effectiveness of copyright law in achieving its stated purpose is a matter of debate


Not relevant. - wiki pages aren’t laws.

Are you denying that copyright laws exist?

Reminds me of the reaction whenever the controversy over training sets of artwork rears its head, honestly.

I find it hard to make sense of the idea of for-profit, corporate journalism or why we find the practice acceptable. For example, consider the following seemingly mundane fact: The New York Times does not report critically on itself. Its investigative journalists don't probe into the newspaper's internal affairs, and you'll never see a headline like "Inside the Machinery of Deception: How the New York Times Manipulates Public Opinion" on its front page. In a way this is completely unremarkable and to be expected. But the more I think about it, the less it makes sense. Aren't we supposed to think of the Times as engaged in the disinterested pursuit of truth? Why should one particular region of reality (one of the most prominent American institutions no less!) be arbitrarily closed off to its own scrutiny? Should that not call into question its claim to disinterestedness?

Consider how the NYT might try to justify not critically reporting on itself. It can't say (1) "any wrongdoing on our part is not newsworthy": on the contrary: it'd be the most newsworthy thing in the world! It can't say (2) "our journalists don't have resources to investigate the topic": on the contrary: that's the one topic its own journalists are most well-positioned to investigate! The only possible remaining defenses I can see are: (3) "we simply don't engage in unethical practices, so there's nothing to report" or (4) "what do you expect, we're not gonna do what's bad for business". Needless to say, either defense would make it very hard to continue taking seriously its claim to disinterestedness.


> Aren't we supposed to think of the Times as engaged in the disinterested pursuit of truth?

No.


The Nytimes and its employees are deeply concerned with the prestige of the paper.

In fact, much of its terrible strategy when it came to digitization, major political moments, and its own journalistic integrity (The Jayson Blair affair comes to mind) stems from this navel gazing.

Customarily the ombudsman is supposed to do what you describe; the NYTimes got rid of theirs in 2017: https://www.nytimes.com/2017/06/02/public-editor/liz-spayd-f...


So what’s your proposal for funding the work that goes into journalism?

Doing it commercially is apparently problematic, but so is the government running the show, and nonprofits and volunteers doing it also have problems.

Guess it’s all problematic, we should just not have any journalism


Aside: “Problematic” is such a terrible word, uniformly intended to cast aspersions.

It’s reality: everything has positive and negative qualities in various proportions. Instead of implying that something is “bad” by virtue of being “problematic”, why not simply state what the specific problem is? Because then you can’t shit talk the entire affair, I suspect.

Perhaps it was initially intended to be useful in this way, but it has been co-opted to non-specifically disparage.


After what was probably the most impactful case of news media influencing public opinion in recent memory (the NY Times / Judith Miller reporting on Iraqi weapons in 2001-2003), the NYT did in fact publish self-critical pieces.

https://www.nytimes.com/2004/05/26/world/from-the-editors-th...

https://www.nytimes.com/2004/05/30/weekinreview/the-public-e...

https://www.nytimes.com/2005/10/16/us/the-miller-case-a-note...


You're missing "recusal" as a reason, saying they can't report objectively on it due to closeness to the sources.

It's clear we're seeing a revolution in funding models for media companies of all kinds from Spotify to Motion Picture/TV Producers and Journalism/Journalism-like enterprises.

The core commodity of these businesses used to be Information, the Tapes/Records/Printed Newspapers. Now that such information has been quite literally made free by the advent of digitization along with the internet, these entities need to find new ways to create scarcity from which to profit. Some respectful ways that the Times may do this:

'Private' Investigative Journalism: Open an office where companies can pay their skilled, resourceful journalists to promptly investigate a lead. Ideally the results would be published afterwards but I don't see a reasonable mechanism to enforce that.

Physical Installations: Use immersive media to create a multi-dimensional storytelling experience. Sell tickets if it's 'fun' enough, or seek corporate partnerships that benefit from this message who want to sponsor.

Sell Premium Access of 'specialty' analysis with SLAs to other Corporations: Reuters sells Refinitiv/Westlaw, Bloomberg has their Terminals... Publicly publish the data every hour but within 50ms to paying organizations.

I don't see it as my job to find a business model for these entities, but I do see a bunch of opportunity in B2B relationships. I hope more knowledgeable folks can do better...

I think it's a sad but telling sign the so-called leadership at these collectives is dedicating effort to the paradoxical, Sisyphean task of stopping casual readership from accessing/sharing the content they worked so hard to create instead of seeking new avenues that exist in abundance.

As other commenters have noted, having a constant, public and freely auditable record is the only way to keep these entities honest.


The New York Times maintains archives going back over a hundred years, so there's no risk of loss of information over time, which is why the article leans into the "stealth edits" problem over the loss of information problem. The NYT doesn't need the IA and has every right to block it.

Ya right.

That's why I choose newspapers that provide a PDF version of their newspaper :)

I'm not sure what the outcome was but at some time Portugal's Expresso's archives were deleted by a cyberattack. At the time (never rechecked) I got the feeling the older articles were lost.


That relies on NYT still being in business to make their archives available, among other failure modes.

Historians, genealogists, etc. find it valuable to look at news reports from several hundred years ago. It's important to society for a prominent paper's articles to be archived somewhere independent of the news organization itself.


> It's important to society for a prominent paper's articles to be archived somewhere independent of the news organization itself

Plenty of libraries archive the Times. It still produces print and microfiche editions.


> stealth edits

This is one of my biggest pet peeves with the media. They will get something flat out wrong, stand beside it, and then much later go back and secretly change it without noting anywhere that it was changed. I would trust an archived New York Times story over one that was in the NYT archives.


I wouldn’t attribute this to some desire to avoid accountability to stealth edits. Sure, archiving and accountability may be lost, but this is about money.

NYT is behind a paywall. Internet Archive is a well known way to get around that paywall. Managers made a profit based decision but don’t want to confirm it, because that would make the NYT look too much like a profit driven corporation and detract from their journalistic image. But they care about the money angle way more than they care about someone knowing if they edited a section title after publishing.


This is dumb.

The New York Times will send the entire text of any article to any stranger over the internet.

Then they try to use soft conventions to get people to subscribe.

* they will block your access to the text you already downloaded via JavaScript after X reads. Smart people turn off nytimes.com JavaScript in their ad blocker.

* they will block archiving via robots.txt. This is basically a social request document. Maybe some day it will have legal force but it simply does not. Internet Archive could simply choose to ignore it.
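To make the robots.txt point concrete, here is a small sketch using Python's standard urllib.robotparser. The robots.txt text below is a made-up example, not a copy of the Times' actual file, and the crawler name "archive.org_bot" is used on the assumption that it is the token the Internet Archive's crawler identifies itself with. The helper is_allowed and the example URL are likewise hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
EXAMPLE_ROBOTS_TXT = """\
User-agent: archive.org_bot
Disallow: /

User-agent: *
Allow: /
"""

# Placeholder URL, used only so the parser has a path to test.
URL = "https://example.com/2023/09/17/some-story.html"


def is_allowed(robots_txt, user_agent, url):
    """Ask what robots.txt requests for this user agent; nothing enforces the answer."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("archive.org_bot", "SomeBrowser"):
        print(agent, is_allowed(EXAMPLE_ROBOTS_TXT, agent, URL))
```

The check happens entirely on the crawler's side: a well-behaved bot calls something like can_fetch before requesting the page, and a bot that never asks gets the same response from the server as anyone else.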

This is all fine for encouraging people to buy a subscription. But it’s not going to stop people who know what’s actually going on. If the Times wants to protect its IP it should require a login to view content - always.


SPJ code for journalists: An ethical journalist acts with integrity.

So, what happened to NYT and WaPo? Are they saying that they are full of unethical journalists?


As Judge Judy says, "If you tell the truth, you don't need a good memory."

The NYT is notorious for lying and changing its stories time and time again. They don't want people saying "Hey, you printed this six months ago, and now you're telling us the opposite."

