The reason for doing so is minimising the risk if an attacker breaks into a server. This is also why systems like ATMs should be patched. Untrusted code execution on servers and ATMs may not be the norm, but it's far from impossible.
Ok, sure, but we have to consider whether that is worth it, and if they can run unprivileged code, there is a good chance there is a software privilege escalation available anyway.
Of course on shared servers this is going to be a nightmare.
> "couldn't they just use a myriad of other priveledge escalation bugs?"
The idea is to patch all known privilege escalation bugs.
It's rare for an attacker to rely on a single attack vector. They may get to the point where they only have limited access to running code, and want to break out of that sandbox. Spectre/Meltdown may be the route they take to do so.
...at which point that untrusted code gains full access to the only userspace process that matters on that node. If you gain the ability to run code in a server process, why escalate further when you already have access to everything that matters?...
Assuming bare metal. In shared hosting / cloud / VMs it is different.
That only matters if not everything is in the same sandbox. Many non-cloud servers probably either only run a single service, or they run all processes as the same user, which means that a sandbox escape doesn't really matter. You can't lose any security layers you never had in the first place.
This is a great point. Probably someone will argue that properly configured Apache will not have access to data ... etc. I think the practical reality is that in many setups it already does, and what these hardware bugs mean is that the lazy, imperfect, effectively single-layer security that probably exists on the majority of servers is now essentially equivalent to the security of systems where the security-minded have been incredibly diligent (and at considerable cost) in ensuring multi-layer security ... etc. So in some sense it is an attack on their value system and their worth/usefulness.
I think this is a fantastic insight. There is a particular mindset of security thinking which compartmentalizes breach impacts on the basis of how much of the security infrastructure itself is compromised. 'well, they get remote code execution, but at least they can't recover passwords', or 'this vulnerability is bad because it allows recovery of temporary TLS keys'. And Spectre/Meltdown seems to turn every vulnerability into one of these 'world shattering' security breaches that mean your secret keys are all exposed.
But for servers, the application-level vulnerabilities that are needed in order to get meltdown or spectre attacks to run are already devastating. Take over a game server process and you own the in game currency and the scores and the ability to ban users, and probably user level login as well. And you have your pick of privilege escalation mechanisms already, probably.
If you’re running an SSL terminating web server in front of your application server (on the same node), these exploits would allow you to read the ssh private key from the front end, AIUI.
Would be funny to see a game server exploit (since this was about Epic Games here) via specially crafted network packets that look like valid game messages. Has this happened before? Seems natural that those game coordinator backends and such should have security holes too.
Shouldn't be any different from any other server. I've not heard of it happening. However, I've heard of many bugs in Overwatch which let you crash the server via in-game actions and kick everyone out.
So that in case there is a security hole in their webserver (Apache, Nginx, HAProxy, etc.) or their application (Wordpress, etc.), an attacker cannot escalate privileges even further. Imagine someone manages to execute arbitrary PHP code; now they can gain root access, read private keys, etc. Things PHP should not have access to.
For Spectre, the consequences of not applying those patches are not as bad, but it wouldn't surprise me if you could still do a lot of harm. PHP is somewhat sandboxed these days (by using chroot and whatnot) and Meltdown/Spectre could be used to escape the sandbox.
Yes you can? If password login is enabled, you could wait until someone SSHs in and read their decrypted password. You could read all kinds of secrets which will be very useful for gaining privileged information. Sure you can’t just become root but it is one way to get there.
For my server, the attacker could probably read mosh's secret key which is used to authenticate mosh clients. Mosh is a "mobile shell" which is resilient to network changes (including IP changes), is low bandwidth, and is extremely quick on Edge connections.
An attacker can probably come up with many scenarios where reading arbitrary memory can be used to gain root access. Granted, the low bandwidth aspect of this attack might make it difficult, but far from impossible.
If you can dump the memory holding security certificates, I'd suggest you probably can.
There will be processes outside a user's control that run with elevated privileges; attacks that allow you to monitor the setup of these processes could prove risky to the overall security of a system.
Launch a chrooted daemon that uses Meltdown to read protected memory to discover when a local console user (presumably root/sudo) logs-in, then read the protected memory of the keyboard input buffer (keyboard interrupt handler? Console buffer?) to grab their password, then mail() it off to your email account and login as root via SSH from the comfort of your own home.
In principle you don't need to protect against spectre and meltdown if you can ensure that no untrusted code runs on a particular machine. How do you ensure that though? There are many many routes for that to happen, and it only has to happen once for it to completely invalidate your security model if you have no other levels of protection.
On a server you can run untrusted code by exploiting other vulnerabilities in server software such as buffer overruns. It's unrealistic to expect that all servers will have no such vulnerabilities in any of their software at any time. That's why operating systems implement many layers of protections to prevent those exploits from being elevated into even worse breaches that could leak sensitive data, provide local root access, and so on. One of the more significant protections that exists today is address space layout randomization, which makes it much more difficult for a buffer overflow exploit to result in immediate privilege escalation.
Meltdown/Spectre make these sorts of issues much, much worse. They mean that absolutely any vulnerability in any service could be exploited to read the entire contents of memory and deliver that to an attacker, including encryption keys, including passwords, and so forth. That's bad. You can essentially 100% guarantee that there will be some vulnerability in some service which enables a minor exploit, and you want to avoid having that minor issue become catastrophic through Meltdown/Spectre based attacks.
The replies are missing something from the original article.
They are running on the cloud, and Meltdown / spectre means that exploits can escape a VM. This means you don't just need to trust your VM, but also any VMs you are sharing the hardware with.
I suppose if a Cloud provider could ensure that all your VM instances run on the same host and no other VM is allowed there then the issue would be a bit mitigated.
Although this restriction on how the cloud provider is allowed to schedule your VMs probably would somewhat defeat the point of cloud hosting in the first place.
Out of curiosity, how many VM/container instances usually run on a physical host at any given time (for your typical cloud computing provider)?
AWS provides the ability to guarantee you don't share physical hardware with other customers with "EC2 Dedicated Instances" (https://aws.amazon.com/ec2/purchasing-options/dedicated-inst...). I'm not familiar enough with other cloud provider offerings to say if they do or do not have similar features.
With AWS, and I assume most other cloud computing providers, you can pay extra for your instances to run on a host without someone else's VMs. You probably should be doing this for any servers where you handle sensitive data, but it is a place where many will be cutting corners.
You don't really even have to pay extra. Just use the biggest instance size and you are guaranteed to be isolated because there is no room for anyone else.
Granted, that only works for workloads that are spread across enough small instances.
Only Spectre CVE-2017-5715 can escape a VM, not Meltdown.
Even then, you don't need to trust your co-tenants. Updating the host (which your provider has already done, or which is anyway out of your control) is both necessary and sufficient to protect from VM escapes.
I think he's saying that unpatched, a co-resident VM can compromise the host even if you are locally patched.
If the host is patched this doesn't matter (for VM escapes); however, for those on public clouds that will apply the host patching globally, you will be eating the performance degradation whether or not your workload cares about it. I don't think AWS et al will be offering a "run unpatched instance" option. Or maybe they will, this looks pretty devastating for some workloads; on the other hand, cloud providers could actually end up making a lot of money from this issue...
If, as you say, AMD is not affected by Meltdown unlike Intel, will this significantly change the server market? Excuse the pun, but would this make e.g. AMD EPYC a lot more attractive for such data centers?
Yes to the bounds check violation version, so far no to the BTB poisoning version. The bounds check version only works inside a single process, so is only relevant when you run untrusted code within the same process as some private data (such as JS in a web browser).
AMD claims that they believe that the way their branch predictor works effectively makes the BTB poisoning unusable, but there is no actual proof, and their statement regarding it was much more wishy-washy than they were with meltdown. (Which they specifically state they are completely immune to.)
Regarding branch predictor poisoning, if AMD doesn't update the branch predictor based on the evaluation of branches that occurred along a speculative path, then the Spectre exploit (which, from my understanding, trains the branch predictor using speculative code) won't be able to work.
Considering the performance impact, I wonder how console manufacturers are going to handle this (assuming that the processors they used are vulnerable to Spectre/Meltdown).
If they are going to block JavaScript, they need to do it for all domains, not just untrusted domains. For example, if I can MITM my own website activity (which I can, by having a device that sits between a router and a console), then I can change the JavaScript coming from trusted domains.
In other words, if I visit a site like Google or Facebook on an affected device, I can change the JavaScript that is run, and make it still appear like it came from a trusted domain.
You can be prevented from a successful MITM by enforcing HTTPS using HSTS and pinning the keys of trusted sites. That's not a viable solution for accessing the internet at large, but for a console that is only incidentally used for web browsing, it's absolutely an option.
There are dozens of ways to prevent this, viable because you fully control the hardware and software stack of the terminal (and only allow contacting servers of your choosing).
I imagine they are more concerned about game authors snooping keys or whatnot.
You can't just mitm when the JavaScript is served over https, unless you also can install your own certificates on the machine you're trying to hack. But when you have permissions to install new certificates you probably don't need the hack in the first place.
The Nintendo Switch generally only allows white-listed pages. So the attack vector is pretty small.
The other consoles, yes, should be possible. But I think not too many people use their consoles for browsing.
It will show any captive WiFi portal, however, so perhaps watch out for that.
In addition they do have a store with indie games which seems to be more or less open to anyone (I assume there are some checks but I wouldn't trust them completely) much like Steam.
The requirements for getting onto the various platforms are pretty stringent. It's nowhere near as easy as getting onto Steam. You have to go through quite a rigorous certification process before you are allowed to ship your game on the online stores.
I wonder if you can use Spectre to steal signing keys from the device and jailbreak them. I guess they probably only have the public keys though, for verification.
The Switch only allows whitelisted pages, but sometimes loads them over HTTP (notably Puyo Puyo Tetris opens a web applet to http://sega.jp), so you can MITM yourself and load arbitrary content including your own JS.
Not unsigned code, but Xbox One runs apps from the Store. There should be a fair amount of checks before an app lands on the store, but you can never be sure.
I'm not sure about the PS4 and XBone but the Switch uses the Tegra X1, which has 8 ARM cores (4 Cortex-A53 cores, 4 Cortex-A57 cores), 4 of which (the Cortex-A57 cores) are vulnerable.
Vulnerable to Spectre, yes, like nearly all speculative execution processors. The PS4, XBone and Switch are very likely vulnerable to Spectre all the same but that will probably mostly affect their web browsers.
None of these is vulnerable to Meltdown. That is an Intel/Apple/ARM Cortex A75 only thing.
It affects WAY more ARM processors than A75. All the fast ARM cores are vulnerable to one of the two meltdown flaws and half are vulnerable to a subset of the third flaw (with A75 being completely vulnerable).
> All the fast ARM cores are vulnerable to one of the two meltdown flaws
Do you mean Spectre? Meltdown is the Variant 3 in that link.
> It affects WAY more ARM processors than A75. All the fast ARM cores ...
The ARM cores listed on that page are not all the ARM cores in the market (or running in the real world). There are many more ARM cores than are listed in that table, and the page says any ARM core not listed in that table is not vulnerable to any variant of Meltdown or Spectre.
> It could be still a problem though, considering many console games use all kinds of hacks to barely run at the target resolution/framerate.
If you had to slow down any of the consoles by 10% there would be tons of problems in the games. As you said, those games are highly tweaked to just barely run on those CPUs/GPUs.
I'd encourage anyone seeing super huge hits to make sure they are not using paravirt (particularly on Amazon). As that needs mitigation at the virt level, the impact seems very large.
There's potential for a little rearchitecting to help, at least in the case of UDP:
NAME
       sendmmsg - send multiple messages on a socket

SYNOPSIS
       #define _GNU_SOURCE         /* See feature_test_macros(7) */
       #include <sys/socket.h>

       int sendmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen,
                    unsigned int flags);
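For illustration, a minimal sketch of what batching outgoing datagrams with sendmmsg() could look like; the surrounding server state (send_batch, clients, payloads) is hypothetical and only meant to show the shape of the call:

    /* Sketch: batch N outgoing UDP datagrams into one sendmmsg() call
       instead of N sendto() calls, so the per-syscall (and post-KPTI
       kernel entry) cost is paid once per batch. */
    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define MAX_CLIENTS 64

    int send_batch(int sock,
                   struct sockaddr_in *clients,  /* one destination per client */
                   char (*payloads)[512],        /* one datagram per client    */
                   size_t *lengths,
                   unsigned int nclients)
    {
        struct mmsghdr msgs[MAX_CLIENTS];
        struct iovec iovs[MAX_CLIENTS];

        if (nclients > MAX_CLIENTS)
            nclients = MAX_CLIENTS;

        memset(msgs, 0, sizeof(msgs));
        for (unsigned int i = 0; i < nclients; i++) {
            iovs[i].iov_base = payloads[i];
            iovs[i].iov_len  = lengths[i];
            msgs[i].msg_hdr.msg_name    = &clients[i];
            msgs[i].msg_hdr.msg_namelen = sizeof(clients[i]);
            msgs[i].msg_hdr.msg_iov     = &iovs[i];
            msgs[i].msg_hdr.msg_iovlen  = 1;
        }

        /* One kernel crossing for the whole batch; returns how many
           messages were actually handed to the kernel. */
        return sendmmsg(sock, msgs, nclients, 0);
    }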
Yep, I was actually debating whether I'd mention sendmmsg/recvmmsg in my original post but I left it out. Definitely an option for UDP, but you're out of luck if your game server uses TCP (surprisingly many do) because you have to recv from each socket separately.
There may still be some options depending on the structure of your server and where the added CPU load is hurting most, for example, shunting IO to a thread/threadpool where futex() calls (if necessary) only occur for every N IO requests, rather than paying the syscall price for every IO on the main thread. But that might introduce new latency/ordering problems of its own.
Latency is more important than reliability for online gaming because the world state instantly gets stale. Instead of retransmit you want latest snapshot. I'd be curious to find out in what online games that is not the case.
Latency isn't an issue today whatsoever. It's not like UDP has some magical lower latency today; it had that in the past because of smaller packets and slower computers, but today?
Online games work today with fixed ticks and polling, usually at half of the full frame rate, which means that the server updates and polls the client 30 or 60 times a second, or at some other even multiple of the expected synced frame rate.
I'm calling bullshit. TCP has to retransmit lost packets whereas UDP can keep on going without waiting for transmission. Ephemeral data like controller input or past game states can be ignored because that time has passed, while TCP is still trying its best to get the packets there in order and reliably.
Latency is still absolutely an issue. I play games from Japan with my friends in the states and I often have a ping of 140 ms or so. That is latency, and properly implemented games (Rocket League, for example) will deal with it using UDP among other techniques.
Slower paced games can still use TCP though, because they are less latency sensitive.
>This fixed and predictable rate pretty much means that UDP is near pointless and games that still don’t have a predictable tick rate and use UDP tend to be a rubber banding lag fest.
You're wrong. Head of line blocking is a real thing that happens very often in TCP.
Edit: parent poster removed that part of the comment between me reading and submitting a reply
If it's a latency sensitive (like a twitch FPS), UDP is the way to go. Having up to date data is more important than having all the data.
If it's a synchronized game (like a turn based game or an RTS where all clients run at the same logic framerate), or not latency sensitive (an MMO like WOW), TCP is fine and probably easier.
That only works if messages are independent of answers received and are all known at the same point in time. In most games this typically would not be the case, you'd use a message to cram as much state change into it as is known to keep the game moving fluidly. Packing more than one such message together would serve no purpose.
I'd have presumed otherwise, but I'm not sure if you're understanding the API correctly.. it's not about sending multiple messages to the same destination, but to multiple destinations in a single call. The msg_hdr struct has room for specifying the target address.
From userspace's perspective, even if the same data isn't being broadcast to every client, just building up a big array (perhaps while looping over the input from recvmmsg()!) and spitting it out once would have the same semantics as just calling sendmsg() immediately on each, etc.
Yes, I understand the API correctly. Having implemented it once I think I have the basics down ;) But that said I was assuming that this would be in the context of multiple UDP messages sent from a game client to a game server.
The bottleneck generally isn't on the client side for games; the server has much more network traffic to handle. So even if this performance fix only works server-side, that might be enough.
It's pretty common for FPS server game loops to read all the network packets, update player state, run one tick of game logic and physics for all users in a single game, and then send out updates to everyone.
Yup, this is pretty much on the mark. Game updates are sent at a fixed rate (usually 10-20Hz) so you could batch up all outgoing game state into a single dispatch. It'd probably take a bit of work to re-architect the output loop (pre-allocate for the max number of players to avoid per-frame allocations) but it should be reasonably straightforward to implement in most engines.
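As a rough sketch of that kind of loop (assuming a non-blocking UDP socket; process_packet() and the tick/send logic are hypothetical placeholders, not any particular engine's API):

    /* Sketch of one server tick: drain all pending client datagrams with
       a single recvmmsg() call, run game logic, then batch the outgoing
       state (e.g. via sendmmsg()) so each tick only pays a couple of
       kernel crossings for network IO. */
    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define BATCH      128
    #define PACKET_MAX 1400

    void run_tick(int sock)
    {
        static char buf[BATCH][PACKET_MAX];
        struct mmsghdr msgs[BATCH];
        struct iovec   iovs[BATCH];
        struct sockaddr_in from[BATCH];

        memset(msgs, 0, sizeof(msgs));
        for (int i = 0; i < BATCH; i++) {
            iovs[i].iov_base = buf[i];
            iovs[i].iov_len  = PACKET_MAX;
            msgs[i].msg_hdr.msg_iov     = &iovs[i];
            msgs[i].msg_hdr.msg_iovlen  = 1;
            msgs[i].msg_hdr.msg_name    = &from[i];
            msgs[i].msg_hdr.msg_namelen = sizeof(from[i]);
        }

        /* Up to BATCH datagrams in one syscall; MSG_DONTWAIT so an empty
           queue doesn't stall the tick (n is -1 with EAGAIN in that case). */
        int n = recvmmsg(sock, msgs, BATCH, MSG_DONTWAIT, NULL);
        for (int i = 0; i < n; i++) {
            /* process_packet(buf[i], msgs[i].msg_len, &from[i]); -- hypothetical */
        }

        /* ...run one tick of game logic here, then send the batched
           updates for all players in a single sendmmsg() call... */
    }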
If they are using socket API for UDP, performance is not critical for them. Otherwise porting UDP servers to DPDK/netmap is not rocket science and gets you like an order of magnitude better performance.
A shooter running at 30Hz (high side for free developer sponsored servers) with even 100 players is only going to be processing at most 10-15K packets per second, and that’s assuming 5 packets per tick per player. Server update rates are usually lower than the internal physics and game logic tick rate as well, so it’s doubtful to even be that high.
These aren’t stats of the gameplay servers, these are the backend servers that handle matchmaking, player stats, inventories and progression.
You could use it, but you would probably have to rearchitect the server to use green threads to avoid the overhead. Frequent syscall sends are done in games to keep the latency as low as possible. Any batching would increase delay.
Since they describe this as log-in issues, should we expect that to be a server using lots of UDP? Or do you think that the log-in service is hemmed in by the load on the game servers?
No, login in Fortnite goes over port 443 and uses normal HTTPS.
It also uses HTTPS to load a lot of other data such as server data, friends list, chat, etc.
Unreal Engine comes with a version of Chromium built in, which is used for many in-game things like social tabs, news, and in-game purchases; these all work over HTTP/S.
Right, that's what I was guessing and am not sure why I was downvoted when everyone is confirming that contrarian's post doesn't really make sense since UDP isn't very relevant here.
Are you sure? I would be extremely surprised if it wouldn't use UDP.
For things like getting stats, probably from an HTTP endpoint, sure, but for gameplay? The lag would be very bad, no? Lose a packet and everything is slowed down.
And according to Wireshark it's used heavily when in a game, so I assume that's the gameplay protocol. Also, when I left my game (but stayed in the lobby), port 61879 immediately stopped listening.
I'm not sure about UE4, but previous versions of the unreal engine used UDP for replication and RPC.
Interesting. I know World of Warcraft uses TCP, but I imagine it's less "real-time" than shooters (ie: no hitscan) so a few dropped packets wouldn't have a huge impact (ie. if you're standing casting a spell for 2sec, the game can recover the lag easily). Didn't know some shooters used TCP
Nope, it might use TCP for negotiation of things that aren't time sensitive but it uses UDP for replication[1] as has pretty much every Unreal or Quake based engine since they were first developed.
I've worked on a variety of engines which were either UE or Quake based. All of them use UDP for temporal game state updates to avoid head of line blocking[2][3].
You can basically have bigger UDP packets, but they will get fragmented and may not make it to the other end.
As far as I know, in Linux there is no support for multi-packet UDP vectored I/O. I wonder if it would be possible to "simulate" that with a raw socket....
Game servers don't handle that many packets because they're limited in the number of players they host. A game at 60Hz with 64 players will only receive 3840 packets/sec.
>We wanted to provide a bit more context for the most recent login issues and service instability. All of our cloud services are affected by updates required to mitigate the Meltdown vulnerability. We heavily rely on cloud services to run our back-end and we may experience further service issues due to ongoing updates.
So they are saying "you can't log in because we're too cheap to rent more servers now that this patch has increased CPU usage"
Only if the algorithm can be split across multiple machines easily.
Sometimes people assume that you can use local shared memory or something between threads in order to synchronize state. You figure out how many individuals can be on a server at once and then ensure that you can handle that load on a specific machine.
I've seen this type of stuff for game state before, because they need to keep everyone in a specific game domain (level or city depending on the type of game) synchronized and nearly real-time. It can be hard to pull this off across different machines without introducing significant latency; Redis or DBs are slow for an FPS shooter.
Not saying this is the case but I can see it could be something like that.
Yup. When I was in a team working on distributed game servers, this forced us to shard games instead of distributing individual games.
Terminology wise, if a game session was fully distributed across multiple instances, each server could accept traffic for this game session. Think elasticsearch - each node can answer searches for any index in the cluster.
If you shard your games, you just put all games with an even ID on box 1 and all games with an odd ID on box 2. If box 1 dies, all games on that box disappear. And then you usually end up with a lobby server as an initial connection and redirect to the actual game server in the background.
This is a very simple architecture. It's easy to develop for this architecture, because you don't need to worry about complex clustering issues - exactly 1 client talks to exactly 1 server and it doesn't matter if there's 50 other servers answering other clients.
This is also very nice to scale. Most server side code for games tends to have very predictable resource consumption, because it's running a pretty predictable simulation loop on a pretty predictable and bounded data set. Especially from this perspective, I can see why the Epic guys are bugged. It's not pretty to put a factor of 2 into those calculations.
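To make the sharding idea concrete, a toy sketch (the box names and pick_box are made up; the point is only that a game ID maps to exactly one box):

    /* Toy sketch of ID-based sharding: every game session lives on exactly
       one box, so a lobby service can just redirect clients to the result
       of pick_box() and each game server never needs to coordinate with
       the others. */
    #include <stdint.h>

    static const char *boxes[] = { "game-1.example.net", "game-2.example.net" };
    #define NBOXES (sizeof(boxes) / sizeof(boxes[0]))

    const char *pick_box(uint64_t game_id)
    {
        return boxes[game_id % NBOXES];   /* e.g. even IDs on one box, odd on the other */
    }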
Renting beefier servers may also be an option if the performance impact scales linearly with pre-patch performance. However it's not clear in this case how that works.
Depends if the bottleneck is a single CPU or whether it is a single machine. Those beefier machines are usually slower per CPU.
I've seen some crazy game server designs before in the quest to have fast response times. But UE4 is pretty professional so even if they are tied to a single machine, they probably are not tied to a single core for some of their critical algorithms I would hope.
> They are benchmarks, just not microbenchmarks. Actual application performance is the gold standard for benchmarks.
This depends on your intended purpose of the benchmark. If you want to evaluate the performance of a whole system, sure. But one often does benchmarks to find parts of the system that are critical to performance, so that you can do further optimizations or make changes in the software architecture to avoid these performance hotspots. For this case, other benchmarks that allow an easy, fine-grained interpretation of the data are surely better.
Our Node.js, MongoDB, Python servers all with significant network traffic didn't have any measurable impact after KPTI patches on Amazon Linux on T2.medium(burst), M4.large, T2.large(burst) respectively.
IIRC, XEN said that 32bit HVM VMs are not affected by Meltdown, and so probably don't get impacted by AWS' patches. They still require Linux kernel updates to protect the kernel space, so changes might still be seen there.
Yes, but XEN said 64bit PV aren't affected either because they already run in KPTI like environment. So I assume 64bit HVM like ours aren't impacted for our work load.
64-bit PV is unaffected and won't suffer a performance penalty (more precisely, it was already suffering it!!!), hence my original question, but a 64-bit PV guest can use Meltdown to attack the hypervisor. The fix is to update Xen, though I am not sure if fixes are already publicly available.
Actually, there is not any measurable difference. Our architecture is completely state-less and when compared with equivalent load there's no difference at all. I guess, default network throughput bottle-neck itself is higher than the random memory cache bottle-neck in my case. There was no impact on the latency either.
Fearing an automatic update by AWS resulting in performance issues, we rushed to update all our servers, and gladly there wasn't any performance impact.
I don't think everyone has been as fortunate; there are still a lot of variables regarding the performance impact of KPTI. This user running a PHP server mentions a 50% performance hit - https://twitter.com/timgostony/status/948682862844248065 - and of course the OP issue is being covered here.
Thanks - this is what's worrying me: the result of patching seems to be all over the map, and it's somewhat hard to predict. My biggest worry is SQL Server, so our plan is to fix a performance issue we isolated recently before we patch - I'd like that system running optimally before we take the hit.
Judging from our MongoDB instance, I'm optimistic that your SQL server wouldn't face much impact either. Even Redhat tests put the risk at modest.
But for something like Redis, the case could be different.
If you are on a cloud provider, do check that automatic update is not scheduled; if so it's better to update manually to follow up with mitigation if necessary.
This is almost entirely subjective and workload dependent. We know that the biggest performance factor in both of these patches (Spectre, Meltdown) is the kernel boundary, so saying something along the lines of "Redis is more affected than Mongo" is.. odd.
MongoDB (I HOPE) will save files more frequently than Redis, since Redis is primarily an in-memory DB with some lazy IO access (depending on configuration). So if you have a lot of small network calls to Redis (each of which is a kernel boundary event), then it will be more impacted than a MongoDB which takes only a few very large requests. But my gut is that the opposite would be true for the same workloads on both: MongoDB needs its files to return data, Redis doesn't. Even the vfs cache is in kernel "space", so even if your entire data set sat in the vfs cache it wouldn't help.
But that's my gut, and your exact usage may not mirror this scenario 1:1.
I would agree that so far there doesn't appear to be any measurable difference for a typical web application. We have some heavy postgres data crunching jobs that might be affected but based on the examination I've done so far I'm inclined to say those weren't affected either.
Exactly. In my company's SaaS monitoring, I excluded db/external calls and saw a 1-2ms increase (on top of 15ms), so yeah, that's roughly a 10% increase, but external calls make the overall transaction 50+ms, so 1-2ms is trivial.
Similarly, I can't really see that the CPU on our Azure cloud box holding a half dozen dockerized Java apps went up at all. Hovering around 10% CPU usage before and after the patch.
More specifically, the Meltdown patch primarily affects memory-bound applications, not necessarily computation, although those usually overlap for CPU-bound applications.
It's not necessarily helpful, since "Python" doesn't indicate 'single-threaded' or 'multi-threaded', and whether or not your threading is constrained on outside resources (MongoDB) or purely internal worker calculations, and etc. But it's interesting!
We're using KVM on OpenStack. We use a number of the services and PostgreSQL is our main database platform with our main backend platform being C++ based.
I have performance trending going back 1 year and honestly other than Postgres, the performance #s are like rounding errors. PostgreSQL is getting about a 7% hit on our databases that don't fit in memory.
We also run SAP and again we're not seeing anything. The SAP Database runs on POWER Servers, so obviously unaffected at least by Meltdown.
In our case, there was a major kernel version change, so we're just at the tail end of QC before I can release things to production to begin the rolling restarts tomorrow morning.
I know they don't have to share the details, but the "patched" part is not really clear. Did they update to a new image / more recent kernel / anything else? Much like the redis post linked in HN before, we don't know if the impact is because of the "pti turned off/on" change, or are there more moving parts involved.
Since they're using default provisioned EC2 instances, it's likely that the developers don't necessarily even fully understand their performance degradation. They just expect the service that they pay for to work properly.
It's true, but it's not what I meant. They wrote "after a host was patched". This is ambiguous. Do they mean the host as in instance, or host as in AWS host machine? Did they just reboot to get onto the new/updated VM host, or did they rebuild to include the PTI fixes as well? Did they upgrade anything, or did everything else stay on the same version?
From what I can tell, Amazon isn’t giving people deep technical information. They’ll just send you an email telling you which instances are being forced to restart and when.
Is this a Program -> CPU interaction that is slowing things down, or is this a CPU -> network interaction.
I'm wondering if there are classes of network drivers that are having a much larger effect on performance. Network cards these days can do many things to improve performance like TCP/UDP offloading, and because of that their drivers are very complex, and I'm going to assume that there will be Meltdown fallout because of this.
The Meltdown attack requires an attacker to have a piece of code executed on your server. Epic's servers are used for login, where people send you data, and for game logic, where people also just send you data like "player x moved his avatar here, player y shoots etc". If all the server does is execute the code which Epic wrote themselves and already trust, why would it need to apply the Meltdown patch?
Security is about risk mitigation. You cannot derisk entirely, so you make tradeoffs. Without knowing all of the parameters, it’s disingenuous to say it’s a horrible approach to security.
The most secure computer is powered off, enclosed in concrete 6 feet below the surface of the earth. It is not very useful though.
Very much this. In the real world, basically no one will make their house entirely bulletproof and never go outside just because of the possibility that someone could shoot them. Everyone knows that could happen, but is fine with the risk.
"100% secure" is unattainable, and to attempt to achieve it to the exclusion of all other concerns is probably a very bad idea.
Reminds me of Battlestar Galactica. The humans were so (wisely) fearful of the Cylons that their ship computers were not networked in any way. For their risk/reward trade-off curve, the benefit from having computer networks was not worth the risk of the Cylons being able to compromise the entire ship. All communication was either verbal or done via fax printed to paper, and so had to go through human intermediaries.
If you do not apply Spectre/Meltdown mitigation on non-virtualized hardware which only you control, you are safe. Of course a malicious process could do harm without them, but a malicious process can probably be just as harmful even if you have the patches applied.
Basically we need these patches because a ton of stuff on the internet lives on virtualized hardware, and there you can actually be harmful to other VMs or even the host itself.
Basically Spectre/Meltdown mitigations need to be applied in hosted environments and on clients.
On everything else it's way less scary than in those two environments.
Then you didn't listen to DJB like 15 years ago. Moreover, your processes shouldn't be running under the same username. Principle of least privilege, people.
Please just install a Linux distribution and run ps aux. A ton of stuff just runs as root, either because there is no other way or because it's a bad default.
Yeah it’s your job as a Unix dev or admin to not stick with crummy but convenient defaults, kid. Nothing exposed to the web or touching anything exposed to the web should be running as root. Each thing should be running under its own, very limited, user and group. If something gets hacked, you want it to be as least privileged as possible.
We know how Unix works. We’re telling you because you clearly are out of your depth. If you don’t understand that you have no business maintaining anything on the internet.
Some things are just not applicable. There's no point adding another padlock to the outside wall of Fort Knox. If untrusted users don't send instructions to the same physical CPU that you're using, Meltdown/Spectre are not relevant.
I also think that we need more information before we really see the long-term performance impact here. First, the patches will likely be optimized over time. Second, I am not sure that PTI needs to be run within guest kernels if the host already has the mitigation (effectively creating a double-hit from PTI in both the guest and host kernel), or that even if PTI is needed in both kernels now, that it will be needed long-term. We also need to figure out what type of Spectre mitigations need to be performed within the guest if the host system has the new microcode that selectively disables branch prediction.
The short answer is that it's going to take another 4-6 weeks before even the kernel devs have a firm grasp of what's useful/necessary to mitigate this and for things to settle into place, as Greg K-H blogged about earlier. At that point, we should have a more realistic perspective on the long-term performance impact.
In the past, if you had an RCE that only got user-level access and that user was very locked down, there was only a limited amount of damage they could have done. Now with Meltdown, every RCE is a full, system-level information-disclosure exploit.
This is the correct, non-snarky answer. It's a known privilege escalation attack, turns a bad day (someone got a non-root account on my server) into a worse one (root account).
That wasn't snarky because it was only mildly critical. Since it was mild, it also wasn't cutting. Lastly, it was not derogatory or mocking in an indirect way and so it was not snide either.
I disagree. Aside from the sarcastic "buffer overflow" comment, his comment was far more snide than any of the replies he seemed to be complaining about.
Since you take a different view though, as I asked in my post, if it's not "snarky" then what was it?
> if what he said wasn't itself "snarky", what was it?
A complaint that Hacker News is a less agreeable place than it used to be. Whether that's true or not, don't you think you might find a more effective way of counter-arguing than by being as disagreeable as you can possibly manage without outright swearing in your responses?
I get that you're sure that you're right. That's great! That's a source of strong motivation to pursue an argument. But it suffices really nothing just to say "I'm right", and much less just to say "you're wrong". If you're interested in convincing people that the argument you're advancing is a more accurate model of reality than those with which it's contending, then you have to do just that - convince people, by answering their points with compelling counterarguments of your own.
And if you want to shape the discussion such that it's possible for you to convince anyone of anything, then you must above all treat your interlocutors with impeccable respect, no matter how wrong you may think - or know! - they are. To do otherwise only hardens them against whatever you may have to say, and the outcome you thus produce is even less favorable than that of simply declining to engage in the first place. Conversely, treating those around you with respect tends very strongly to elicit respect toward you from them, in turn. That's how you earn the fair hearing you need to make arguments that might convince, and thus give yourself a place to start.
Perhaps that sounds like a normative, rather than a positive, statement. I've seen people react badly in the past, and sling accusations of "tone policing" and all manner of other offenses against some apparently very abstract conception of discursive mores which may hold sway in occasional quarters but is very far from predominating. I'm not telling you how you should behave in the sort of discourse where "tone policing" is a meaningful phrase - indeed I'm not telling you how you should behave at all. What I'm telling you is that, whatever places you may have been where you found unstinting contempt for your interlocutors to serve your turn, none of those places is here.
That's why hectoring people, as I have observed you fairly consistently to do here, doesn't convince people of your points. At most it convinces them to stop trying to talk with you. Perhaps that's the result you're trying to produce, in which case keep doing what you're doing! But if you're not trying to develop for yourself a reputation here of being someone best avoided, then you may wish to revise your style of argumentation somewhat.
Now before you get too cross with me saying this as I have, consider: We spoke briefly the other day on the subject of complaining about downvotes and why it is not helpful. You seem genuinely upset to accrue them so easily, and I treated that dismay perhaps more cavalierly than was justified. I'm sorry for that; to try to make up for it, I thought I'd explain why it is they keep happening and what you can do to change that.
You clearly expect respectful engagement from others, and there's nothing wrong with that. But when you refuse to engage respectfully with others, who quite reasonably expect the same, it isn't really a surprise when people choose to answer that disrespect by downvoting and moving on, rather than by eliciting your further contempt through an attempt to engage with you.
The good news is that you have in your hands the power to change this state of affairs! The style of your discourse is entirely within your control. Show respect to those around you, and you'll find those around you show you respect in return. You can totally do that! You can totally make that happen. I hope you'll choose to do so.
> A complaint that Hacker News is a less agreeable place than it used to be.
Which makes the place no more agreeable.
> Whether that's true or not... <huge snip>
This is amazing.
I ask a question, and the only "answer" is simply an excuse to segue into a weird, patronizing, passive-aggressive wall of text that has zero to do with what I actually posted, and is nothing but a personal attack dressed up as a lecture on politeness. Amazing.
I guess you don't reach 10k karma on a throwaway by not having a gallery to play to.
If you're talking about a game server like this, though, where you don't have sandboxes running client-originated code, just processes running your own software, any remote code execution vulnerability is already going to be in a position to do incredible damage. There are almost certainly going to be other privilege escalations available to you once you've buffer-overflowed or whatever to dupe the server into running code you control in process - and honestly, once you own the main process on a server like this you can probably do enough damage from in there without even needing to escalate.
I think the frightening thing about meltdown/spectre is the ability for code to escape otherwise solid sandboxes - including VMs, as well as javascript runtimes. It's a change in the threat landscape for systems which expect to run remote code but believe they can do so safely.
> code executed on your server. Epic's servers are used for login, where people send you data, and for game logic, where people also just send you data like "player x moved his avatar here, player y shoots etc".
Have you ever heard of the term 'buffer overflow'?
I assume their servers will be running untrusted Unreal Engine games, maybe even community-developed mods, some of which may have attacks in the script of the game.
Perhaps they don't, but eventually the patches become part of the mainline kernels, and then they will run with the patch applied.
Regardless of what some of my customers believe, patches are rarely hand-picked and then applied. All security patches are applied, always, kernel patches even more so. They are rolled out with the tools provided by the operating system, be it Windows, Redhat, Ubuntu, OpenBSD, Solaris or whatever, and those tools don't make assumptions about your use case; they just apply all security patches.
Your argument also applies to other circumstances where the computer only runs trusted code. For example, a fully FLOSS software stack on a home computer with JavaScript disabled in the browser.
Whilst I sympathise with your frustration, at least with Intel, I can't help feeling like you might be storing up bigger problems for yourself with this course of action.
I get what you are saying, but right now there are no known exploits, and with so much patching happening, will there ever be? These things are always overblown in the media and the reality is that very little damage happens to the average user.
It’s servers perhaps that are most at risk.
Also, I don't do much on my Windows machine, mostly gaming; I run an iMac and boot into Windows only for certain tasks, so it's extremely unlikely I will ever have an issue.
Any performance hit, even if small is just not worth updating for to me.
What do you mean "no known exploits"? The authors of the paper have an exploit that reads arbitrary system memory from a browser. And even after the meltdown patches, spectre "fixes" we have seen are only partial mitigation, and still potentially allow reading of passwords and third party cookies. But I guess if you want to wait til it's too late...
Maybe off-topic: Is formal verification viable anywhere in CPU logic design? Also, could any existing "CPU static analyzers" have caught the issue that caused Meltdown?
Edit: It looks like the answer to the first is a definite yes.
You can only verify properties you've thought of, and no-one conceived of this particular 'feature' causing issues like this until now. So I don't think formal verification would have helped: if anyone was in a position to realise the issue was worth verifying, they'd have been able to raise it without formal verification too.
Presumably of isolation features. I'm sure everyone knew what side channel attacks were and knew that formal verification could've helped find bugs in isolation. They just chose not to do it, because it's only important for high-assurance, not for your regular insecure linux/bsd/windows. And I'm sure they are not going to bother with verification even after everything.
>You can only verify properties you've thought of, and no-one conceived of this particular 'feature' causing issues like this until now.
I've read for years about supposed insecurities with branch prediction -- it just wasn't shown practically. To say that it wasn't conceived of is a little off.
Spitballing, but you'd have to prove that either no side channels exist, or that the information leaked by side channel(s) is indistinguishable from random noise, which I believe is an area of ongoing research.
CPU design is in a large part an exercise in formal verification. On the other hand it does not help you much when the bug in question is in the specification that you verify against in the first place or, as is almost certainly this case, is based on some behavior that is not specified by the specification at all.
In my mind Meltdown is a failure of implementation due to the choice to violate the logical design. If the logic of your design says "do not access ringX memory if you are not executing in ringX" then don't do it. Intel instead decided that they would access it but thought they could hide the results of that access from the process by never providing the read results. The problem is that actions always leave evidence in the physical world, simply assuming that no one could detect those actions (which was true at the time) is not the same as not committing the action. The benefit for Intel in violating the logical design rule was speed (and probably a reduction in gates) and the optimistic view that there were no consequences to the violation. Unfortunately hubris invited Nemesis to the party, as usual.
Spectre is a different problem in that the very design of speculative execution contains the seeds of its flaws so there really is no way to implement it in such a way as to avoid the intra-process snooping although the inter-process exploit may be avoidable via implementation.
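To illustrate the "evidence in the physical world" part: the published attacks recover the speculatively loaded value through a cache-timing probe along the lines of the sketch below (FLUSH+RELOAD style). The threshold is an assumption that has to be calibrated per machine, and this snippet only demonstrates the timing side channel itself, not an exploit:

    /* Minimal cache-timing probe sketch: distinguish a cached from an
       uncached line by access latency. This is the measurement step that
       Meltdown/Spectre use to observe which line a speculative access
       touched. THRESHOLD is machine-dependent and assumed here. */
    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp */

    #define THRESHOLD 80     /* cycles; calibrate on your own machine */

    static uint8_t probe_line[64];

    static uint64_t time_access(volatile uint8_t *addr)
    {
        unsigned int aux;
        uint64_t start = __rdtscp(&aux);
        (void)*addr;                      /* load the cache line */
        uint64_t end = __rdtscp(&aux);
        return end - start;
    }

    int main(void)
    {
        _mm_clflush(probe_line);          /* evict the line from the cache */
        _mm_mfence();
        uint64_t cold = time_access(probe_line);
        uint64_t warm = time_access(probe_line);  /* now it is cached */

        printf("cold: %llu cycles, warm: %llu cycles\n",
               (unsigned long long)cold, (unsigned long long)warm);
        printf("cold access %s the threshold, so the eviction is observable\n",
               cold > THRESHOLD ? "exceeds" : "is under");
        return 0;
    }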
Yea I hated that too. They put up a graph but with no key! Arg! The article said they updated one host, so I'm assuming the spike is from that one VM that got patched.
We wanted to provide a bit more context for the most recent login issues and service instability. All of our cloud services are affected by updates required to mitigate the Meltdown vulnerability. We heavily rely on cloud services to run our back-end and we may experience further service issues due to ongoing updates.
Here is a link to an article[1] which describes the issue in depth.
The following chart shows the significant impact on CPU usage of one of our back-end services after a host was patched to address the Meltdown vulnerability.
[the image]
Unexpected issues may occur with our services over the next week as the cloud services we use are updated. We are working with our cloud service providers to prevent further issues and will do everything we can to mitigate and resolve any issues that arise as quickly as possible. Thank you all for understanding. Follow our twitter @FortniteGame for any future updates regarding this issue.
Epic suggests following security best practices by always staying up to date with latest patches.
General Recommendations for Computer Security[2]
We will continue to update this thread with similar information as it comes to us.
This surprised me a lot. I thought 30-50% would be the worst case, and then only with additive effects from both the Spectre and Meltdown fixes. Not sure how it can be this bad. I imagine it could get even worse if you are running in a virtualized environment on top, where the server is affected in turn, but I figure that wouldn't show in a CPU graph like this.
The kernel address space isolation makes syscalls much more expensive than they were. A toy program that just repeatedly calls the cheapest syscall in the kernel would lose much more than half its speed, but that's not what's reported because no-one actually needs to run that workload. Epic seems to have been particularly unlucky in how KPTI impacts them.
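For anyone who wants to reproduce that kind of toy measurement, a rough sketch (the iteration count and the choice of getpid are arbitrary; the interesting part is comparing the per-call cost on a patched vs. an unpatched kernel):

    /* Toy microbenchmark sketch: time a large number of the cheapest
       syscalls to estimate the per-call overhead. Run on a patched vs.
       an unpatched kernel to see the extra cost KPTI adds to every
       kernel entry. Numbers here are illustrative only. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        const long iterations = 10 * 1000 * 1000;
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < iterations; i++)
            syscall(SYS_getpid);          /* raw syscall, avoids any libc caching */
        clock_gettime(CLOCK_MONOTONIC, &end);

        double elapsed = (end.tv_sec - start.tv_sec)
                       + (end.tv_nsec - start.tv_nsec) / 1e9;
        printf("%.1f ns per syscall\n", elapsed * 1e9 / iterations);
        return 0;
    }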
(Because making NUMA aware C++ code is hard, and AMD Epyc is a single socket on a server with 4 very closely knit NUMA zones so non-NUMA code will run better on that vs Intel)
But unfortunately there's no commodity server from HP/Dell available yet. But I hear one is on the way on the Dell side.
HP has paper launched the DL385 G10 with EPYC, I'm not sure if it's fully available through sales channels or not. Supermicro also has had EPYC systems available for a while - though that won't do you any good at all if you need a big name OEM for "nobody got fired for buying Cisco" reasons.
Doesn't stop people from filing one. They already have, in fact. I agree that it shouldn't get anywhere, but I'm not as sanguine about whether or not it actually will.
It doesn't have to be gross incompetence. If you pay for something and it doesn't deliver as promised, you may be entitled to a partial or full refund.
If you know that your product is flawed, do you keep selling it or will you pull it from the shelves?
To this date I can still buy broken Intel CPUs....
They knew about the flaw in June. Yet they still kept selling Coffee Lake CPUs.
If my 8700K weren't still significantly faster than AMD Ryzen (or if I wouldn't also have to return the MB), I would have switched to AMD in a heartbeat.
To be fair to Intel, mitigating this type of issue isn't at all trivial. If it were, it could have been fixed in a microcode update. You can't rush a chip design for something as complicated as x86_64. There are long multi-year development cycles and tons of regression tests.
With this they need to add even more tests before they can start on attacking the issues with the design.
> To be fair to Intel, mitigating this type of issue isn't at all trivial.
It's understandable that they may not have been able to mitigate it in the 8th gen CPUs in just a few months, but they also put those CPUs out on the market, advertised them, and sold them, all the while they knew of their design flaw, without saying anything about it to the unsuspecting customer.
Speculatively loading data across a protection boundary, which is what happens in Meltdown, can be argued to be incompetence or sneakiness. It certainly helps with benchmarks.
Moreover, it would be hard to say "everyone is doing it", because so far, besides the latest ARM processors, it seems most of the other CPU architectures don't do it.
I'd go further: anybody who is concerned with application performance may soon see/smell their hair smoldering.
We write a lot of code that needs to run as fast as possible (processing, post-processing, generating real-time weather/satellite data) and I’m concerned about time windows and whether we’ll be able to meet requirements.
Meltdown is aptly named, as that is what's gonna happen to the global corporate services market, "efficiently" hosted in the cloud and scraping for profitability.
I suppose these guys just don't apply the patch. They are not running in a public cloud with potentially hostile neighboring VMs; they run their own trusted code. They might choose to tighten perimeter security instead, in the short term.
Data processing where number crunching is the key problem may not be affected much. Data processing where fetching and assembling the data from many sources is likely to see larger slowdowns. Pure communications workloads are also likely to see larger slowdowns.
For pure computational cases Meltdown vulnerability remediation might not impact you as much. Meltdown fixes would affect applications that do a lot of syscalls - lots of socket IO, IPC stuff, small and frequent disk IO ops etc.
You'll probably see a wide variety of benchmark results anything from no impact to things like above. So just make sure you measure carefully with your own workload.
Intel's PR dept is in overdrive, but the truth about this vulnerability is that it's essentially worst-case.
It really only affects workloads where high performance is important. The average user might not see an impact but if you need fast IO God help you. The solution is to 'make less syscalls' but the problem is that syscalls have always been slow and the people making a lot of them are only doing so because they absolutely have to
Isn't that like saying that it turns out this car we sold can only do 150, not 200. But it's okay because most of you never drive above 80 anyways.
You're right. Most people who just surf the Netflix and download the YouTubes will not notice. But it's still a form of fraud, even to those who never max out CPU. I think fraud is a strong word knowing that this wasn't intentional. But they sold a lesser product than they advertised and need to make customers whole. Otherwise it pretty much is fraud.
Is that remotely feasible? You're basically suggesting that Intel needs to refund or replace every PC and server CPU they've sold in the last five years.
The fdiv recall cost Intel almost half a billion dollars, and that was for a small subset of processors that most people didn't replace. Intel has a lot of assets, but replacing five years worth of CPUs might actually bankrupt them.
I deleted that because I thought it was too flamey after reading it again, but yeah refunding everyone is unrealistic.
Maybe just average out the performance impact across large cloud providers and offer that as a percentage? It probably wouldn't be hard for a company like Google to crunch metrics before/after the update and give a number for how much their performance has been affected
Yeah, I think it's clear that some amount of risk has to be absorbed by the public. It sucks, but it's in our best interest to keep pushing computing forward. And we should keep in mind that it's also not like these attacks are obvious. They took security researchers four years to find.
I think the important thing here is precedent, rather than making customers whole. The question then is what, if anything, could Intel have done to anticipate and prevent this, and how do we incentivize them to take those measures in the future?
It's also possible that there's nothing to change here. There's always going to be some risk, and trying to force that risk closer and closer to zero will at some point not be worth the tradeoff. As bad as this is, if the caution necessary to prevent it would have resulted in processors being half as fast as they are today anyway, there wouldn't be any point.
I think people are underestimating how bad webapps are. I do... a lot of computing and by far the most CPU intensive applications on my computer are webviews. Either Gmail in chrome or slack. I'm actually going to be upgrading my laptop soon because slack+3 organizations crushes my laptop.
Do you think browser vendors have been pushing JIT research forward, investing in WebAssembly, and building things like Servo because websites are so fast and light on CPU?
I wonder if we'll start seeing more userspace storage drivers because of this.
If anything, for big shops like this one, I wouldn't be surprised if we saw a move away from hosted providers (the "cloud" .. god I still hate it when people use that word), and return to co-located setups.
At least for people with immediate performance needs, AMD might make a lot of short-term sales right now.
I've been so focused on the tech that I forgot about the legal side of this. Is this why they've all given wishy-washy responses that take no responsibility and seemingly don't even admit there's a problem? Sounds like they expect massive legal recourse and have no real choice but to listen to their lawyers, who are telling them to admit no guilt.
Yes. They're facing potentially ruinous class actions from individuals, and the legal fury of uncounted corporations which are going to suffer. It's hard to imagine Intel dying, but if the impact on performance is enormous enough, it could happen. So yes, they are in ass-covering survival mode. Having said that, I wouldn't expect Intel to go down in legal flames, but the current PR drip is going to have to become a flood, for a long time, to recover.
Exactly why. Even if the performance impact is 3%, that means the world has lost maybe 2% of its CPU power overnight (not every machine runs an affected Intel part). That's an utterly massive amount of hardware, many billions of dollars, maybe a trillion.
It's not just the cost of the processors, but everything that contains them. Truly 'fixing' all the affected products would bankrupt a country, but we should get a refund for the CPUs at the very least.
If no one saw it coming, why is it all over the papers that people have suddenly been posting here in the last few days? It seems that in the world of high-assurance security this was largely assumed, but not provable due to proprietary walls.
If you bought a car that needed to go 60mph and a subsequent update to the car from the manufacturer for safety meant it could only go 30mph, there would be legal consequences.
Of course you can't predict every contingency, but some of them you pay for.
That would be new for a company, and it's certainly not indicated by their PR spin right now.
What I think happened: a company produced a product with a problem. Probably not out of malice, but ignorance.
One of two things happened afterwards, either of which kills the 'good faith' argument: the problem was found internally and hushed up, or the problem was found externally and minimized to reduce the financial burden of fixing it and the related PR fallout.
We have no way of knowing how well it was known internally, but we can all see the PR going on from Intel right now, and I hope I'm not the only one who reads those press releases to establish intent.
Well, this type of attack has been theoretical for years. The Project Zero write-up references some papers from the mid-2000s that talked about it. But the implementation, even today, isn't exactly trivial.
Modern processors are insanely complex systems: branch prediction, out-of-order execution, hardware virtual memory management, hardware virtualization, etc. Not to mention that these are side-channel attacks. It's not a direct vulnerability; it requires executing some code and measuring timing very precisely, more like taking an oscilloscope to a very expensive safe (there's a sketch of the timing trick below).
Of course Intel is going to be spinning this however they can for damage control. That's what PR departments do. I still doubt engineers at Intel really thought this attack was plausible, or else they wouldn't have been engineering chips this way for the past decade.
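To illustrate the 'measuring timing very precisely' part: the whole family of attacks bottoms out in being able to tell a cached load from an uncached one. A minimal sketch of that primitive (mine, not from the write-ups; assumes an x86 CPU with clflush/rdtscp, and compile with -O0 so the plain loads survive):

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>

    static uint64_t time_read(volatile uint8_t *p) {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);      /* timestamp before the load */
        (void)*p;                          /* the load being timed */
        return __rdtscp(&aux) - t0;
    }

    int main(void) {
        static uint8_t buf[64];

        buf[0] = 1;                        /* bring the cache line in */
        uint64_t hit = time_read(buf);

        _mm_clflush(buf);                  /* evict it */
        uint64_t miss = time_read(buf);

        printf("cached: %llu cycles, flushed: %llu cycles\n",
               (unsigned long long)hit, (unsigned long long)miss);
        return 0;
    }

On its own that's harmless; the attacks get interesting when speculative execution is tricked into leaving a secret-dependent footprint in the cache for this stopwatch to read out.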
And until we align their market incentives properly, silicon vendors are going to continue to ignore this fact when it comes to verification. Intel is especially bad here; they’ve had an unreasonable number of hardware bugs in recent years.
Intel got a report about this vulnerability from Google in July. Intel's CEO sold stock in November, under a trading plan he set up in October. Intel also pulled the Coffee Lake desktop launch in from early 2018 all the way back to the start of Q4 2017 to try to stop the momentum AMD was building with Ryzen, while knowing this vulnerability was present - and they're still planning on launching Cascade Lake in the first half of this year, and God knows whether Meltdown will be fixed in it or not.
Right now I think the problem started out of ignorance - but they have abused Google's policy of responsible disclosure to hide the flaw as long as they could and take advantage of their market position while the unknowing public kept buying their products. Now they are pulling the four D's of propaganda in their PR statements, all while we are seeing huge performance deltas in graphs from Epic and others.
This is straight up deceptive, I'm glad I switched back to AMD with my new gaming rig and I already have plans in the works to purchase multiple EPYC servers with our datacenter move starting next month.
> Things happen. It's impossible to predict every contingency.
Normally I'd agree, but not in this case. Notice that, apart from the latest ARM cores, it seems no other architecture is susceptible to Meltdown - s390x, SPARC, POWER, AMD, etc.
Speculatively loading and executing across a protection boundary is something someone should have thought twice about (see the sketch after this comment for what it feeds into). That doesn't mean other vendors knew or had PoC examples, but they could have had an instinctive hunch.
I've given this example before: say I have some sensitive data on a server. I could install a bunch of services and API endpoints on it to access it faster, inspect it, extract it in various formats, etc. Or I could decide to lock it down and install only the minimum number of things needed. That doesn't mean I knew all those additional APIs or services had vulnerabilities; it's just good practice.
If the server gets hacked and someone looks back, I think they would be justified asking "What the hell were you thinking installing all that crap you didn't need on it".
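To make the 'load across a protection boundary' point concrete, here's a toy sketch (entirely mine, not from this thread) of the cache covert channel that such a transient load feeds: a byte is encoded into which page of a probe array gets touched, then recovered purely by timing. An ordinary variable stands in for the byte that Meltdown reads from kernel memory during the transient window, so this actually runs; it demonstrates the receiving end, not the exploit. Assumes x86 with clflush/rdtscp; build with -O0 so the plain loads are not optimized away.

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>

    #define STRIDE 4096                    /* one page per value, defeats the prefetcher */
    static uint8_t probe[256 * STRIDE];

    static uint64_t time_read(volatile uint8_t *p) {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;
        return __rdtscp(&aux) - t0;
    }

    int main(void) {
        uint8_t secret = 'K';              /* stand-in for the transiently read byte */

        for (int i = 0; i < 256; i++) {
            probe[i * STRIDE] = 1;         /* back each page with real memory */
            _mm_clflush(&probe[i * STRIDE]);
        }

        (void)probe[secret * STRIDE];      /* "send": touch one secret-selected line */

        int best = -1;
        uint64_t best_time = UINT64_MAX;
        for (int i = 0; i < 256; i++) {    /* "receive": the cached line reads fastest */
            uint64_t t = time_read(&probe[i * STRIDE]);
            if (t < best_time) { best_time = t; best = i; }
        }
        printf("recovered byte: '%c'\n", best);
        return 0;
    }

In the real attack the "send" access sits after a faulting load of a kernel address and only executes transiently, but the footprint it leaves in the cache survives the squash - which is exactly the boundary violation being discussed here.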
According to their patch notes, Apple's ARM cores are also vulnerable. And it's not just the latest ARM; the Cortex-A15, A57, and A72 are vulnerable to a less severe variant of Meltdown.
I know almost nothing about law but I would assume a large company would be insured against contingencies like this.
I remember reading something about how rich people can get 'everything else' insurance that's basically applicable to anything bad that could happen to them. Maybe it's the same with companies?
You can insure almost anything, assuming you're willing to pay the premium and the insurers are satisfied with their due diligence. BUT... that insurance company will look for any technicality, any little way that you failed in your extensive duty to protect the insured assets, and try not to pay out.
So really, in this case 'insurance' is just another word for 'more lawyers to drag through the courts for years, with uncertain results.'
I think CPU designers are getting too much flak for this. The side effects of speculative execution have been around ever since its introduction. A compounding problem is how abstract software has become: it makes assumptions about the underlying execution model without a deeper understanding of its subtleties (sandboxing, containers, etc. trying to guarantee process isolation). It's the Law of Leaky Abstractions [0] at work. What is alarming is how long it has taken us to collectively realize this is a problem. Hats off to the folks who uncovered this blind spot for us.
But back then, Intel could not deny that the FDIV bug was an actual bug in the CPU. Intel's press release on Meltdown is carefully worded to avoid admitting it's a bug in the CPU.
But since the USA has the highest lawyer-per-citizen ratio in the world, it would be very interesting to see what happened if, say, Google sued Intel because, from one day to the next, Google's electricity bills rise steeply, their servers crash because the AC cannot handle the extra heat, ..., and Google demanded Intel pay for the damages.
For those who have looked at the patches in detail: do they treat older-generation CPUs differently from 7th gen? The authors of the KAISER paper expected much worse performance on older CPUs due to implementation issues (reportedly the lack of PCID support, which makes the extra page-table switches more expensive)...
Unfortunately, Epic Games don't provide any details about their CPUs AFAICT.
If people still used good old owned hardware without any virtualization, Spectre and Meltdown would not be as scary as they are on server hardware; essentially only clients would be affected.
But since everything runs in the cloud, we basically need to update the whole world.
First off, having dedicated hardware doesn't make these a non-issue, it just makes them less of an issue. Even with dedicated hardware they are a concern: applications still need to be isolated from each other, and this is a path for someone who gets in as a user to potentially escalate to root.
Second, saying everyone should have dedicated hardware is just nonsense. It's like saying everyone should run their own datacenter. For most people neither of those makes financial sense.
Well, 50:50 - chances are high that if a process goes rogue, you have bigger problems than just "memory is readable from every process".
(Of course that does not apply to clients, where code can run in a JIT (JavaScript), or to software that talks to the internet and ends up running code from a remote machine, etc.)
Most servers probably should only run trusted code (of course that is almost never the case, because no company evaluates every process it runs), but chances are high that if someone uses Linux and GNU software, most shady stuff gets caught. (If not, some people could do a lot of bad stuff - consider a misbehaving systemd, nginx/Apache, or database, which could basically do a lot of harm.)
Looking at the graph: probably. It was already increasing (from 16-19%, with one outlier at 27%, over the first 8 datapoints) to around 25-30% (the last 6 before the big jump). Then there is a big jump to 62-85% (N=20), with most around 75% (N≈13). The last stretch is around 60-72% (N=12), and the very last datapoint is below 40% again, but that one is half outside the graph (cherry-picking datapoints?).
So to summarize: from roughly 25% to roughly 65%, or about 2.6× the original.
Note that this is all from eyeballing that graph and estimating what percentage each datapoint is at, since the scale has an interval of 20 percentage points and no minor gridlines or anything.