Technically, you pay for both time (hours on) and CPU usage (instance tier). It's not like different instance tiers (at least within the same class) use fundamentally more or less powerful processors. They all use the same processors; you just get more or fewer of them depending on what you pay.
Conceptually it is "pay as you use" by CPU usage, but just rounded into buckets by instance tier.
Of course, there's a lot of underutilization within each bucket, because the granularity isn't per 1% used, but (more or less) per 100% used (aka each core). And also, most applications can't switch instance tiers easily to adapt to demand (though some certainly can).
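As a toy illustration of that rounding (the per-core price here is made up, not a real rate):

```python
# Toy illustration: instance tiers round CPU demand up to the next
# whole-core bucket, so you pay for headroom you may never use.
# The per-core price is invented for the example.
import math

PRICE_PER_CORE_HOUR = 0.05  # hypothetical $/core-hour

def billed_cost(cores_used, hours):
    # You can't buy 2.3 cores; you're billed for the next tier up.
    return math.ceil(cores_used) * PRICE_PER_CORE_HOUR * hours

used = 2.3
print(billed_cost(used, 720))  # 108.0 (3 cores * $0.05 * 720 h)
print("bucket utilization: %.0f%%" % (100 * used / math.ceil(used)))  # 77%
```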
That's true; however, a lot of spot usage ends up being heavily diversified across instance types in order to avoid momentary supply issues and optimize cost.
Across different families, CPU performance can vary by a decent amount.
> T4g instances start in unlimited mode by default, giving users the ability to sustain high CPU performance over any desired time frame while keeping cost as low as possible. For most general purpose workloads, T4g instances in unlimited mode provide ample performance without any additional charges. If the average CPU utilization of a T4g instance is lower than the baseline over a 24-hour period, the hourly instance price automatically covers all interim spikes in usage. In the cases when a T4g instance needs to run at higher CPU utilization for a prolonged period, it can do so for a small additional charge of $0.04 per vCPU-hour. In standard mode, a T4g instance can burst until it uses up all of its earned credits. For more details on T4g credits, please see the EC2 documentation page.
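A rough sketch of how that unlimited-mode surplus charge works out, based on the rule in the quote (the 20% baseline and the usage numbers are assumptions for illustration):

```python
# Sketch of the unlimited-mode rule in the quote: if average CPU over
# a 24-hour window stays at or below the baseline, the hourly instance
# price covers all spikes; sustained usage above baseline is billed at
# $0.04 per vCPU-hour. The 20% baseline is an assumption for illustration.
SURPLUS_RATE = 0.04  # $/vCPU-hour, from the quote
BASELINE = 0.20      # assumed baseline utilization per vCPU

def surplus_charge(hourly_util, vcpus=2):
    """hourly_util: 24 per-vCPU utilization samples in [0, 1]."""
    avg = sum(hourly_util) / float(len(hourly_util))
    if avg <= BASELINE:
        return 0.0  # interim spikes are covered by the instance price
    extra_vcpu_hours = (avg - BASELINE) * len(hourly_util) * vcpus
    return extra_vcpu_hours * SURPLUS_RATE

# A day averaging 30% CPU on a 2-vCPU instance:
print(round(surplus_charge([0.30] * 24), 2))  # 0.19 -> ~$0.19 extra
```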
I'm on the App Engine team, and I just wanted to clarify one thing: the main difference between CPU hours and instance hours is that CPU hours are charged based on CPU usage, while instance hours are based on wallclock time. The high ratio between the two that you can see with PlusFeed is because it's spending a lot of time serving each request, most of which is spent doing nothing - likely because it's making outgoing HTTP requests.
Previously, we had no way to account for apps like this, that take a lot of wallclock time but very little CPU time, and as a result we couldn't scale them well. Under the new model, the charges reflect the real cost here - memory pressure. Every second an instance sits around waiting is a second that the memory occupied by that instance can't be used to serve other requests.
As others have pointed out, we're in the process of launching Python 2.7 support - it's currently in Trusted Tester phase - which will support multiple concurrent requests, and services like PlusFeed are likely to be able to take great advantage of that, reducing their instance hours by a large factor. Likewise, doing asynchronous URLFetches (where that's practical) can cut a huge amount off instance time.
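For anyone curious, the asynchronous URLFetch pattern on the Python runtime looks roughly like this (the feed URLs and the process() handler are placeholders):

```python
# Rough sketch of asynchronous URLFetch on the Python runtime:
# start every fetch first, then collect results, so the instance
# waits once for the slowest fetch rather than once per URL.
# The feed URLs and process() handler are placeholders.
from google.appengine.api import urlfetch

URLS = [
    "http://example.com/feed1",
    "http://example.com/feed2",
]

def fetch_all():
    # Kick off every fetch before blocking on any of them.
    rpcs = []
    for url in URLS:
        rpc = urlfetch.create_rpc(deadline=10)
        urlfetch.make_fetch_call(rpc, url)
        rpcs.append(rpc)

    # Collect results; total wall time ~= the slowest fetch.
    for rpc in rpcs:
        result = rpc.get_result()
        if result.status_code == 200:
            process(result.content)  # hypothetical per-feed handler
```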
That's also not exactly true. If you purchase a reserved instance, you pay the up-front price, and then the reduced hourly price for whenever the instance is running.
For the other instance types, you get a specific number of units of processing capacity, and you can use 100% of it continuously if you like. For the micro instances, you get a base level and build up credits toward bursts, and cannot maintain 100% utilization continuously. It's very much different, and not the default. To quote Amazon:
> "A CPU Credit provides the performance of a full CPU core for one minute. Traditional Amazon EC2 instance types provide fixed performance, while T2 instances provide a baseline level of CPU performance with the ability to burst above that baseline level. The baseline performance and ability to burst are governed by CPU credits."
A t2.micro's baseline is only 10% of a vCPU's performance. Anything above that needs to be "earned" at a rate of 6 credits per hour. The t2.micro can accumulate a maximum of 144 CPU credits (plus the 30 initial launch credits, which do not renew), each good for 1 minute of 100% use.
So in other words, you can on average only use 100% of the CPU for 6 minutes per hour.
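The arithmetic behind that, as a quick sanity check:

```python
# Sanity check on the t2.micro numbers above.
CREDITS_PER_HOUR = 6   # 1 credit = 1 minute of a full core
MAX_BANKED = 144       # excludes the 30 non-renewing launch credits

# Sustainable burst: credits earned per hour = full-CPU minutes per hour.
print(CREDITS_PER_HOUR / 60.0)   # 0.1 -> the 10% baseline
# A full bank of credits buys this much 100% CPU time:
print(MAX_BANKED / 60.0)         # 2.4 hours of continuous full burst
```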
Many people seem to be complaining about the shift from CPU-hour to instance-hour pricing. They don't seem to get that it's more or less the same thing once you account for concurrency: dynamic instances can handle "a small number" of concurrent requests, so a well-utilized instance-hour covers many requests' worth of CPU time.
So the GAE team has set the incentives correctly, to reward apps that work well with concurrency.
The current free quota is 6.5 CPU hours, and the upcoming free quota seems to be 24 instance hours. I know which I prefer.
It's actually more flexible than that too. You get 1500 hours of small instance credit, but you can apply it to larger instance sizes (for proportionally less time) or other services like "Websites", which I haven't used but is apparently some sort of PaaS.
If you haven't used AWS a lot, you might not know this, but the old instance types stick around and you can still use them, especially as "spot" instances, which let you bid for server time.
I had a science project that was CPU-bound, and it turns out that because people bid based on performance, the old chips end up costing about the same in terms of CPU work done per dollar (older chips cost less per hour but do less work).
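The effect looks something like this (prices and relative speeds are made-up numbers, just to show the shape of it):

```python
# Illustration of spot markets pricing by performance: cost per unit of
# CPU work comes out roughly equal across generations. Prices and
# relative throughputs are invented for the example.
INSTANCES = {
    # name: (spot $/hour, relative CPU throughput)
    "old-gen": (0.05, 0.5),
    "new-gen": (0.10, 1.0),
}

for name, (price, speed) in INSTANCES.items():
    print("%s: $%.2f per unit of work" % (name, price / speed))
# Both lines print $0.10: bidding pushes old chips down to parity.
```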
AWS, though, was by far the most expensive, so switching to something like Oracle with their Ampere ARM instances was a lot cheaper for me.
Not all application resource demands fit nicely inside instance types.
Once you determine your application's needs, you can set CPU/memory requirements per container/service. Then you can pack these containers onto your fleet more intelligently.
This really comes into play for the memory-to-CPU ratio. Look at the price difference between:
one m4.2xlarge = 32 GB memory, 8 vCPUs = ~$350 a month
eight t2.medium = 32 GB memory, 16 vCPUs = ~$300 a month
You get twice as much compute for less money with the ability to mix and match service needs instead of being constrained to individual instances.
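Worked out per resource using the figures above:

```python
# Per-resource cost for the two options above.
OPTIONS = {
    # name: (monthly $, vCPUs, GB memory)
    "1x m4.2xlarge": (350, 8, 32),
    "8x t2.medium":  (300, 16, 32),
}

for name, (cost, vcpus, mem) in OPTIONS.items():
    print("%s: $%.2f/vCPU, $%.2f/GB" % (name, float(cost) / vcpus, float(cost) / mem))
# 1x m4.2xlarge: $43.75/vCPU, $10.94/GB
# 8x t2.medium:  $18.75/vCPU, $9.38/GB
```

One caveat: t2 vCPUs are burstable, so the comparison holds only if your per-container load stays under the credit baseline.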
You're not committing to a usage level, you're committing to a pricing level. If you reserve a small instance, you get a small instance, no matter what utilization level you choose. It's the same exact resources no matter what you pick.
Light utilization = lowest upfront cost, highest hourly rate.
Medium utilization = medium upfront cost, medium hourly rate.
Heavy utilization = highest upfront cost, lowest hourly rate.
The names are meant to signify the trade-off you're making. If you run your instance only an hour a day, you will pay the least by choosing "light utilization": the hourly cost is high but you're only going to multiply that by a small number, so the savings in the up-front cost will dominate the total cost. If you run your instance 24 hours a day, then the hourly rate will dominate your total costs, so you'll save money by choosing "heavy utilization" with a higher up-front cost but lower hourly cost.
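A quick way to see that break-even (the upfront and hourly prices are illustrative, not Amazon's actual rates):

```python
# Break-even sketch for the reserved-instance utilization tiers.
# Upfront and hourly prices are illustrative, not actual AWS rates.
TIERS = {
    # name: (upfront $, $/hour)
    "light": (100, 0.12),
    "heavy": (300, 0.05),
}

def annual_cost(tier, hours_per_day):
    upfront, hourly = TIERS[tier]
    if tier == "heavy":
        # Heavy-utilization RIs are billed every hour of the term,
        # whether or not the instance runs (see the quote below).
        billable = 24 * 365
    else:
        billable = hours_per_day * 365
    return upfront + hourly * billable

for hours in (1, 8, 24):
    costs = dict((t, round(annual_cost(t, hours))) for t in TIERS)
    print("%2d h/day: %s" % (hours, costs))
#  1 h/day: light ~$144 beats heavy ~$738
#  8 h/day: light ~$450 still beats heavy ~$738
# 24 h/day: heavy ~$738 beats light ~$1151
```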
Segmenting the costs makes the pricing table more difficult to read, but it optimizes for everything else: you pay the lowest possible price for guaranteed resources, and Amazon has better knowledge of how much spare capacity it actually needs to handle the reservations.
Quote from the page above:

> Light and Medium Utilization Reserved Instances also are billed by the instance-hour for the time that instances are in a running state; if you do not run the instance in an hour, there is zero usage charge. Partial instance-hours consumed are billed as full hours. Heavy Utilization Reserved Instances are billed for every hour during the entire Reserved Instance term (which means you're charged the hourly fee regardless of whether any usage has occurred during an hour).