P.K. Hwang

Napkin Math for On-Demand GPUs

01/2023 - Phil

I’ve been thinking about the economics of cloud computing recently as I’ve played around with CUDA kernels on a personal GCP box.

Let’s say I wanted to buy GPUs and monetize them. I’d like to figure out how much I should rent them out for. Roughly speaking, I’d want to figure out how good of a business this is and compare my analysis with products offered by existing cloud providers.

One thing I would like to work out is why different cloud providers charge different rates for the same thing. A 40GB A100 costs $2.934/hr on GCP, $4.096/hr on AWS, and $1.10/hr on Lambda Labs. Presumably all three of these cloud providers are paying relatively similar prices for their GPUs, and yet they offer vastly different prices.

I should note that any calculation here is very rough and there are many details I do not know firsthand, like what AWS’s occupancy rate is or whether they peg the on-demand price linearly to the market price of the GPU. This is also not intended to be an investment pitch; I’m simply doing some envelope math and thinking about the implications of that math.

GPU Economics

For simplicity, let’s examine the economics as if my business involved a single GPU.

Assumptions:

The GPU costs \(p_0\) up front, and its market price depreciates at a constant annual rate \(d\), so the price in year \(i\) is \(p_i=p_{i-1}\times (1-d)=p_0\times (1-d)^i\). The cash flow I receive for year \(i\) is \(C_i=p_i\times l\times u\), where \(l\) is how much money I would collect by renting the GPU out for an entire year at full utilization (expressed as a multiple of its current price) and \(u\) is the fraction of the year the GPU is actually being used. I’ll treat each year’s cash flow as arriving at the end of that year.

Note that we can calculate the IRR (internal rate of return) by solving \(p_0=\sum_{i=0}^T\frac{C_i}{(1+\mathrm{IRR})^{i+1}},\) i.e., by finding the discount rate at which the present value of the rental cash flows (each discounted from the end of its year) equals the purchase price. For simplicity I’ll assume that \(T=9\) (10 years of cash flows in total).

With substitution we can see \(p_0=\sum_{i=0}^T\frac{p_0(1-d)^i\,l\,u}{(1+\mathrm{IRR})^{i+1}}.\) One thing we should note is that, based on this formulation, our cash flows depend geometrically on the depreciation rate while they depend only linearly on the usage rate and the rental fee. In practice there are probably more complicated dynamics at play (a higher rental fee likely results in a lower usage rate), but I’ll ignore that for now.
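
Summing the geometric series gives a convenient closed form (my own rearrangement of the equation above):

\[
1 = l\,u\cdot\frac{1-\left(\frac{1-d}{1+\mathrm{IRR}}\right)^{T+1}}{\mathrm{IRR}+d},
\]

so when \(\left(\frac{1-d}{1+\mathrm{IRR}}\right)^{T+1}\) is small, \(\mathrm{IRR}\approx l\,u-d\). That approximation is a handy sanity check on the numbers below.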

I’ll assume that the depreciation rate \(d\) is around 0.2. I roughly estimated this by comparing the release-date starting price of a few GPUs with the current price. For example, in 2018 a 16GB Tesla V100 cost around $10,000. Currently it costs around $3,800 on Amazon (around 22% depreciation per year).
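
As a quick check on that arithmetic (the implied annual rate depends on how many years you assume have elapsed since release):

```python
# Annualized depreciation implied by the V100 example:
# ~$10,000 at release in 2018 vs. roughly $3,800 now.
for years in (4, 5):
    d = 1 - (3_800 / 10_000) ** (1 / years)
    print(f"over {years} years: d ≈ {d:.1%}")  # ~21.5% and ~17.6%
```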

Now using some basic Python I can calculate the IRR for different sets of variables.
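
Here is a rough sketch of that calculation. I solve for the IRR with a simple bisection on the NPV rather than a library routine; the cash-flow convention matches the formula above, and note that the IRR doesn’t depend on \(p_0\) since it cancels out.

```python
def npv(rate, p0, d, l, u, years=10):
    """NPV of buying a GPU for p0 and renting it out for `years` years.

    The cash flow for year i, p0 * (1 - d)**i * l * u, is assumed to
    arrive at the end of that year (time i + 1).
    """
    flows = [-p0] + [p0 * (1 - d) ** i * l * u for i in range(years)]
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))


def irr(p0, d, l, u, years=10, lo=-0.99, hi=10.0, tol=1e-6):
    """Solve npv(rate) = 0 by bisection.

    With one outflow followed by inflows, NPV is decreasing in the rate,
    so the root is unique within the bracket [lo, hi].
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, p0, d, l, u, years) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p0, d = 10_000, 0.22  # purchase price (arbitrary) and annual depreciation
for l in (1.25, 5):
    for u in (0.05, 0.1, 0.2, 0.3, 0.5, 0.9):
        print(f"l={l}, u={u:.0%}: IRR ≈ {irr(p0, d, l, u):.0%}")
```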

If your rental rate is low, say \(l=1.25\) (like Lambda Labs), it would take a 20% occupancy rate to achieve an IRR of 0%. An occupancy rate of 30% gives an IRR of 15%, an occupancy rate of 50% gives an IRR of 40%, and an occupancy rate of 90% gives an IRR of 90%.

If your rental rate is moderate, say \(l=5\) (like AWS and GCP), it would take around a 5% occupancy rate to achieve an IRR of 0%. An occupancy rate of 10% gives an IRR of 28%, an occupancy rate of 30% gives an IRR of 128%, and an occupancy rate of 50% gives an IRR of 228%.
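
As a sanity check on those break-even figures, setting the IRR to zero in the closed form above gives a break-even occupancy of

\[
u^{*}=\frac{d}{l\left(1-(1-d)^{10}\right)},
\]

which with \(d=0.22\) works out to roughly 19% for \(l=1.25\) and roughly 5% for \(l=5\).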

We can see that with cheap pricing like Lambda Labs’, assuming 22% depreciation, the line between achieving a good return on investment and not is very thin. If the occupancy rate drops to 20%, we are not making a return (and are probably losing money once other expenditures are counted). The moderate pricing used by AWS and GCP, on the other hand, requires the occupancy rate to fall below roughly 5% before losing money. Furthermore, its IRR improves significantly as the occupancy rate increases.

My point is not to estimate the precise rate of return these cloud providers are receiving, but to illustrate how the rate of return varies as we change the parameters of the model. I have no knowledge of how the price might affect the occupancy rate or what the customer acquisition costs might be to achieve a certain occupancy rate.

One of the risks I would consider if I were investing in this business is that there is probably large estimation error in these parameters. We don’t actually know for sure that depreciation is going to be 22%; it might be a lot less or a lot more. We also don’t know whether we can consistently get 50% occupancy. It could fall much lower (say an AI winter happens, or people figure out how to use much less compute).

My guess is that AWS and GCP perceive this risk and would therefore rather play it safe, making a modest return even in the event of lower occupancy or higher depreciation, than push the price to the boundary for growth’s sake. Lambda Labs, on the other hand, is willing to subsidize faster growth by pricing much lower, hoping that 1) they can achieve a much higher occupancy rate than AWS and GCP, 2) the variables do not diverge much from their estimates, and 3) in the long run they will be able to differentiate themselves from GCP and AWS through goodwill, brand, etc.

From afar, it would seem that the incumbents’ moderate pricing strategy is much more appealing than Lambda Labs’ low pricing strategy. One should note that even a 90% occupancy rate under the low price model does not give as good of an IRR as a 30% occupancy rate under the high price model. Given that the tailwinds have been strong, I would guess AWS’s and GCP’s investments in GPUs have generated a much stronger return on capital than Lambda Labs’. The AWS and GCP model is also much less vulnerable to changes in the occupancy rate or depreciation. A 30% occupancy rate already returns 150% of the money in the first year for the moderate pricing model, while it takes more than 4 years to return that money for the low pricing model. If usage rates suddenly plummeted in 2 years, the moderate pricing model would already be in the green, while the low pricing model might yield a net negative return and result in insolvency.
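
For what it’s worth, here is the payback arithmetic behind that comparison, assuming the same 22% depreciation and comparing both models at a 30% occupancy rate (my reading of the comparison above):

```python
d = 0.22

def recouped(l, u, years):
    # Undiscounted cash collected over `years` years, as a multiple of the purchase price.
    return sum((1 - d) ** i * l * u for i in range(years))

print(recouped(5.00, 0.3, 1))  # moderate pricing: 1.5x the purchase price after year 1
print(recouped(1.25, 0.3, 4))  # cheap pricing: only ~1.07x after four years
```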

One has to remember that a business’s goal is to maximize shareholder value. While it can be a worthwhile strategy to subsidize customer growth with aggressively low prices, low prices can also mean leaving shareholder value on the table. Higher prices are not necessarily always good either, as they might come at the expense of long-term goodwill with the customer. But I don’t imagine AWS and GCP are really destroying that long-term goodwill by charging more at this moment. One could turn See’s Candies into a really terrible business by slashing prices by 80%.

Lambda Labs could be using a strategy where they are fine making less on on-demand services as long as they make up the money through other avenues like colocation or selling workstations. I’m ignoring those at the moment and simply looking at the economics of renting out GPUs. At the end of the day, unless on-demand services are a very high-return way of acquiring customers who will spend money on those other services, in and of itself the cheaper pricing model seems to be a worse place to deploy capital than the moderate pricing model. I’m not saying that Lambda Labs should necessarily raise their prices either; it’s possible they can’t, as they are fighting an uphill battle against incumbents. Furthermore, the IRR at moderate occupancy rates under the cheap pricing model is still very good; it just doesn’t return capital as quickly and aggressively as the moderate pricing model. In the limit, one might consider the competitive dynamics between the providers. If it becomes clear that the cheap pricing model is the more economically rational one (and continues to yield a solid return), I don’t see why AWS or GCP couldn’t simply lower their prices, with arguably greater competitive advantages.

In the words of Warren Buffett:

“If you’ve got the power to raise prices without losing business to a competitor, you’ve got a very good business. And if you have to have a prayer session before raising the price by a tenth of a cent, then you’ve got a terrible business.”